Core Concepts
Tokens: Basic units of text that AI models process. A typical English word is 1-2 tokens (about 1.3 on average). Numbers, punctuation, and spaces also count as tokens. This matters because AI models have limits on how many tokens they can process at once.
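As a quick planning aid, the rough tokens-per-word rule of thumb above can be sketched in a few lines. This is only an estimate; real tokenizers split text into subwords, and exact counts vary by model, so a tokenizer library gives exact figures where they matter.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~1.3 tokens-per-word rule of thumb.

    Real tokenizers also split on punctuation and subwords, so exact
    counts vary by model; treat this as a planning estimate only.
    """
    return round(len(text.split()) * 1.3)

# A 4-word sentence estimates to about 5 tokens.
print(estimate_tokens("one two three four"))
```

This kind of estimate is enough to decide whether a document will fit in a given context window before uploading it.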
Context Window: The maximum amount of text (measured in tokens) that an AI model can consider at one time, including both your input and the model's output. Once you exceed the context window, the model starts "forgetting" earlier parts of the conversation.
Key Insight: Different AI tools have dramatically different context windows. ChatGPT handles ~128,000 tokens (~96,000 words or ~300 pages). Claude handles ~200,000 tokens (~150,000 words or ~500 pages).
Exercise: Test Context Window Limits
Step 1: Test with a short document (2-3 pages)
Upload or paste a short research paper, article, or document into ChatGPT or Claude, then prompt:
"Summarize the main argument and key findings of this document in 3 paragraphs. Then identify any methodological limitations discussed by the author."
Observe: The AI should handle this easily and reference specific sections accurately.

Step 2: Test with a longer document (10-20 pages)
Upload a longer paper, book chapter, or report, then prompt:
"How does the argument in section 2 connect to the conclusions in the final section? Provide specific examples from both sections."
Observe: Does the AI maintain accuracy across different sections? Can it connect ideas from the beginning and end?

Step 3: Test with an extremely long document (50+ pages) in Claude
Upload a dissertation chapter, book, or comprehensive report to Claude, then prompt:
"Create a detailed outline of this document's structure, including all major sections and their key points. Then identify the 3 most important arguments made across the entire document."
Observe: Even with a large context window, very long documents may lead to the AI emphasizing recent sections over earlier ones.

Step 4 (Advanced): Test conversation length limits
Have an extended conversation with multiple back-and-forth exchanges (10+ turns), then ask the AI to reference something from your first message.
Observe: Does the AI recall your first message accurately, or has it begun "forgetting" as the conversation fills the context window?
Understanding Context Window Trade-offs
When Context Window Size Matters
- Analyzing long documents: Research papers, books, dissertations, comprehensive reports
- Comparing multiple sources: Literature reviews requiring synthesis across many papers
- Extended conversations: Multi-turn dialogues where context from early exchanges matters
- Complex projects: Tasks requiring the AI to reference many different pieces of information
Example: Claude's 200K token window can handle an entire PhD dissertation (~500 pages) in one conversation, while ChatGPT's 128K window might require breaking it into chunks.
⚡ Pro Strategy: Chunking for Better Results
Even with large context windows, breaking complex analysis into focused questions often yields better results than asking the AI to process everything at once. Instead of "Analyze this 300-page book," try:
- "What are the main arguments in chapters 1-3?"
- "How does the author's methodology in chapter 4 address the limitations identified earlier?"
- "Synthesize the key findings from chapters 5-7 and explain how they support the thesis."
This approach works better because it directs the AI's attention to specific sections rather than trying to "see" the entire document at once.
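The chunking idea above can be sketched with a small helper that splits a document into pieces that fit a token budget, reusing the rough ~1.3 tokens-per-word ratio from earlier. The function name and budget are illustrative, not from any particular library.

```python
def chunk_by_token_budget(text: str, max_tokens: int,
                          tokens_per_word: float = 1.3) -> list[str]:
    """Split text into word-aligned chunks, each under an approximate
    token budget (estimated via the tokens-per-word rule of thumb)."""
    words = text.split()
    max_words = max(1, int(max_tokens / tokens_per_word))
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]
```

Each chunk can then be paired with its own focused question, mirroring the chapter-by-chapter prompts above, rather than asking one question about the whole document.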
Comparing Context Windows Across Tools
ChatGPT: 128,000 tokens (~96,000 words, ~300 pages). Best for: Most research papers, book chapters, typical academic documents.
Claude: 200,000 tokens (~150,000 words, ~500 pages). Best for: Entire books, dissertations, large codebases, comprehensive literature reviews.
⚠️ Important: Context window includes BOTH input and output. If you upload a 100,000 token document, the AI has only 28,000 tokens left (in ChatGPT) or 100,000 tokens left (in Claude) for its response and any follow-up conversation.
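The budget arithmetic above can be sketched as a small helper, using the window sizes quoted in this section (actual limits depend on the specific model version):

```python
# Approximate context windows quoted in this section (in tokens).
# Actual limits vary by model version.
CONTEXT_WINDOWS = {"ChatGPT": 128_000, "Claude": 200_000}

def remaining_budget(tool: str, input_tokens: int) -> int:
    """Tokens left for the model's output and follow-up turns
    after input_tokens of input fill part of the window."""
    return CONTEXT_WINDOWS[tool] - input_tokens

# A 100,000-token upload leaves very different room in each tool.
print(remaining_budget("ChatGPT", 100_000))  # 28,000 tokens left
print(remaining_budget("Claude", 100_000))   # 100,000 tokens left
```

Checking this budget before a long session makes it clear when a document should be chunked instead of uploaded whole.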
Reflection Questions
- What types of documents or tasks in your work would benefit from larger context windows?
- Have you ever had an AI "forget" earlier parts of a conversation? What was happening with the context window?
- When might it be better to break a task into smaller chunks rather than using a large context window?
- How would you explain to students why an AI might give different answers when you upload a document all at once vs. in sections?
Key Takeaways
- Tokens are the basic units AI models process—roughly 1.3 tokens per English word on average.
- Context windows limit how much text an AI can "see" at once, including both input and output.
- Different tools have different context window sizes—choose accordingly for your task.
- Larger context doesn't always mean better understanding—focused questions often work better.