AI Agents & Automation

⏱ About 20 min · 20 XP

Module Check: Context Is Everything

A language model is stateless. It reads what is in front of it, generates output, and retains nothing. Every apparent memory in an agent — every sign that the system knows what happened before — is the result of deliberate engineering: information stored, retrieved, formatted, and injected into the context. This module has traced that engineering from first principles to full architecture. Before moving on, let us consolidate every major concept.
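The store, retrieve, format, and inject loop described above can be sketched in a few lines. This is a minimal illustration, not a real integration: `stateless_model` is a hypothetical stand-in for an LLM call, and `ChatSession` is an assumed wrapper name. The point is that the model function sees only the prompt string it is handed, while the apparent memory lives in the `history` list kept by the caller.

```python
# Hypothetical stand-in for an LLM call: it sees only the prompt it is given
# and keeps no state between calls.
def stateless_model(prompt: str) -> str:
    # Toy behavior: report the user's name if the prompt happens to contain it.
    for line in prompt.splitlines():
        if line.startswith("user: my name is "):
            name = line.removeprefix("user: my name is ").strip()
            return f"Your name is {name}."
    return "I do not know your name."


class ChatSession:
    """Keeps the memory outside the model and replays it on every call."""

    def __init__(self) -> None:
        self.history: list[str] = []  # stored turns (assumed in-memory store)

    def ask(self, user_message: str) -> str:
        self.history.append(f"user: {user_message}")   # store
        prompt = "\n".join(self.history)                # retrieve and format
        reply = stateless_model(prompt)                 # inject into the context
        self.history.append(f"assistant: {reply}")
        return reply
```

Calling `stateless_model` directly with only the latest question gives it nothing to work with; routing the same question through `ChatSession.ask` succeeds because the earlier turn is replayed into the prompt.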

Module Quiz

A language model is described as stateless. What does this mean precisely?

An agent's context window is 128,000 tokens. The system prompt uses 3,000 tokens, the conversation history uses 70,000 tokens, and a retrieved document is 60,000 tokens. What happens when the agent tries to send this prompt?

In a RAG pipeline, why are documents split into small chunks before embedding, rather than embedding entire documents?

An agent has been running for 200 turns. In turn 1, the user said the output must be in Spanish. In turn 150, the agent switched to English. The rolling window is set to 100 turns. What is the most precise diagnosis?

An agent claims it already tried approach X in step 3 and it failed. The engineering logs show step 3 never ran — the agent was restarted after step 2 and resumed at step 4. What failure mode is this?

Which of the following is the most complete description of a production-quality agent memory architecture?
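Two of the quiz scenarios above involve arithmetic that is easy to check in code: the token budget of a prompt and the contents of a rolling window. The sketch below uses the numbers from those questions; the function names and the `pinned` parameter are assumptions made for illustration, and what a given provider does with an oversized prompt (rejecting it, truncating it) is deliberately not modelled.

```python
from typing import Optional

CONTEXT_WINDOW = 128_000  # tokens, from the quiz scenario


def overflow(parts: dict[str, int], window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens the prompt exceeds the window by (0 if it fits)."""
    return max(0, sum(parts.values()) - window)


def rolling_window(
    turns: list[str], size: int, pinned: Optional[list[str]] = None
) -> list[str]:
    """Return the pinned messages followed by the most recent `size` turns."""
    recent = turns[-size:] if size > 0 else []
    return list(pinned or []) + recent


# Token budget from the context-window question.
parts = {"system_prompt": 3_000, "history": 70_000, "retrieved_document": 60_000}
excess = overflow(parts)  # 133,000 requested against a 128,000 limit

# Rolling window from the 200-turn question, evaluated at turn 150.
turns = [f"turn {i}" for i in range(1, 201)]
visible = rolling_window(turns[:150], size=100)
visible_with_pin = rolling_window(turns[:150], size=100, pinned=["turn 1"])
```

With a window of 100 turns, `visible` at turn 150 starts at turn 51, so the instruction given in turn 1 is no longer in the context unless it is pinned, as in `visible_with_pin`.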

Synthesis: The Full Picture

The arc of this module can be traced in a single thought experiment. Start with a completely stateless language model — no memory at all. Task it with helping a student study for a standardized exam over three months. Every limitation you encounter — the model does not know the student's weak areas, forgets what was covered last session, cannot access the textbook, loses track of the study plan, contradicts earlier advice — points to a specific memory need. Each need maps to a specific engineering solution: a database for structured records, a vector store for the textbook corpus, a session summary for recent history, pinned constraints for persistent requirements, a state object for the current plan. Working through that thought experiment, and engineering solutions for each gap, is exactly what a real agent engineer does. Memory is not a feature added to an agent — it is what separates a functional agent from a stateless shell.
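The mapping from needs to mechanisms in the thought experiment can be made concrete with a small sketch. Everything here is hypothetical: `StudyMemory` and its field names are invented for illustration, the `textbook` list stands in for a vector store, and retrieval uses naive word overlap where a real system would compare embeddings. The sketch only shows how the five stores might be combined into one context on each call.

```python
from dataclasses import dataclass, field


@dataclass
class StudyMemory:
    """Hypothetical memory stack for the exam-tutor thought experiment."""

    pinned: list[str] = field(default_factory=list)        # persistent requirements
    records: dict[str, str] = field(default_factory=dict)  # structured student records
    textbook: list[str] = field(default_factory=list)      # stand-in for a vector store
    summary: str = ""                                       # summary of recent sessions
    plan: dict[str, str] = field(default_factory=dict)     # state object for the study plan

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        # Toy retrieval: rank chunks by the number of words shared with the query.
        # A real pipeline would rank by embedding similarity instead.
        words = set(query.lower().split())
        ranked = sorted(
            self.textbook,
            key=lambda chunk: len(words & set(chunk.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def build_context(self, question: str) -> str:
        # Assemble every store into the text the stateless model will read.
        sections = [
            "PINNED: " + "; ".join(self.pinned),
            "RECORDS: " + "; ".join(f"{k}={v}" for k, v in self.records.items()),
            "SUMMARY: " + self.summary,
            "PLAN: " + "; ".join(f"{k}={v}" for k, v in self.plan.items()),
            "TEXTBOOK: " + " | ".join(self.retrieve(question)),
            "QUESTION: " + question,
        ]
        return "\n".join(sections)


memory = StudyMemory(
    pinned=["Answer in Spanish"],
    records={"weak_area": "geometry"},
    textbook=[
        "Triangles have three sides and angles summing to 180 degrees",
        "A noun names a person place or thing",
    ],
    summary="Covered algebra last session",
    plan={"next": "geometry drills"},
)
context = memory.build_context("what do the angles of triangles sum to")
```

The pinned constraint is placed first and is included on every call regardless of how long the session runs, which is the design choice that keeps a turn-1 requirement from falling out of a rolling window.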

From Stateless to Memory-Complete

This final activity synthesizes the entire module.

Setup: Imagine a completely stateless AI assistant with no memory of any kind. Each API call is independent, and nothing carries over from one call to the next.

  1. List five specific, concrete things this stateless assistant cannot do that a real user would expect it to do. Be specific about the failure: not 'it forgets things', but a precise scenario such as 'if a user mentions their name in message 1, the assistant cannot use it in message 10'.
  2. For each of your five failures, name the specific memory mechanism that would fix it (context window replay, long-term database, RAG, state object, summarization, or pinned message).
  3. Now build the complete memory stack. All six mechanisms from step 2 are available. Describe, in one paragraph, the full memory architecture of your ideal general-purpose AI assistant. Name every store, describe when it is written, and describe when it is read.
  4. What is the hardest memory problem that is NOT solved by any of the mechanisms in this module? Where does memory engineering still fall short? Write 3-4 sentences describing the unsolved problem.
  5. Compare your answer to step 4 with a classmate. What did they identify that you missed? What does this suggest about the current limitations of AI agents?