AI Foundations

⏱ About 20 min · 20 XP

Prompting and Context

You now know what LLMs are, how they work, what they can do, and what they cannot. This lesson is practical: how do you actually work effectively with an LLM? The answer involves understanding the context window, appreciating why prompt phrasing matters, and developing systematic practices for eliciting reliable, useful outputs. Prompting is not magic — it follows directly from the underlying mechanics. Once you understand why prompt design matters, you can reason about it rather than memorize a list of tricks.

The Context Window: Your Working Space

Beyond what it learned in training, everything an LLM can draw on during inference lives in the context window: the instructions you gave it, the conversation so far, any documents you pasted in, and its own previous outputs. The context window is measured in tokens and has a maximum size. As of 2025, leading models offer context windows of 128,000 to over 1,000,000 tokens, enough for hundreds of pages of text or more.

The context window is the model's entire working memory. Apart from its training, it has no other access to information. If you want the model to consider a document, paste it in. If you want it to remember a constraint you stated earlier, that constraint must still be in the window. If a conversation grows very long and approaches the window limit, earlier context may be truncated or summarized, and the model may lose track of instructions given much earlier.

Practical implications:

  • Front-load important instructions. Place the most important constraints, roles, and requirements at the beginning of your prompt, not buried in the middle.
  • Include all necessary context. The model cannot look up information you did not provide. If it needs a document, a piece of code, or a set of facts to work with, include them explicitly.
  • Be aware of window limits. For very long tasks, break them into chunks rather than attempting to do everything in one enormous prompt.
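To make the chunking advice concrete, here is a minimal Python sketch. It assumes a rough rule of thumb of about 4 characters per token, which is only an approximation; real tokenizers differ by model. The function names `estimate_tokens` and `chunk_paragraphs` are hypothetical, chosen for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate only: assumes ~4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4)


def chunk_paragraphs(paragraphs: list[str], budget: int) -> list[list[str]]:
    """Group paragraphs into chunks whose estimated token count fits the budget.

    A single paragraph larger than the budget still becomes its own chunk.
    """
    chunks: list[list[str]] = []
    current: list[str] = []
    used = 0
    for para in paragraphs:
        cost = estimate_tokens(para)
        # Start a new chunk when this paragraph would push us over the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(para)
        used += cost
    if current:
        chunks.append(current)
    return chunks


# Three paragraphs of ~100 estimated tokens each, with a 250-token budget.
paragraphs = ["a" * 400, "b" * 400, "c" * 400]
chunks = chunk_paragraphs(paragraphs, budget=250)
print([len(c) for c in chunks])  # [2, 1]
```

Each chunk can then be sent as its own prompt, with the key instructions repeated at the top of every one.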

The Context Window as Working Memory

An LLM has no memory outside its context window. The context window contains everything: your instructions, the conversation history, any pasted documents, and the model's previous responses. Everything outside the window is invisible to the model. Understanding this explains many surprising LLM behaviors, including forgetting earlier instructions in long conversations.
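The "forgetting" behavior can be illustrated with a small Python sketch. It assumes the simplest possible policy, where the oldest messages are dropped once the window is full. Real chat applications vary (some pin the first instruction, some summarize), so treat `fit_to_window` and the 4-characters-per-token estimate as illustrative assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate only: assumes ~4 characters per token."""
    return max(1, len(text) // 4)


def fit_to_window(messages: list[str], window: int) -> list[str]:
    """Keep the most recent messages that fit; older ones fall out of view."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > window:
            break  # everything older than this is invisible to the model
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "Instruction: always answer in French.",
    "User: " + "x" * 200,
    "Assistant: " + "y" * 200,
    "User: What is the capital of Spain?",
]
visible = fit_to_window(history, window=100)
print(history[0] in visible)  # False: the early instruction no longer fits
```

Under this policy the model never "decides" to ignore the French instruction. The instruction is simply no longer part of its input.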

Why prompt phrasing matters, mechanically: An LLM predicts the next token based on all preceding tokens. Your prompt is the initial sequence of tokens that sets the statistical context for everything that follows. Different phrasings activate different patterns in the model's learned distributions.

Example: Compare these prompts for the same task:

Prompt A: 'Explain quantum entanglement.'

Prompt B: 'You are a physics teacher explaining quantum entanglement to a 10th-grade student who understands basic atomic structure but has not studied quantum mechanics. Use an analogy, define key terms, and check for understanding at the end.'

Both ask for an explanation of quantum entanglement. Prompt B will almost certainly produce a more useful result. This is not because the model tries harder, but because Prompt B's token sequence activates patterns associated with pedagogically structured explanations at a specific level, while Prompt A activates a much broader and less constrained distribution.

Key prompting strategies that follow from the mechanics:

  • Specify role and audience: 'You are a...' and 'for a reader who...' narrow the statistical distribution toward appropriate content.
  • Specify format: 'Respond in bullet points,' 'Write no more than three paragraphs,' 'Include a brief example for each point.' Format specifications activate patterns from training data that match those constraints.
  • Give examples (few-shot prompting): Providing one to five examples of the desired input-output format before your actual query is one of the most powerful prompting techniques. The model infers the pattern from the examples and applies it.
  • Ask the model to reason step by step (chain-of-thought prompting): Prompting a model to 'think step by step' or 'show your work' before giving an answer typically improves accuracy on reasoning tasks. It works because explicitly generating intermediate reasoning steps gives the model more useful token context to draw from when producing the final answer.
  • Iterate: Prompting is not a one-shot process. Use the model's output to refine your prompt. If the response is too long, say so. If it missed a key consideration, point it out and ask for a revised version.
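The strategies above can be combined in a small prompt-building helper. The following Python sketch is one possible way to do it, not a standard API. The function `build_prompt` and its parameters are hypothetical names, and the 'Input:' / 'Output:' labels for few-shot examples are an assumed convention.

```python
def build_prompt(task, role=None, audience=None, output_format=None,
                 examples=(), step_by_step=False) -> str:
    """Assemble a prompt from optional role, audience, format, and examples."""
    parts = []
    if role:
        parts.append(f"You are {role}.")
    if audience:
        parts.append(f"Your reader is {audience}.")
    if output_format:
        parts.append(output_format)
    # Few-shot examples: show the desired input-output pattern before the task.
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    if step_by_step:
        parts.append("Think step by step and show your reasoning "
                     "before giving the final answer.")
    # With examples, phrase the task in the same pattern so the model continues it.
    parts.append(f"Input: {task}\nOutput:" if examples else task)
    return "\n\n".join(parts)


prompt_b = build_prompt(
    "Explain quantum entanglement.",
    role="a physics teacher",
    audience="a 10th-grade student who understands basic atomic structure "
             "but has not studied quantum mechanics",
    output_format="Use an analogy, define key terms, and check for "
                  "understanding at the end.",
)
print(prompt_b)
```

Keeping each strategy as a separate argument makes it easy to change one thing at a time and compare outputs, which supports the iteration step.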

Limitations of Prompting

Prompting is powerful, but it operates within the model's capabilities. Better prompts cannot give the model knowledge it does not have, fix hallucination on topics the model is unreliable on, or produce correct arithmetic in a model that is bad at arithmetic. Prompting works with the model's existing capabilities; it does not create new ones.

Instruction following is not guaranteed: Models sometimes ignore specific instructions, especially in complex prompts with many constraints. The instruction is one set of tokens in the context; other patterns in the training distribution may be stronger. This is why it is important to test your prompt on representative inputs before using it in a critical application.

Adversarial prompting: Because LLMs follow the statistical patterns of their training, carefully crafted prompts can sometimes elicit behaviors the model was aligned against, a phenomenon called jailbreaking. A related attack, prompt injection, hides instructions inside content the model is asked to process, such as a pasted document. This is an active area of safety research. Alignment is not fully robust to adversarial prompting.

Over-reliance risk: Fluent, confident, well-structured output from a well-prompted LLM can be mistaken for reliable output. The quality of the writing is not a signal of the accuracy of the content. This is perhaps the most practically important caveat about prompting: getting better output more easily makes it easier to over-trust.
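Testing a prompt on representative inputs can be as simple as a loop that checks each output against a rule. The Python sketch below uses a stand-in `fake_model` function, since no real LLM API is assumed here. All names (`pass_rate`, `fake_model`, `within_five_words`) are hypothetical, and the fake model is deliberately written to ignore the length limit on long inputs.

```python
from typing import Callable


def pass_rate(model: Callable[[str], str], template: str,
              cases: list[tuple[str, Callable[[str], bool]]]) -> float:
    """Run the prompt template on each input and return the fraction that pass."""
    passed = 0
    for user_input, check in cases:
        output = model(template.format(input=user_input))
        if check(output):
            passed += 1
    return passed / len(cases)


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call: obeys the limit only on short inputs."""
    text = prompt.split("Text: ", 1)[1]
    words = text.split()
    return text if len(words) > 8 else " ".join(words[:5])


def within_five_words(output: str) -> bool:
    return len(output.split()) <= 5


template = "Summarize in at most 5 words.\nText: {input}"
cases = [
    ("The cat sat on the mat today", within_five_words),
    ("The quarterly report shows revenue grew while costs fell "
     "across every region", within_five_words),
]
rate = pass_rate(fake_model, template, cases)
print(rate)  # 0.5: the instruction was ignored on the longer input
```

Swapping `fake_model` for a function that calls a real model would turn this into a basic regression check for a prompt.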

Prompt Challenge

Write a prompt asking an LLM to explain a complex topic to someone with no background in it.

Your prompt should…

  • Specify the topic you want explained clearly
  • Tell the model what background knowledge the reader has
  • Ask the model to include a concrete example or analogy

Chain-of-Thought Is Not Optional for Hard Problems

On any task requiring multi-step reasoning — math problems, logical deductions, complex comparisons — always ask the model to show its reasoning step by step before giving a final answer. This measurably improves accuracy and makes errors easier to spot and correct.

A user pastes a 50-page contract into an LLM and asks it a specific question. Halfway through the conversation, the user asks a follow-up about a clause mentioned at the beginning of the contract. The model seems to have forgotten it. The most likely explanation is:

Chain-of-thought prompting improves LLM accuracy on reasoning tasks. According to the mechanistic explanation in this lesson, why?

Prompt Engineering Lab

You will systematically explore how prompt changes affect LLM output quality.

Task: Ask an LLM (or predict the output if you lack access) to solve this problem:

'A store is selling apples for $1.20 each and oranges for $0.85 each. Maya buys 7 apples and 4 oranges. How much does she spend in total?'

  1. Round 1: Submit the problem exactly as written with no additional instructions. Record (or predict) the output.
  2. Round 2: Add to the front of the prompt: 'Solve this step by step. Show each arithmetic operation separately. State your final answer clearly.'
  3. Round 3: Add to Round 2's prompt: 'After solving, check your answer by dividing back: verify that your apple total divided by 7 equals $1.20 and your orange total divided by 4 equals $0.85.'
  4. Compare the three outputs. Answer these questions:
     • Did the additional instructions change the structure of the response?
     • Did they change the accuracy?
     • What does this tell you about when step-by-step prompting is most important?
  5. Design one more variation of your prompt that you predict would further improve reliability. Explain your reasoning.
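To judge accuracy across the three rounds, it helps to have a reference answer computed outside the model. The short Python sketch below does this with the standard `decimal` module, chosen here to avoid floating-point rounding on currency values. The variable names are illustrative.

```python
from decimal import Decimal

apple_price = Decimal("1.20")
orange_price = Decimal("0.85")

apple_total = apple_price * 7        # 7 apples  -> 8.40
orange_total = orange_price * 4      # 4 oranges -> 3.40
total = apple_total + orange_total   # 8.40 + 3.40

# The same division check that Round 3 asks the model to perform.
assert apple_total / 7 == apple_price
assert orange_total / 4 == orange_price

print(f"${total}")  # $11.80
```

Any model output that does not arrive at $11.80 contains an arithmetic error, however confident or well structured it looks.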