The trained AI program that generates text, code, or answers. It is not a database of stored answers: it is a network of weights that, given an input, computes the most probable output.
Dictionary
The vocabulary of AI coding, in plain English.
Skimmable definitions of the terms that come up constantly when you code with agents. Search the entries below, or jump into a section.
63 entries
The Model
The model's internal numbers (the "weights") tuned during training. When someone says a model is "70B", they mean 70 billion of these parameters.
The smallest unit a model reads and writes: chunks of words, not letters or whole words. "Programming" might be 3 or 4 tokens. Everything is measured and billed in tokens.
The process of slicing your text into tokens before the model processes it. It explains why the model sometimes miscounts letters: it doesn't see letters, it sees tokens.
Running an already-trained model to produce an answer. Every time you send a prompt and wait for output, that's an inference.
The underlying mechanism: the model writes one token at a time, choosing each based on everything before it. The whole answer is built token by token, not all at once.
A setting that controls how much risk the model takes when picking each token. Low = more predictable, repeatable answers; high = more creative and varied ones.
The same prompt can produce different answers across two runs. It's expected: the model picks among probable options, so don't count on identical results every time.
The latest date the model's training data goes up to. It knows nothing that happened after, unless you tell it in the prompt or give it a search tool.
A mode where the model "thinks out loud" before answering, generating intermediate steps. It improves hard tasks at the cost of more tokens and more time.
A model that understands more than text: images, screenshots, sometimes audio. Handy for pasting a screenshot of an error or a design and having it read it.
Sessions, Context & Turns
One continuous conversation with the agent, from when you open it to when you close it. Everything said in the session lives in its context and keeps piling up.
The maximum number of tokens the model can have "in view" in a single inference: your prompt, the history, and the files it read. Whatever doesn't fit, the model can't see.
All the information the model currently has loaded to answer: instructions, conversation, files, and tool results. It's its working memory for the moment.
One round trip: you speak, the agent acts and responds. A complex task gets solved over several turns, not just one.
The message you send the model. A good prompt is specific, gives context, and states the result you expect; a vague prompt gives vague results.
The background instructions that define how the agent behaves: its role, its rules, and its tone. The user doesn't always see it, but it shapes every answer.
When the conversation nears the window's limit, the agent summarizes the old parts to make room. You gain space, but lose detail: anything important should be re-anchored.
When a long session fills up with noise and the agent starts losing the thread or contradicting itself. The fix is usually to start a fresh session with a good summary.
The quality drop when the window is too full: even if everything fits, the model pays less attention to each part. More context isn't always better.
Treating context as a limited resource: every file, every tool output, and every message spends from the same pocket. Managing it well keeps the agent sharp.
Tools & Environment
An action the agent can run beyond writing text: read a file, run a command, search the web. Tools are what turn a chat into an agent.
The moment the model decides to use a tool and builds the request with its arguments. The system runs it and hands the result back so it can continue.
An open standard for connecting agents to external tools and data. Instead of hand-coding each integration, an MCP server exposes capabilities the agent can use.
A process that offers tools and resources over the MCP protocol: GitHub, a database, your filesystem. You plug it into the agent and it becomes available.
The command-line interface: the agent lives in your terminal and operates your project straight from there. It's the natural home of tools like Claude Code.
An isolated environment where the agent can run commands without touching your real system. It gives you the safety to let it work without asking permission at every step.
The rules that decide what the agent can do without asking. A good allowlist lets it run routine things on its own and checks with you on the risky ones.
An extra working copy of the same git repo, in another folder and on another branch. It lets several agents work in parallel without stepping on each other's files.
An agent the main agent spawns for a focused task, with its own context. It returns just the conclusion, so the parent agent doesn't fill up with noise.
A bundle of instructions and files that teaches the agent to do a specific thing (publish a reel, generate audio). It loads only when needed.
Failure Modes
When the model confidently makes something up: a function that doesn't exist, a fake citation, a wrong fact. It sounds convincing but isn't real; always worth verifying.
The model's tendency to state things assertively even when unsure. It rarely says "I don't know", which is why you should ask it to show evidence.
When the model agrees with you just to please you, instead of pointing out the mistake. If you want honest critique, ask for it explicitly.
When the agent gets stuck repeating the same failing action over and over. It often happens hitting an error it doesn't understand; best to stop it and redirect.
When the agent adds more than you asked for: abstractions, options, and files nobody needed. It does the task, but bloats the change and the risk.
When the agent replaces real code with a comment like "// rest unchanged" and deletes things that mattered. Always review the diff before accepting.
An attack where external content (a page, an issue, a file) carries hidden instructions the agent obeys by accident. That's why you can't blindly trust what it reads.
In long sessions the agent stops following a rule you gave early on, because it got buried under everything new. Restate the critical bits, or put them in a rules file.
Handoffs
Passing work from one agent (or session) to another without losing the thread. The key is a good state summary: what's done, what's left, and why.
A short note capturing where the task stands: decisions made, files touched, and next steps. It's what lets you pick up again without re-explaining everything.
The document defining what to build before any code is written. A good spec is the best handoff there is: the agent works against it without guessing.
Coordinating several agents or steps toward a goal: who does what, in what order, and how the results come together. An "orchestrator" agent delegates and consolidates.
A saved point of the work (a commit, a note, a summary) you can return to if something goes wrong. Frequent checkpoints are the safety net of agent work.
Undoing the agent's changes and returning to the last good state. With clean git and small commits, a rollback is cheap; without it, it hurts.
Memory & Steering
Information the agent remembers across sessions, not just within one. Unlike context (which clears when you close), memory persists and gets reused.
The project instruction file the agent reads on startup: conventions, rules, and where everything lives. It's the most direct way to steer it in a stable way.
Guiding the agent while it works, without rewriting everything: course-correcting, narrowing scope, setting priorities. Good steering saves a lot of back-and-forth.
Commands that fire automatically at key moments of the agent's work (before a commit, when it finishes). They automate rules the agent shouldn't have to remember.
A short, permanent instruction you want the agent to always respect. When a mistake repeats, writing a rule keeps it from happening again.
Showing the agent one or two examples of the result you want, instead of only describing it. Often a good example steers better than a paragraph of instructions.
Tying the agent's answers to real sources (your code, your docs) instead of what it "believes". Good grounding cuts hallucinations and makes the result verifiable.
Retrieving relevant chunks of your documents and feeding them into context before answering. It gives the model fresh, specific information it didn't have from training.
Patterns of Work
A mode where the agent first proposes a plan and waits for your approval before touching anything. Ideal for big tasks: you align on direction before spending work.
Letting the agent run everything without asking permission at each step. Blazing fast and risky at once: only use it in a sandbox or with clean git so you can revert.
Coding by describing what you want in natural language and letting the agent write the code. Great for prototypes; for production you must review and understand the output.
Writing the tests first and letting the agent code until they pass. The tests give it a verifiable goal and curb gold-plating.
Having the agent check its own work: run the app, the tests, or the build and fix what broke. Without verification, "done" doesn't mean it works.
Keeping review points where a person approves before continuing. Reserve human control for what's irreversible or expensive, and automate the rest.
An agent that runs on its own, in parallel, and pings you when it's done. It frees you to do something else while it handles a long task.
Launching several agents at once on independent parts of a problem and merging the results. It speeds things up when the pieces don't depend on each other.
What's in and what's out of a task. Defining a small, clear scope is the best way to get a good result from an agent: do this, no more than this.
Reading exactly what changed before accepting it, line by line. It's the habit that most separates agent work that holds up from the kind that blows up later.