Reference
AI coding glossary
A plain-English glossary of AI coding terminology — context window, RAG, MCP, tool use, agentic loop, embedding, and every other term you will run into in 2026. Definitions you can actually use.
Reference · 30 terms
Skimmable, bookmarkable, linkable. If a term is missing, it probably is not worth knowing yet.
- Agent
- An AI model equipped with tools and a loop — it can read files, run commands, observe results, and decide what to do next. The difference between a chatbot and an agent is that an agent acts.
- Agentic loop
- The iterative cycle of observe → think → act → observe that an AI agent runs to complete a task. Modern coding agents loop until the task is done or a budget is exhausted.
- Chain of thought
- A prompting technique where the model is asked to reason step-by-step before answering. Often produces more reliable output on complex tasks.
- Completion
- The model's response to a prompt. In coding tools, an inline completion is a suggested continuation of the line you are typing.
- Context window
- The amount of text a model can hold in attention at once, measured in tokens. A larger context window lets the model reason over more code at the same time, but adding irrelevant content dilutes attention.
- Embedding
- A vector representation of a piece of text or code. Embeddings let systems compare meaning — used to retrieve the most relevant files or docs for a query.
- Fine-tuning
- Adjusting a pre-trained model's weights on new data. Rarely necessary for vibe coding in 2026 — prompting and context are almost always enough.
- Foundation model
- A large, general-purpose model trained on broad data that can be adapted to many tasks. GPT-5, Claude Opus 4.6, and Gemini 3 are foundation models.
- Grounding
- Giving the model authoritative source material (docs, code, data) so its output is anchored to facts instead of its prior.
- Hallucination
- When a model generates plausible-sounding but false information — an API that does not exist, a parameter with the wrong name. The main failure mode to guard against.
- Inference
- Running a trained model to generate output. Every prompt you send is an inference call.
- In-context learning
- The ability of large models to adapt to new tasks from examples in the prompt alone, with no weight updates. Why few-shot prompting works.
- Latency
- Time from prompt to response. Matters enormously in interactive coding — a 3-second completion feels great, a 30-second one breaks flow.
- LLM
- Large language model. The class of models vibe coding relies on.
- MCP
- Model Context Protocol. An open standard from Anthropic for connecting AI assistants to external tools and data sources. Increasingly the common interface between models and the outside world.
- Multi-modal
- A model that can process more than one type of input — text, images, audio, video. Useful in coding for feeding screenshots, designs, or diagrams into a prompt.
- Prompt
- The input you send to a model. In coding, good prompts have intent, constraints, and relevant context.
- Prompt engineering
- The craft of structuring prompts to get reliable, useful output. Real — but often overstated. Clear thinking beats clever prompts.
- RAG
- Retrieval-augmented generation. A pattern where the system retrieves relevant documents or code first, then includes them in the prompt. How most AI IDEs answer questions about your codebase.
- Reasoning model
- A model trained or configured to spend extra inference time "thinking" before answering, producing better results on hard tasks at the cost of latency.
- Rules file
- A project-level file (.cursorrules, AGENTS.md, CLAUDE.md) that encodes conventions the AI must follow on every prompt. The single highest-leverage piece of context you can set.
- Sampling
- How the model picks the next token during generation. Controlled by parameters like temperature and top-p.
- System prompt
- A prompt that sets the model's persona and constraints for an entire session, before any user message. Used by tools to enforce defaults.
- Temperature
- A sampling parameter, typically 0 to 2, that controls randomness. At 0 the model picks the most likely token every time; higher values produce more varied output. For coding, low temperatures are almost always better.
- Token
- The unit a model reads and writes in. Roughly 3–4 characters in English. Pricing and context limits are measured in tokens.
- Tool use
- The ability of a model to call external functions — read a file, search the web, run a shell command. The feature that turns a language model into an agent.
- Vibe coding
- Building software primarily by collaborating with an AI model. See the full definition on our what is vibe coding page.
- Vector database
- A database optimized for embedding search. Used by coding tools to find the most relevant snippets for your query.
- WebContainer
- A browser-based runtime (from StackBlitz) that runs full Node.js stacks without a server. What powers Bolt and similar in-browser IDEs.
- Zero-shot / few-shot
- Zero-shot: asking the model to perform a task with no examples. Few-shot: including a handful of examples in the prompt. Few-shot prompting remains one of the most reliable accuracy wins.
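A few of these terms get clearer in code. The agentic loop, for instance, is just a cycle with a budget. Here is a toy sketch: the "model" is a hard-coded stand-in, not a real API, and the hypothetical `scratch.txt` task is invented — the shape of the loop is the point.

```python
# A toy agentic loop: observe -> think -> act, repeated until the task
# is done or the step budget is exhausted. toy_model is a stand-in policy,
# not a real LLM call.

def toy_model(observation):
    # Stand-in "thinking": stop once the scratch file exists, else create it.
    if "scratch.txt" in observation:
        return ("done", None)
    return ("write_file", "scratch.txt")

def run_agent(max_steps=5):
    files = set()                                # the "environment" the agent acts on
    for _ in range(max_steps):                   # budget guard
        observation = ", ".join(sorted(files)) or "empty workspace"
        action, arg = toy_model(observation)     # think
        if action == "done":
            return files
        if action == "write_file":               # act; the next iteration observes the result
            files.add(arg)
    return files
```

Real agents swap `toy_model` for a model call and the file set for an actual workspace, but the loop structure is the same.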
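Tool use, in practice, is a dispatch layer: the model emits a structured call, the harness executes it and feeds the result back as an observation. A minimal sketch with two invented tools (`add` and `upper` are illustrative, not part of any real protocol):

```python
# Minimal tool-use dispatch: parse the model's structured call, look up
# the requested function, run it, and return the result.
import json

TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "upper": lambda args: args["text"].upper(),
}

def handle_tool_call(raw):
    call = json.loads(raw)             # model output, as a JSON tool call
    tool = TOOLS[call["name"]]         # look up the requested function
    return tool(call["arguments"])     # execute; result becomes the next observation

result = handle_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

Real harnesses add schemas, validation, and error handling, but the read-call-return cycle is the core of it.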
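Embeddings and vector search reduce to comparing vectors. The 3-dimensional vectors below are made up (real embeddings have hundreds or thousands of dimensions, produced by a model), but cosine-similarity ranking is exactly how retrieval picks the most relevant file for a query:

```python
# Toy embedding comparison: rank documents by cosine similarity to a query
# vector. The vectors here are invented for illustration.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = [0.9, 0.1, 0.0]                          # embedding of the user's question
docs = {
    "auth.py": [0.8, 0.2, 0.1],                  # semantically close to the query
    "README.md": [0.1, 0.9, 0.3],                # semantically distant
}
best = max(docs, key=lambda name: cosine(query, docs[name]))
```

A vector database does this same comparison, just at scale and with indexing tricks to avoid checking every vector.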
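Temperature is just a divisor applied to logits before softmax. A self-contained sketch with toy logits shows why low temperature concentrates probability on the top token:

```python
# How temperature reshapes next-token probabilities: divide logits by the
# temperature, then softmax. Lower temperature sharpens the distribution.
import math

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                              # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                         # toy scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)     # near-greedy: top token dominates
hot = softmax_with_temperature(logits, 2.0)      # flatter: more varied sampling
```

At temperature 0, implementations skip sampling entirely and take the argmax, which is why 0 behaves deterministically.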
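Few-shot prompting is nothing more than pasting worked examples ahead of the real task. A sketch with an invented string-casing task (the examples and format are illustrative, not a required template):

```python
# Build a few-shot prompt: a handful of input/output examples, then the
# actual request in the same format, ending where the model should continue.

def few_shot_prompt(task, examples):
    lines = []
    for inp, out in examples:                    # the worked examples
        lines.append(f"Input: {inp}\nOutput: {out}")
    lines.append(f"Input: {task}\nOutput:")      # the real request
    return "\n\n".join(lines)

prompt = few_shot_prompt(
    "MakeSnakeCase",
    [("HelloWorld", "hello_world"), ("ParseJSON", "parse_json")],
)
```

Zero-shot is the same call with an empty examples list; the examples are what makes in-context learning kick in.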
Keep learning
If a term here sparked a question, the long-form guides cover the concepts in context: what is vibe coding, how to start, the workflow, and the tools.