AI Wiki

Token

A small unit of text that a model reads or writes.

Tokens can be words, word pieces, punctuation, or characters. Context length, pricing, and output limits are usually measured in tokens.
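
As a rough illustration of token counting and token-based pricing, the sketch below uses a toy tokenizer that splits text into words and punctuation marks. Real tokenizers usually work on subword pieces, so actual counts will differ. The names toy_tokenize and estimate_cost and the price used are assumptions made for this example.

```python
import re

def toy_tokenize(text: str) -> list[str]:
    """Toy tokenizer: each word and each punctuation mark is one token.
    Real tokenizers usually split text into subword pieces instead."""
    return re.findall(r"\w+|[^\w\s]", text)

def estimate_cost(num_tokens: int, price_per_million: float) -> float:
    """Cost of num_tokens at an assumed price per one million tokens."""
    return num_tokens / 1_000_000 * price_per_million

tokens = toy_tokenize("Tokens can be words, pieces, or characters.")
print(tokens)
print(len(tokens))
# The price below is a made-up figure, not any vendor's real rate.
print(estimate_cost(len(tokens), price_per_million=2.0))
```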

Context Window

The amount of information a model can consider at once.

A larger context window lets a model process longer documents, but it does not guarantee perfect memory. Structure still matters.

RAG

Retrieval-augmented generation: search first, answer second.

RAG retrieves relevant documents before generating an answer. Good RAG requires ingestion, chunking, embeddings, retrieval, reranking, citations, permissions, and evals.
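
The "search first, answer second" flow can be sketched as follows. This is a minimal illustration, not a production design: retrieval here is plain word overlap rather than embeddings, the documents are made-up samples, and the final model call is left out. The helper names words, retrieve, and build_prompt are hypothetical.

```python
def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def retrieve(query: str, docs: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank document ids by how many query words they share (toy retrieval)."""
    q = words(query)
    ranked = sorted(docs, key=lambda doc_id: len(q & words(docs[doc_id])), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, docs: dict[str, str], doc_ids: list[str]) -> str:
    """Place retrieved passages, tagged with ids for citation, before the question."""
    context = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\nQuestion: {query}"
    )

# Made-up sample documents for illustration only.
DOCS = {
    "refunds": "Refunds are issued within 14 days of a return.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
    "hours": "Support is open on weekdays.",
}

query = "How long do refunds take?"
top = retrieve(query, DOCS, top_k=1)
prompt = build_prompt(query, DOCS, top)
print(prompt)  # this prompt would then be sent to a model
```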

Embedding

A numeric representation of meaning.

Embeddings map content into vectors so similar items are near each other. They support semantic search, recommendations, clustering, and deduplication.
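
Cosine similarity is one common way to measure how near two vectors are. The sketch below uses tiny hand-written vectors as stand-ins; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy vectors, not output from a real embedding model.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten))   # high: similar meaning
print(cosine_similarity(cat, invoice))  # low: unrelated meaning
```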

Agent

An AI system that plans, uses tools, and checks progress.

Agents combine a model with tools, state, memory, permissions, and control flow. Good agents have clear goals, logs, approvals, and stopping rules.

MCP

A protocol for connecting AI assistants to external tools and data.

The Model Context Protocol (MCP) standardizes how assistants discover and call tools and data sources such as file systems, databases, calendars, browsers, and SaaS systems.

Eval

A test that measures whether an AI workflow is working.

Evals include common cases, edge cases, expected answers, scoring criteria, and regression checks. They are essential for production AI.
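
A very small eval harness might look like the sketch below. The function classify_ticket is a hypothetical rule-based stand-in for the AI workflow under test, and exact-match scoring is only one possible scoring criterion.

```python
def classify_ticket(text: str) -> str:
    """Stand-in for the AI workflow under test (hypothetical stub)."""
    lowered = text.lower()
    if "refund" in lowered:
        return "billing"
    if "password" in lowered:
        return "account"
    return "other"

CASES = [
    {"input": "I want a refund", "expected": "billing"},           # common case
    {"input": "Reset my PASSWORD please", "expected": "account"},  # casing edge case
    {"input": "", "expected": "other"},                            # empty-input edge case
]

def run_evals(system, cases):
    """Return the pass rate and the list of failing cases (exact-match scoring)."""
    failures = [c for c in cases if system(c["input"]) != c["expected"]]
    pass_rate = (len(cases) - len(failures)) / len(cases)
    return pass_rate, failures

pass_rate, failures = run_evals(classify_ticket, CASES)
print(pass_rate, failures)
```

Re-running the same cases after every prompt or model change turns this into a simple regression check.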

Fine-tuning

Training a model further on task-specific data.

Fine-tuning can improve format, style, or domain behavior, but retrieval is better when answers require fresh or private facts.

Prompt Injection

Text in a user message or document that tries to override the system's instructions.

Prompt injection is risky when models read untrusted content and have tool access. Mitigations include instruction hierarchy, allowlists, sandboxing, and approvals.

Control Image

A reference image used to guide generation.

Sketches, poses, depth maps, masks, or product references control visual generation more reliably than text prompts alone.

Multimodal Model

A model that can process more than one type of input or output.

Multimodal models may read text, images, audio, video, or files and can enable UI review, transcription, creative workflows, and visual QA.

Tool Calling

Letting a model request a specific external action.

Tool calling turns model output into structured actions such as search, database queries, file edits, API calls, or calendar updates.
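
One way to picture this: the model emits a structured request, and application code decides whether and how to run it. The JSON shape with "name" and "arguments" keys and the add tool below are assumptions for illustration; actual formats vary by provider.

```python
import json

def add(a: float, b: float) -> float:
    """Example tool: add two numbers."""
    return a + b

# Registry of tools the application is willing to run (hypothetical).
TOOLS = {"add": add}

def dispatch(raw_call: str):
    """Parse a model-produced tool call and run the matching function."""
    call = json.loads(raw_call)
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return TOOLS[name](**call.get("arguments", {}))

# Pretend this string came back from the model.
result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
print(result)
```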

Reasoning Model

A model optimized for harder multi-step problems.

Reasoning models spend more compute on planning and analysis. They are often slower and more expensive, but useful for code, math, and complex decisions.

Small Language Model

A compact model optimized for cost, speed, or local use.

Small models can be excellent for classification, extraction, routing, and private deployment when the task is narrow and well evaluated.

Vector Database

A database optimized for similarity search over embeddings.

Vector databases help retrieve semantically similar chunks, products, documents, or user memories. Examples include Pinecone, Weaviate, Milvus, Qdrant, and pgvector.

Chunking

Splitting documents into retrievable pieces.

Chunking affects RAG quality. Chunks should preserve meaning, fit the model's context window, carry metadata, and avoid breaking tables or procedures in the middle.
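
A minimal sketch of paragraph-aware chunking with metadata follows. It assumes paragraphs are separated by blank lines and measures size in characters rather than tokens; the function name chunk_paragraphs and the metadata fields are hypothetical.

```python
def chunk_paragraphs(text: str, source: str, max_chars: int = 200) -> list[dict]:
    """Group whole paragraphs into chunks of at most max_chars characters.
    A single paragraph longer than max_chars is kept whole, not split."""
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)  # close the current chunk at a paragraph boundary
            current = para
    if current:
        chunks.append(current)
    # Attach metadata so each retrieved chunk can be traced back to its source.
    return [{"source": source, "index": i, "text": c} for i, c in enumerate(chunks)]

sample = "A" * 50 + "\n\n" + "B" * 50 + "\n\n" + "C" * 50
pieces = chunk_paragraphs(sample, source="handbook.txt", max_chars=110)
for piece in pieces:
    print(piece["index"], len(piece["text"]))
```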

Reranking

Reordering retrieved results before generation.

Rerankers improve retrieval quality by scoring candidate passages against the query, often reducing irrelevant context in RAG systems.
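
The idea of scoring each candidate against the query can be sketched as below. The scoring function is a toy term-coverage measure standing in for a learned reranking model, and the candidate passages are made up.

```python
def coverage_score(query: str, passage: str) -> float:
    """Fraction of query terms that appear in the passage (toy relevance score)."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0

def rerank(query: str, candidates: list[str], top_k: int = 2) -> list[str]:
    """Sort candidates by score against the query and keep the best top_k."""
    ranked = sorted(candidates, key=lambda p: coverage_score(query, p), reverse=True)
    return ranked[:top_k]

# Made-up candidates, in the order a first-stage retriever might return them.
candidates = [
    "shipping takes five days",
    "to reset your password open account settings",
    "password rules require twelve characters",
]
best = rerank("reset password", candidates, top_k=2)
print(best)
```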

Knowledge Graph

A network of entities and relationships.

Knowledge graphs can improve AI systems that need explicit relationships, traceable facts, and structured reasoning over people, products, events, or policies.

Guardrail

A rule or system that limits risky AI behavior.

Guardrails can check inputs, outputs, tool calls, data access, formatting, policy compliance, and human approval requirements.
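
As one narrow example, a guardrail on tool calls could combine an allowlist with a human-approval requirement for destructive actions. The tool names and the three outcomes below are assumptions for this sketch.

```python
ALLOWED_TOOLS = {"search", "read_file", "delete_file"}  # hypothetical tool names
NEEDS_APPROVAL = {"delete_file"}                        # destructive actions

def check_tool_call(tool: str, approved_by_human: bool = False) -> str:
    """Return 'allow', 'needs_approval', or 'block' for a requested tool call."""
    if tool not in ALLOWED_TOOLS:
        return "block"
    if tool in NEEDS_APPROVAL and not approved_by_human:
        return "needs_approval"
    return "allow"

print(check_tool_call("search"))
print(check_tool_call("delete_file"))
print(check_tool_call("send_wire"))
```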

PII

Personally identifiable information.

PII includes data that identifies a person, such as names, IDs, emails, phone numbers, addresses, financial records, and sometimes combinations of indirect signals.
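
A simple redaction pass for two PII types might look like the sketch below. The regular expressions are deliberately rough assumptions: they will miss some formats and do nothing for names, addresses, or indirect signals, so this is not a complete PII filter.

```python
import re

# Rough patterns for illustration only; real PII detection needs more than regexes.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

print(redact("Contact jane.doe@example.com or +1 555 010 9999."))
```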

Data Residency

Rules about where data may be stored or processed.

Data residency matters for privacy, compliance, and enterprise procurement. It affects model hosting, logging, backups, and vendor selection.

Human-in-the-Loop

A workflow where humans review or approve AI actions.

Human review is important for external communications, money movement, destructive actions, regulated advice, and uncertain model outputs.

System Prompt

The high-priority instructions that define an assistant's behavior.

System prompts set role, boundaries, policies, and workflow rules. They should be concise, testable, and protected from user or document override.

Few-Shot Prompting

Teaching a pattern through examples.

Few-shot examples improve consistency for classification, extraction, rewriting, and formatting tasks. Include edge cases and counterexamples where possible.
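
A few-shot prompt is often just the instruction, a handful of labeled examples, and the new input in the same layout. The sketch below assumes an "Input:" / "Label:" layout and made-up examples; the function name build_few_shot_prompt is hypothetical.

```python
def build_few_shot_prompt(instruction: str, examples: list[tuple[str, str]], new_input: str) -> str:
    """Assemble an instruction, labeled examples, and the new input into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Label: {label}", ""]
    # The final label is left empty for the model to complete.
    lines += [f"Input: {new_input}", "Label:"]
    return "\n".join(lines)

EXAMPLES = [
    ("The app crashes on login", "bug"),
    ("Please add dark mode", "feature_request"),
    ("asdf asdf", "unclear"),  # edge case: unusable input
]
prompt = build_few_shot_prompt(
    "Classify each message as bug, feature_request, or unclear.",
    EXAMPLES,
    "Export to CSV would be great",
)
print(prompt)
```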

Chain-of-Thought

A reasoning style where intermediate steps are considered.

Modern systems often surface concise reasoning summaries and keep the detailed reasoning hidden. The practical goal is better checks, not exposing every internal step.

Structured Output

Model output constrained to a schema or format.

Structured output is useful for APIs, extraction, automations, and evals. JSON schema, Pydantic, and validation layers make outputs more reliable.
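
A validation layer can be as simple as parsing the model's JSON and rejecting anything that does not match the expected fields. The sketch below uses only the standard library as a stand-in for JSON Schema or Pydantic validation; the Invoice shape and its field names are assumptions for this example.

```python
import json
from dataclasses import dataclass

@dataclass
class Invoice:
    vendor: str
    total: float

def parse_invoice(raw: str) -> Invoice:
    """Parse model output and validate it against the expected fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    vendor, total = data.get("vendor"), data.get("total")
    if not isinstance(vendor, str) or not vendor:
        raise ValueError("vendor must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("total must be a number")
    return Invoice(vendor=vendor, total=float(total))

invoice = parse_invoice('{"vendor": "Acme", "total": 12.5}')
print(invoice)
```

When validation fails, a common pattern is to retry the model call with the error message included, or to fall back to human review.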

Inpainting

Editing a selected region of an image.

Inpainting lets users replace, repair, or extend parts of an image while preserving the rest. Good masks and preservation instructions matter.

Upscaling

Increasing image or video resolution.

AI upscaling improves perceived detail but can invent artifacts. It should be checked carefully for product images, faces, text, and brand assets.

LoRA

Low-Rank Adaptation: a lightweight fine-tuning method that trains small adapter weights instead of the full model.

LoRA adapters are common in image generation and open-model workflows. They can capture a character, product, visual style, or domain behavior.

Storyboard

A shot-by-shot plan for video.

AI video benefits from storyboards because continuity, camera motion, character consistency, and shot duration are difficult to solve with a single prompt.

Inference

Running a model to produce an output.

Inference cost and latency depend on model size, hardware, batching, context length, output length, and serving architecture.

Quantization

Reducing model precision to save memory and speed up inference.

Quantization lets larger models run on smaller hardware, but may reduce quality. Test the target task rather than relying only on benchmarks.
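
To make the precision trade-off concrete, the sketch below applies symmetric 8-bit quantization to a short list of weights and measures the round-trip error. It is a simplified illustration with one scale for the whole list; production schemes are usually more elaborate (per-channel or per-group scales, lower bit widths).

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q)
print(max_error)  # rounding error is bounded by about half of the scale
```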

Model Routing

Sending tasks to different models based on need.

Routing can send easy work to fast, cheap models and hard work to reasoning models. Done well, it improves cost control and reliability.
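
A rule-based router is the simplest version of this idea. In the sketch below the keyword hints, the length threshold, and both model names are made-up assumptions; real routers are often driven by a classifier or by eval results.

```python
HARD_HINTS = ("prove", "debug", "step by step", "trade-off")  # assumed heuristics

def route(task: str, long_threshold: int = 400) -> str:
    """Pick a model tier from simple signals about task difficulty."""
    lowered = task.lower()
    if len(task) > long_threshold or any(hint in lowered for hint in HARD_HINTS):
        return "reasoning-model"   # hypothetical model name
    return "small-fast-model"      # hypothetical model name

print(route("Tag this email as spam or not spam"))
print(route("Debug why this recursive function overflows the stack"))
```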

Observability

Tracking what an AI system does in production.

AI observability covers prompts, retrieved context, tool calls, latency, cost, errors, model versions, user feedback, and eval results.

Latency

How long a request takes to complete.

Latency matters for user experience. Streaming, caching, shorter prompts, smaller models, and better routing can reduce perceived wait time.

AI Copilot

An assistant that helps a human complete work.

Copilots support writing, analysis, search, coding, and operations while keeping the human in control of decisions and final output.

AI Agent

A system that performs a workflow with more autonomy.

Agents differ from copilots because they can plan, call tools, track state, and continue across steps. They need stricter permissions and review.

AI Workflow

A repeatable process that includes AI at one or more steps.

Good AI workflows define inputs, outputs, approvals, quality checks, fallback paths, and ownership. They are more valuable than one-off prompts.

Prompt Library

A reusable collection of tested prompts.

A prompt library should include examples, owners, expected outputs, version history, and notes on when the prompt should not be used.

AI Readiness

How prepared a team is to adopt AI responsibly.

Readiness includes data quality, security posture, process clarity, leadership support, training, evaluation, and change management.