AI Vision
Skills, tools, briefs, wiki

AI Wiki

Plain-English definitions for practical AI work.

Use this page when a brief, tool, or skill mentions an unfamiliar concept. Each entry starts with a short definition and expands into a practical explanation.

Reference books and notes

Token

Models

A small unit of text that a model reads or writes.

Tokens can be words, word pieces, punctuation, or characters. Context length, pricing, and output limits are usually measured in tokens.
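A rough sketch of what counting tokens looks like. The `toy_tokenize` rule below is an assumption for illustration only: real tokenizers are model-specific and often split words into smaller pieces. The price passed to `estimate_cost` is a placeholder, not a real rate.

```python
import re


def toy_tokenize(text: str) -> list[str]:
    # Toy rule (assumption): a token is a run of word characters
    # or a single punctuation mark. Real tokenizers differ.
    return re.findall(r"\w+|[^\w\s]", text)


def estimate_cost(num_tokens: int, price_per_million: float) -> float:
    # Pricing is usually quoted per million tokens; the rate is a placeholder.
    return num_tokens / 1_000_000 * price_per_million


tokens = toy_tokenize("Hello, world!")
print(tokens, len(tokens))
```

Counting with the tokenizer of the model you actually use is the only way to get exact numbers for limits and billing.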


Context Window

Models

The amount of information a model can consider at once.

A larger context window lets a model process longer documents, but it does not guarantee that every detail is recalled accurately. Clear structure still matters.
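One practical consequence is that long conversations have to be trimmed to fit. The sketch below keeps the most recent messages that fit a token budget. The function name `trim_to_budget` and the whitespace-based token count are assumptions for illustration; a real system would count with the model's own tokenizer.

```python
def trim_to_budget(messages: list[str], budget: int) -> list[str]:
    # Stand-in token counter (assumption): one token per whitespace-separated word.
    def count_tokens(text: str) -> int:
        return len(text.split())

    kept: list[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


print(trim_to_budget(["a b c", "d e", "f"], budget=3))
```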


RAG

Systems

Retrieval-augmented generation: search first, answer second.

RAG retrieves relevant documents before generating an answer. Good RAG requires ingestion, chunking, embeddings, retrieval, reranking, citations, permissions, and evals.
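The "search first, answer second" flow can be sketched in a few lines. This is a minimal illustration that assumes keyword overlap in place of embeddings and reranking, and it stops at building the prompt rather than calling a model. The function names and sample documents are hypothetical.

```python
def retrieve(query: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    # Score each document by how many query words it shares (toy retrieval).
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(query_terms & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_prompt(query: str, documents: dict[str, str], doc_ids: list[str]) -> str:
    # Put the retrieved text in front of the question and ask for citations.
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\nQuestion: {query}"
    )


docs = {
    "a": "refund policy lasts 30 days",
    "b": "shipping takes five days",
    "c": "office hours are nine to five",
}
hits = retrieve("how many days for refund", docs)
print(build_prompt("how many days for refund", docs, hits))
```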


Embedding

Data

A numeric representation of meaning.

Embeddings map content into vectors so similar items are near each other. They support semantic search, recommendations, clustering, and deduplication.
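"Near each other" is usually measured with cosine similarity. The sketch below uses hand-made two-dimensional vectors as an assumption; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration only.
cat = [0.9, 0.1]
kitten = [0.8, 0.2]
car = [0.1, 0.9]

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))
```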


Agent

Systems

An AI system that plans, uses tools, and checks progress.

Agents combine a model with tools, state, memory, permissions, and control flow. Good agents have clear goals, logs, approvals, and stopping rules.
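The control flow of an agent can be sketched as a loop with a log, a tool permission check, and a step limit. Everything here is a hypothetical illustration: `plan_next_step` stands in for a model call, and the action format is an assumption rather than any framework's API.

```python
def run_agent(plan_next_step, tools: dict, goal: str, max_steps: int = 5) -> dict:
    log = []
    for step in range(max_steps):
        # In a real agent this would be a model call that sees the goal and the log.
        action = plan_next_step(goal, log)
        if action["tool"] == "finish":
            return {"status": "done", "log": log}
        if action["tool"] not in tools:
            # Permission check: only registered tools may run.
            log.append({"step": step, "error": f"tool not permitted: {action['tool']}"})
            return {"status": "blocked", "log": log}
        result = tools[action["tool"]](**action["args"])
        log.append({"step": step, "tool": action["tool"], "result": result})
    # Stopping rule: never loop forever.
    return {"status": "stopped_at_step_limit", "log": log}


def scripted_planner(goal, log):
    # Stand-in planner: call "add" once, then finish.
    if not log:
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"tool": "finish", "args": {}}


print(run_agent(scripted_planner, {"add": lambda a, b: a + b}, "add two numbers"))
```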


MCP

Systems

A protocol for connecting AI assistants to external tools and data.

Model Context Protocol standardizes how assistants discover and call tools such as files, databases, calendars, browsers, and SaaS systems.


Eval

Safety

A test that measures whether an AI workflow is working.

Evals include common cases, edge cases, expected answers, scoring criteria, and regression checks. They are essential for production AI.
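A minimal eval harness might look like the sketch below. It assumes exact-match scoring and a pass-rate threshold, both chosen for illustration; the `keyword_classifier` workflow and the case names are hypothetical stand-ins for a real AI step.

```python
def run_eval(workflow, cases: list[dict], min_pass_rate: float = 0.9) -> dict:
    results = []
    for case in cases:
        actual = workflow(case["input"])
        # Exact match is the simplest scoring rule; real evals often need rubrics.
        results.append({"name": case["name"], "passed": actual == case["expected"]})
    pass_rate = sum(r["passed"] for r in results) / len(results)
    return {
        "pass_rate": pass_rate,
        "ok": pass_rate >= min_pass_rate,  # regression gate
        "failures": [r["name"] for r in results if not r["passed"]],
    }


def keyword_classifier(text: str) -> str:
    # Deliberately naive stand-in for a model call.
    return "negative" if "bad" in text else "positive"


cases = [
    {"name": "common: praise", "input": "good product", "expected": "positive"},
    {"name": "common: complaint", "input": "bad product", "expected": "negative"},
    {"name": "edge: negation", "input": "not bad at all", "expected": "positive"},
]
print(run_eval(keyword_classifier, cases))
```

The edge case fails here, which is the point: the harness surfaces the weakness before users do.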


Fine-tuning

Models

Training a model further on task-specific data.

Fine-tuning can improve format, style, or domain behavior, but retrieval is usually the better choice when answers require fresh or private facts.


Prompt Injection

Safety

An instruction, often hidden in untrusted content, that tries to override the system's intended goal.

Prompt injection is risky when models read untrusted content and have tool access. Mitigations include instruction hierarchy, allowlists, sandboxing, and approvals.
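Allowlists and approvals can be enforced in ordinary code, outside the model, so that injected text cannot talk its way past them. The sketch below is one possible shape; the tool names and the three outcomes are assumptions for illustration.

```python
# Hypothetical tool names, grouped by risk.
READ_ONLY_TOOLS = {"search_docs", "read_file"}
NEEDS_APPROVAL = {"send_email", "delete_file"}


def authorize_tool_call(tool_name: str, approved_by_human: bool = False) -> str:
    # Low-risk tools on the allowlist run without review.
    if tool_name in READ_ONLY_TOOLS:
        return "allow"
    # Risky tools run only after a human has approved this specific call.
    if tool_name in NEEDS_APPROVAL:
        return "allow" if approved_by_human else "needs_approval"
    # Anything not listed is denied by default.
    return "deny"


print(authorize_tool_call("read_file"))
print(authorize_tool_call("send_email"))
print(authorize_tool_call("run_shell"))
```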


Control Image

Creative

A reference image used to guide generation.

Sketches, poses, depth maps, masks, or product references control visual generation more reliably than text prompts alone.


Multimodal Model

Models

A model that can process more than one type of input or output.

Multimodal models may read text, images, audio, video, or files and can enable UI review, transcription, creative workflows, and visual QA.


Tool Calling

Systems

Letting a model request a specific external action.

Tool calling turns model output into structured actions such as search, database queries, file edits, API calls, or calendar updates.
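In practice the model emits a structured request and application code decides whether and how to run it. The sketch below assumes the request arrives as a JSON string with `name` and `arguments` fields; that shape, the `get_weather` stub, and the `TOOLS` registry are illustrative, not any vendor's exact format.

```python
import json


def get_weather(city: str) -> str:
    # Stub tool: a real one would call a weather service.
    return f"Sunny in {city}"


# Registry of tools the application is willing to run.
TOOLS = {"get_weather": get_weather}


def dispatch(model_output: str) -> str:
    # Parse the structured request and look the tool up by name.
    call = json.loads(model_output)
    func = TOOLS.get(call["name"])
    if func is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return func(**call["arguments"])


print(dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))
```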


Reasoning Model

Models

A model optimized for harder multi-step problems.

Reasoning models spend more compute on planning and analysis. They are often slower and more expensive, but useful for code, math, and complex decisions.


Small Language Model

Models

A compact model optimized for cost, speed, or local use.

Small models can be excellent for classification, extraction, routing, and private deployment when the task is narrow and well evaluated.


Vector Database

Data

A database optimized for similarity search over embeddings.

Vector databases help retrieve semantically similar chunks, products, documents, or user memories. Examples include Pinecone, Weaviate, Milvus, Qdrant, and pgvector.


Chunking

Data

Splitting documents into retrievable pieces.

Chunking affects RAG quality. Chunks should preserve meaning, fit the model's context window, include metadata, and avoid breaking tables or procedures in the middle.
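One simple strategy is to pack whole paragraphs into chunks up to a size limit and attach metadata to each chunk. The sketch below assumes paragraphs are separated by blank lines and measures size in characters; both are simplifications, and the function name is hypothetical.

```python
def chunk_paragraphs(text: str, source: str, max_chars: int = 200) -> list[dict]:
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        # Keep adding paragraphs while they fit. An oversized paragraph is
        # kept whole in its own chunk rather than cut in the middle.
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    # Attach metadata so retrieved chunks can be traced back to their source.
    return [{"source": source, "index": i, "text": c} for i, c in enumerate(chunks)]


for chunk in chunk_paragraphs("aaaa\n\nbbbb\n\ncccc", source="demo.txt", max_chars=10):
    print(chunk)
```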


Reranking

Data

Reordering retrieved results before generation.

Rerankers improve retrieval quality by scoring candidate passages against the query, often reducing irrelevant context in RAG systems.


Knowledge Graph

Data

A network of entities and relationships.

Knowledge graphs can improve AI systems that need explicit relationships, traceable facts, and structured reasoning over people, products, events, or policies.


Guardrail

Safety

A rule or system that limits risky AI behavior.

Guardrails can check inputs, outputs, tool calls, data access, formatting, policy compliance, and human approval requirements.


PII

Safety

Personally identifiable information.

PII includes data that identifies a person, such as names, IDs, emails, phone numbers, addresses, financial records, and sometimes combinations of indirect signals.
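A common first step is to redact obvious identifiers before text is logged or sent to a model. The sketch below uses two illustrative regular expressions for emails and phone numbers. They are assumptions, not a complete detector: names, addresses, IDs, and indirect signals need dedicated tooling.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    # Replace emails first, then phone-like digit runs.
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(redact("Contact ana@example.com or +1 415 555 0100."))
```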


Data Residency

Safety

Rules about where data may be stored or processed.

Data residency matters for privacy, compliance, and enterprise procurement. It affects model hosting, logging, backups, and vendor selection.


Human-in-the-Loop

Safety

A workflow where humans review or approve AI actions.

Human review is important for external communications, money movement, destructive actions, regulated advice, and uncertain model outputs.


System Prompt

Prompting

The higher-priority instruction that defines assistant behavior.

System prompts set role, boundaries, policies, and workflow rules. They should be concise, testable, and protected from user or document override.


Few-Shot Prompting

Prompting

Teaching a pattern through examples.

Few-shot examples improve consistency for classification, extraction, rewriting, and formatting tasks. Include edge cases and counterexamples where possible.
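A few-shot prompt is mostly string assembly: an instruction, some worked examples, then the new input in the same layout. The `Input:`/`Output:` labels and the sample tickets below are assumptions for illustration.

```python
def build_few_shot_prompt(
    instruction: str, examples: list[tuple[str, str]], new_input: str
) -> str:
    lines = [instruction, ""]
    # Each example shows the pattern the model should continue.
    for example_input, example_output in examples:
        lines += [f"Input: {example_input}", f"Output: {example_output}", ""]
    # End with the new input and an open "Output:" for the model to complete.
    lines += [f"Input: {new_input}", "Output:"]
    return "\n".join(lines)


examples = [
    ("My card was charged twice", "billing"),
    ("The app crashes on login", "technical"),
    ("Thanks, all sorted!", "none"),  # counterexample: no action needed
]
prompt = build_few_shot_prompt("Classify the support ticket.", examples, "Where is my refund?")
print(prompt)
```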


Chain-of-Thought

Prompting

A reasoning style in which the model works through intermediate steps before answering.

Modern systems often return concise reasoning summaries and keep the detailed reasoning hidden. The practical goal is better checks, not exposing every internal step.


Structured Output

Prompting

Model output constrained to a schema or format.

Structured output is useful for APIs, extraction, automations, and evals. JSON schema, Pydantic, and validation layers make outputs more reliable.
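A validation layer can be as small as the sketch below, which checks that model output parses as JSON and has the expected fields and types. It uses only the standard library so it runs anywhere; JSON Schema or Pydantic would do the same job more thoroughly. The invoice fields are hypothetical.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"name": str, "amount": float, "currency": str}


def parse_invoice(raw: str) -> dict:
    data = json.loads(raw)  # raises an error on malformed JSON
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(f"wrong type for {field}")
    if errors:
        # Reject the output so the caller can retry or escalate.
        raise ValueError("; ".join(errors))
    return data


print(parse_invoice('{"name": "Acme", "amount": 12.5, "currency": "EUR"}'))
```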


Inpainting

Creative

Editing a selected region of an image.

Inpainting lets users replace, repair, or extend parts of an image while preserving the rest. Good masks and preservation instructions matter.


Upscaling

Creative

Increasing image or video resolution.

AI upscaling improves perceived detail but can introduce artifacts or invent detail that was not in the source. Check results carefully for product images, faces, text, and brand assets.


LoRA

Creative

A lightweight fine-tuning method for style or subject control.

LoRA adapters are common in image generation and open-model workflows. They can capture a character, product, visual style, or domain behavior.


Storyboard

Creative

A shot-by-shot plan for video.

AI video benefits from storyboards because continuity, camera motion, character consistency, and shot duration are difficult to solve with a single prompt.


Inference

Deployment

Running a model to produce an output.

Inference cost and latency depend on model size, hardware, batching, context length, output length, and serving architecture.


Quantization

Deployment

Reducing the numeric precision of a model to save memory and speed up inference.

Quantization lets larger models run on smaller hardware, but it may reduce quality. Test on the target task rather than relying only on benchmarks.
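The trade-off is easiest to see on a handful of numbers. The sketch below applies symmetric 8-bit quantization to a short list of weights and measures the round-trip error. It is one simple scheme chosen for illustration; production methods are more elaborate, and the sample weights are made up.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    # One scale for the whole list: the largest magnitude maps to 127.
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    # Recover approximate floats; the rounding error is not recoverable.
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.031]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, max_error)
```

Each weight now fits in one byte instead of four, at the cost of an error of up to about half the scale per weight.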


Model Routing

Deployment

Sending tasks to different models based on need.

Routing can send easy work to fast, cheap models and hard work to reasoning models. It can improve cost control and reliability.
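A router can start as a few explicit rules. In the sketch below, the model names, task types, and token threshold are all placeholders chosen for illustration; a real router would be tuned against evals and cost data.

```python
def choose_model(task_type: str, input_tokens: int) -> str:
    # Narrow, short tasks go to the small model (placeholder threshold).
    if task_type in {"classification", "extraction"} and input_tokens < 2000:
        return "small-fast-model"
    # Hard multi-step work goes to the reasoning model.
    if task_type in {"code", "math", "planning"}:
        return "reasoning-model"
    # Everything else falls back to a general-purpose default.
    return "general-model"


print(choose_model("classification", 300))
print(choose_model("math", 300))
print(choose_model("summarization", 8000))
```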


Observability

Deployment

Tracking what an AI system does in production.

AI observability covers prompts, retrieved context, tool calls, latency, cost, errors, model versions, user feedback, and eval results.


Latency

Deployment

How long a request takes to complete.

Latency matters for user experience. Streaming, caching, shorter prompts, smaller models, and better routing can reduce perceived wait time.


AI Copilot

Business

An assistant that helps a human complete work.

Copilots support writing, analysis, search, coding, and operations while keeping the human in control of decisions and final output.


AI Agent

Business

A system that performs a workflow with more autonomy.

Agents differ from copilots because they can plan, call tools, track state, and continue across steps. They need stricter permissions and review.


AI Workflow

Business

A repeatable process that includes AI at one or more steps.

Good AI workflows define inputs, outputs, approvals, quality checks, fallback paths, and ownership. They are more valuable than one-off prompts.


Prompt Library

Business

A reusable collection of tested prompts.

A prompt library should include examples, owners, expected outputs, version history, and notes on when the prompt should not be used.


AI Readiness

Business

How prepared a team is to adopt AI responsibly.

Readiness includes data quality, security posture, process clarity, leadership support, training, evaluation, and change management.
