AI Vision
Skills, tools, briefs, wiki
HomeBriefsSkillsToolsWikiWeekly
Back to wiki
Deployment

Latency

How long a request takes to complete.

Plain-English explanation

Latency matters for user experience. Streaming, caching, shorter prompts, smaller models, and better routing can reduce perceived wait time.

Why it matters

Latency matters because it affects how AI systems are designed, evaluated, priced, or trusted. Knowing the term helps you ask better questions and avoid vague implementation decisions.

  • Ask how it changes quality, cost, speed, or safety.
  • Look for concrete examples in the workflow you are building.
  • Document the tradeoff before choosing a tool or architecture.