The RAG Atlas: A Visual Guide to Retrieval Patterns

Ten RAG architectures, visually mapped with interactive diagrams and a live simulator

12 min read

Select a pattern to inspect its retrieval flow. Hover nodes for details, then adjust controls to see trade-offs in real time.

Vanilla RAG

Retrieval flow (diagram label: top-50): User Query → Embed Query → Vector Search → Top-k Chunks → LLM

The baseline: embed, search, generate

What it is

A single-pass retrieval pattern that embeds the user's query, finds the nearest matching chunks in a vector index, and gives those chunks to the LLM as grounding context.
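The embed, search, generate flow can be sketched in a few lines. This is a minimal illustration, not a production setup: `embed` is a toy bag-of-words stand-in for a real embedding model, the "vector index" is a plain list scanned by brute force, and `build_prompt` is a hypothetical helper that only assembles the text an LLM would receive.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list[str], k: int = 5) -> list[str]:
    """Single-pass retrieval: embed the query, rank every chunk, keep the top k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the grounding context and the question into one prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

Calling `retrieve(query, chunks, k)` and passing the result to `build_prompt` covers the whole pattern; there is no second pass, which is why it is easy to debug and also why an ambiguous query has nothing to recover with.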

Best for

Internal knowledge bases with clean, chunked documents

Core trade-off

Simplest to build and debug vs. brittle on ambiguous queries

Failure mode

Returns plausible-sounding but wrong chunks when queries are ambiguous or the corpus has near-duplicate contradictory passages.

Pros

  • Simplest to build and debug
  • Lowest latency of all RAG variants
  • Single embedding model to manage

Cons

  • Brittle on ambiguous queries
  • No score calibration across chunks
  • Sensitive to chunk size and overlap

Latency: 50–150 ms retrieval + LLM generation. Total: 500–2,000 ms.

Cost: 1× retrieval. ~$0.0001 per query at ada-002 pricing.

Live Simulator

Adjust settings to see trade-off effects

Directional model only: this shows relative behavior, not production benchmarks.

Chunk size: 512 tokens (range: 128 to 2,048)

Smaller chunks improve precision but can miss context; larger chunks add context but can add noise and cost.
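One way to see the chunk-size control in code is a sliding-window chunker. This is a sketch under stated assumptions: tokens are passed in as a pre-split list (a real pipeline would use the embedding model's tokenizer), and the 64-token overlap default is an illustrative value, not one taken from the simulator.

```python
def chunk_tokens(tokens: list[str], chunk_size: int = 512, overlap: int = 64) -> list[list[str]]:
    """Split tokens into windows of at most chunk_size, each sharing `overlap` tokens with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks
```

A smaller `chunk_size` yields more, tighter chunks that match a query more precisely but may cut a passage off from the sentences that explain it; the overlap softens that cut at the cost of storing some tokens twice.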

Top-k: 5 chunks (range: 1 to 20)

Higher k increases recall but expands context and latency. Too high can dilute relevance.
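The way k expands context is simple arithmetic: the retrieved context is at most k times the chunk size. The sketch below makes that explicit; both function names are hypothetical, and the 8,000-token budget in the usage note is an arbitrary example, not a limit of any particular model.

```python
def context_tokens(k: int, chunk_size: int) -> int:
    """Upper bound on retrieved-context size: k chunks of at most chunk_size tokens each."""
    return k * chunk_size

def max_k_within(budget_tokens: int, chunk_size: int) -> int:
    """Largest k whose worst-case context still fits in budget_tokens."""
    return budget_tokens // chunk_size
```

With the simulator defaults, `context_tokens(5, 512)` is 2,560 tokens; at the top of both sliders, `context_tokens(20, 2048)` is 40,960. Under an 8,000-token budget for retrieved context, `max_k_within(8000, 512)` gives 15.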

Reranking usually lowers accuracy risk, but adds extra compute/latency.
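A rerank stage can be sketched as a second scoring pass over the first-stage candidates. The scorer here, `overlap_score`, is a toy assumption standing in for a heavier model such as a cross-encoder; what matters is the shape: one scorer call per candidate, which is where the extra compute and latency come from.

```python
from typing import Callable

def overlap_score(query: str, chunk: str) -> float:
    """Toy relevance score: fraction of query words that appear in the chunk."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

def rerank(query: str, candidates: list[str],
           score: Callable[[str, str], float] = overlap_score,
           top_n: int = 5) -> list[str]:
    """Re-score first-stage candidates and keep the best top_n.

    The sort is stable, so candidates with equal scores keep their first-stage order.
    """
    scored = [(score(query, c), c) for c in candidates]  # one scorer call per candidate
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_n]]
```

A common arrangement is to retrieve a wider candidate set than the LLM will see and let the reranker cut it down to `top_n`, so the cost of reranking scales with the number of candidates rather than with the size of the index.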

Hybrid blends lexical + semantic retrieval to improve recall on exact terms.
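One common way to blend a lexical ranking with a semantic one is reciprocal rank fusion (RRF). The sketch below assumes RRF as the fusion method; the simulator does not say which method it models. Each input is a list of document IDs ordered best-first, and `rrf_k=60` is a commonly used smoothing constant, unrelated to the top-k setting.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], rrf_k: int = 60) -> list[str]:
    """Fuse several best-first rankings: each list adds 1 / (rrf_k + rank) to a document's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
    # Documents ranked well by both retrievers accumulate the highest totals.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
```

Because RRF uses only rank positions, it sidesteps the problem that lexical scores and vector similarities sit on different scales. A document that only the lexical retriever finds, such as an exact part number, still enters the fused list, which is the recall gain on exact terms.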

Latency: low
Cost: low
Accuracy risk: moderate

Lower is better for latency and cost. Lower accuracy risk means safer retrieval behavior.

Meters are relative trade-offs for this pattern, not measured production telemetry.

John Munn

Technical leader building scalable solutions and high-performing teams through strategic thinking and calm, reflective authority.

© 2026 John Munn. All rights reserved.