The RAG Atlas: A Visual Guide to Retrieval Patterns

Ten RAG architectures, visually mapped with interactive diagrams and a live simulator

February 20, 2026

12 min read


[Interactive diagram: Vanilla RAG retrieval flow. Legend: Query, Embeddings, Chunks, Scores, SQL/tool.]

Vanilla RAG

The baseline: embed, search, generate

What it is

A single-pass retrieval pattern that embeds the user's query, finds the nearest matching chunks in a vector index, and gives those chunks to the LLM as grounding context.
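The single pass described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector index, and `build_prompt` is a hypothetical helper that only assembles the text an LLM would receive.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Score every chunk against the query and keep the top_k nearest."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

def build_prompt(query: str, hits: list[tuple[float, str]]) -> str:
    """Assemble the grounding context that would be sent to the LLM."""
    context = "\n".join(f"- {chunk}" for _, chunk in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Refunds take up to 5 business days to process.",
    "The office is closed on public holidays.",
    "Refunds require an order number.",
]
query = "How long do refunds take to process?"
hits = retrieve(query, chunks, top_k=2)
prompt = build_prompt(query, hits)
print(prompt)
```

The generation step is left out on purpose: in a real system the prompt would go to whichever LLM client is in use.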

Best for

Internal knowledge bases with clean, chunked documents

Core tradeoff

Simplest to build and debug, but brittle on ambiguous queries

Failure mode

Returns plausible-sounding but wrong chunks when queries are ambiguous or the corpus has near-duplicate contradictory passages.

Details

Pros

+ Simplest to build and debug
+ Lowest latency of all RAG variants
+ Single embedding model to manage

Cons

− Brittle on ambiguous queries
− No score calibration across chunks
− Sensitive to chunk size and overlap

Latency: 50–150 ms for retrieval, plus LLM generation. Total: 500–2000 ms.

Cost: 1× retrieval. ~$0.0001 per query at ada-002 pricing.
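To see how a per-query embedding cost of this size could arise, here is a rough arithmetic sketch. The rate of $0.0001 per 1,000 tokens is an assumption made for illustration, not a figure taken from this article, and should be checked against current provider pricing. `embedding_cost_usd` is a hypothetical helper.

```python
# Assumed rate for illustration only; verify against current pricing.
ASSUMED_PRICE_PER_1K_TOKENS_USD = 0.0001

def embedding_cost_usd(num_tokens: int,
                       price_per_1k: float = ASSUMED_PRICE_PER_1K_TOKENS_USD) -> float:
    """Cost of embedding num_tokens tokens at a flat per-1K-token rate."""
    return num_tokens / 1000 * price_per_1k

short_query = embedding_cost_usd(20)    # a one-sentence question
long_query = embedding_cost_usd(1000)   # a very long query
print(f"{short_query:.7f} USD, {long_query:.7f} USD")
```

Under that assumed rate, the quoted figure corresponds to embedding roughly 1,000 tokens, so a short query would cost less. LLM generation cost is separate and not modeled here.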

Live Simulator

Adjust settings to see trade-off effects

Directional model only: this shows relative behavior, not production benchmarks.

Chunk size: 512 tokens (range: 128 to 2,048)

Smaller chunks improve precision but can miss context; larger chunks add context but can add noise and cost.
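The chunk-size trade-off is easier to see with a small fixed-size chunker. This is a sketch under stated assumptions: tokens are plain list items rather than real tokenizer output, and `chunk_tokens` and its `overlap` parameter are hypothetical names chosen for illustration.

```python
def chunk_tokens(tokens: list[str], chunk_size: int, overlap: int = 0) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks

tokens = [f"t{i}" for i in range(10)]
small = chunk_tokens(tokens, chunk_size=4, overlap=1)
large = chunk_tokens(tokens, chunk_size=8, overlap=1)
print(len(small), len(large))
```

Smaller windows produce more, narrower chunks, each of which can match a query more precisely but carries less surrounding context. Larger windows produce fewer chunks that each cost more tokens when placed in the prompt.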

Top-k: 5 chunks (range: 1 to 20)

Higher k increases recall but expands context and latency. Too high can dilute relevance.
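A toy calculation shows the effect of k on recall, precision, and context size. The ranked relevance labels below are invented for illustration, and `at_k` is a hypothetical helper. Context size is approximated as k times the chunk size.

```python
def at_k(ranked_relevance: list[bool], k: int, chunk_size: int) -> dict:
    """Recall, precision, and approximate context size for the top k results."""
    top = ranked_relevance[:k]
    total_relevant = sum(ranked_relevance)
    found = sum(top)
    return {
        "recall": found / total_relevant if total_relevant else 0.0,
        "precision": found / len(top) if top else 0.0,
        "context_tokens": len(top) * chunk_size,
    }

# Invented ranking: True marks a chunk that is actually relevant.
ranked = [True, False, True, False, False, True, False, False, False, False]
low = at_k(ranked, k=2, chunk_size=512)
high = at_k(ranked, k=10, chunk_size=512)
print(low)
print(high)
```

In this made-up ranking, raising k from 2 to 10 recovers every relevant chunk but lowers the share of relevant chunks in the context and multiplies the context size by five.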

Reranker

Reranking usually lowers accuracy risk but adds compute and latency.
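The two-stage shape of retrieve-then-rerank can be sketched as follows. Both scorers are toy stand-ins chosen for illustration: the first stage counts shared words, and `rerank_score` counts shared adjacent word pairs in place of a real reranking model such as a cross-encoder. All function names are hypothetical.

```python
import re

def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def first_stage_score(query: str, doc: str) -> int:
    """Cheap score: number of distinct words shared with the query."""
    return len(set(tokens(query)) & set(tokens(doc)))

def rerank_score(query: str, doc: str) -> int:
    """Toy stand-in for a reranking model: shared adjacent word pairs."""
    def bigrams(ts: list[str]) -> set:
        return set(zip(ts, ts[1:]))
    return len(bigrams(tokens(query)) & bigrams(tokens(doc)))

def first_stage(query: str, docs: list[str], candidates: int) -> list[str]:
    """Keep the best `candidates` documents by the cheap score."""
    return sorted(docs, key=lambda d: first_stage_score(query, d), reverse=True)[:candidates]

def retrieve_then_rerank(query: str, docs: list[str], candidates: int = 2, top_k: int = 1) -> list[str]:
    """Re-score only the first-stage candidates with the costlier scorer."""
    shortlist = first_stage(query, docs, candidates)
    return sorted(shortlist, key=lambda d: rerank_score(query, d), reverse=True)[:top_k]

docs = [
    "Reset the router before you change the password.",
    "To reset password, open account settings.",
    "Router firmware updates are released monthly.",
]
query = "reset password"
print(first_stage(query, docs, candidates=2)[0])
print(retrieve_then_rerank(query, docs)[0])
```

The first two documents tie on the cheap score, so the first stage alone returns them in input order. The second pass breaks the tie, at the cost of scoring every shortlisted candidate again.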

Hybrid search

Hybrid blends lexical + semantic retrieval to improve recall on exact terms.
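One common way to blend a lexical ranking with a semantic one is reciprocal rank fusion (RRF). The article does not say which fusion method the simulator models, so treat this as one possible sketch. The document IDs and both input rankings are invented for illustration.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# Invented rankings: the lexical list favors the exact-term match (doc_sku),
# the semantic list favors a paraphrase (doc_a).
lexical = ["doc_sku", "doc_a", "doc_b"]
semantic = ["doc_a", "doc_c", "doc_sku"]
fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Documents that appear in both lists rise above those that appear in only one, which is how an exact-term match that the semantic search ranks low can still reach the final context. The constant `k=60` is a commonly used default.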

Latency: low

Cost: low

Accuracy risk: moderate

Lower is better for latency and cost. Lower accuracy risk means safer retrieval behavior.

Meters are relative trade-offs for this pattern, not measured production telemetry.