Sources
Operational systems, SaaS tools, files, and event streams that originate data.
Operational objective: Keep source contracts stable so downstream ingestion is predictable release to release.
Key dependency: Producer ownership and explicit communication of schema and version changes.
First response when this stage degrades: Freeze ingestion from failing sources and enforce contract validation gates.
What can go wrong: Undefined data contracts let breaking source changes ship without notice, and ingestion failures spike in the next release cycle.
Typical techniques:
These controls are commonly used here to reduce repeat incidents and stabilize handoffs to downstream stages.
Source contracts
Define producer/consumer expectations before schema changes ship.
Schema registry
Version and validate schemas so incompatible events are blocked early.
Data profiling
Baseline distributions to catch null spikes, type drift, and outliers.
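As a rough illustration of the first two techniques, the sketch below checks records against a contract and flags schema changes that would break existing consumers. Everything in it is hypothetical: the FieldSpec class, the ORDERS_CONTRACT_V1 contract, and its field names are invented for the example, and a real deployment would more likely rely on a schema registry's own compatibility checks than on hand-rolled code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical contract entry: expected Python type and nullability."""
    type_: type
    nullable: bool = False


# Hypothetical contract for an "orders" source (names are illustrative only).
ORDERS_CONTRACT_V1 = {
    "order_id": FieldSpec(str),
    "amount": FieldSpec(float),
    "coupon": FieldSpec(str, nullable=True),
}


def validate_record(record: dict, contract: dict) -> list:
    """Return the contract violations for one record (empty list if it conforms)."""
    violations = []
    for name, spec in contract.items():
        if name not in record:
            violations.append(f"missing field: {name}")
            continue
        value = record[name]
        if value is None:
            if not spec.nullable:
                violations.append(f"null in non-nullable field: {name}")
        elif not isinstance(value, spec.type_):
            violations.append(
                f"type mismatch in {name}: expected {spec.type_.__name__}, "
                f"got {type(value).__name__}"
            )
    return violations


def breaking_changes(old: dict, new: dict) -> list:
    """List changes in `new` that would break consumers written against `old`."""
    problems = []
    for name, spec in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name].type_ is not spec.type_:
            problems.append(f"type changed: {name}")
        elif new[name].nullable and not spec.nullable:
            problems.append(f"became nullable: {name}")
    return problems
```

A validation gate could call validate_record on a sample of incoming records and refuse the batch when violations appear, while breaking_changes could run in the producer's release pipeline before a new schema version ships. Fields added in the new version are treated as non-breaking here, which is an assumption that depends on how consumers parse records.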
Signals to watch: Source coverage • Contract violation rate • Schema change frequency • Null rate by source
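One of these signals, null rate by source, can be computed and compared against a profiling baseline with very little code. The sketch below is a minimal example under stated assumptions: each record is a dict carrying a "source" key, the baseline is a dict of previously observed null rates, and the 0.05 tolerance is an arbitrary illustrative threshold rather than a recommended value.

```python
from collections import defaultdict


def null_rate_by_source(records, field):
    """Fraction of records per source where `field` is missing or None."""
    totals = defaultdict(int)
    nulls = defaultdict(int)
    for rec in records:
        source = rec["source"]  # assumed key identifying the originating system
        totals[source] += 1
        if rec.get(field) is None:
            nulls[source] += 1
    return {source: nulls[source] / totals[source] for source in totals}


def null_spikes(current, baseline, tolerance=0.05):
    """Sources whose null rate exceeds the baseline by more than `tolerance`.

    The comparison is an absolute difference; sources absent from the
    baseline are compared against a rate of 0.0.
    """
    return sorted(
        source
        for source, rate in current.items()
        if rate - baseline.get(source, 0.0) > tolerance
    )
```

For example, with hypothetical sources "crm" and "billing", calling null_rate_by_source(records, "email") and passing the result to null_spikes along with last week's rates would return the sources worth investigating first.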
