Ensure "Zero Keys" or "Ghost Records" exist in Hubs to handle late-arriving data or missing lookups without breaking the model. 3. Data Integrity & Reconciliation This ensures that "what went in is what came out."
All source data (even if "dirty") must be stored in the Raw Vault.
Data Vault excels at "insert-only" logic. Your tests should mirror this.
Verify every Hub has a unique business key and no duplicates. Ensure the Load_Date and Record_Source are present.
Intentionally feed "bad" data into the pipeline to ensure it is caught by error Satellites rather than crashing the load. Success Metrics (KPIs)
Data Vault should allow for high-concurrency loading.
Insert a record with a modified attribute. Verify that a new Satellite record is created with the updated data while the old record remains (historical tracking).