name: dataset-reconciler
description: Compare two datasets by key, isolate missing rows and field-level differences, and summarize reconciliation exceptions clearly.
version: "1.0.0"
Runtime Configuration
version: "1.0.0"
gotcha_pack: "sql-data-gotcha-pack"
gotcha_pack_version: "1.0.0"
gotcha_enforcement: "block_on_high"
Purpose
Compare two datasets and explain the differences clearly.
Workflow
- Define the comparison key and expected grain.
- Standardize data types and key formatting.
- Identify rows only in left, only in right, and in both.
- Compare important numeric and text fields.
- Bucket exceptions by issue type.
- Summarize count and amount deltas.
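The workflow steps can be sketched in plain Python as follows. This is a minimal illustration, not the skill's actual script. The function names (`normalize_key`, `index_by_key`, `reconcile`), the field names, and the sample rows are all hypothetical. It assumes each dataset is a list of dicts, the expected grain is one row per key, and amounts are non-null values that parse as decimals.

```python
from collections import Counter
from decimal import Decimal


def normalize_key(raw):
    """Standardize key formatting: cast to string, trim whitespace, uppercase."""
    return str(raw).strip().upper()


def index_by_key(rows, key_field):
    """Index rows by normalized key; fail if the grain is not one row per key."""
    indexed = {}
    for row in rows:
        key = normalize_key(row[key_field])
        if key in indexed:
            raise ValueError(f"duplicate key {key!r}: grain is not one row per key")
        indexed[key] = row
    return indexed


def total(indexed, amount_field):
    """Sum one amount field; assumes every amount is non-null (hypothetical rule)."""
    return sum((Decimal(str(r[amount_field])) for r in indexed.values()), Decimal("0"))


def reconcile(left_rows, right_rows, key_field, amount_field, text_fields=()):
    left = index_by_key(left_rows, key_field)
    right = index_by_key(right_rows, key_field)

    exceptions = []  # tuples of (key, issue_type, detail)
    for key in sorted(left.keys() - right.keys()):
        exceptions.append((key, "only_in_left", None))
    for key in sorted(right.keys() - left.keys()):
        exceptions.append((key, "only_in_right", None))
    for key in sorted(left.keys() & right.keys()):
        l_amt = Decimal(str(left[key][amount_field]))
        r_amt = Decimal(str(right[key][amount_field]))
        if l_amt != r_amt:
            exceptions.append((key, "amount_mismatch", l_amt - r_amt))
        for field in text_fields:
            if str(left[key][field]).strip() != str(right[key][field]).strip():
                exceptions.append((key, f"text_mismatch:{field}", None))

    return {
        "count_delta": len(left) - len(right),
        "amount_delta": total(left, amount_field) - total(right, amount_field),
        "by_issue": dict(Counter(issue for _, issue, _ in exceptions)),
        "exceptions": exceptions,
    }


# Hypothetical sample data: keys differ only in case and padding.
left_rows = [
    {"id": " a1 ", "amount": "10.00", "status": "open"},
    {"id": "A2", "amount": "5.50", "status": "closed"},
    {"id": "A3", "amount": "7.00", "status": "open"},
]
right_rows = [
    {"id": "A1", "amount": "10.00", "status": "open"},
    {"id": "a2", "amount": "5.00", "status": "closed"},
    {"id": "A4", "amount": "1.00", "status": "open"},
]
summary = reconcile(left_rows, right_rows, "id", "amount", ["status"])
```

Using `Decimal` rather than `float` keeps amount comparisons exact, which matters when small deltas are themselves the exceptions being reported.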
Output format
- Comparison setup
- Reconciliation summary
- Exception categories
- Python script
- Next action
Gotcha Enforcement
Every reconciliation script must satisfy the rules below before it is emitted. A HIGH violation blocks output. A MEDIUM violation is reported in the Exception summary with an explanation.
| ID | Sev | Check |
|---|---|---|
| G003 | HIGH | Every aggregation documents its NA/null behavior; sums apply the same null treatment on both sides |
| G007 | HIGH | Reconciliation uses an independent access path rather than re-running the same transform |
| G012 | HIGH | Confirm grain alignment, period alignment, and filter parity before comparing totals |
| G015 | MEDIUM | A net-zero variance triggers a mandatory segment-level breakdown before the reconciliation is declared clean |
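Two of these checks lend themselves to a short sketch: G003 (documented, symmetric null handling) and G015 (segment breakdown on a net-zero variance), plus the rule that HIGH findings block output. The code below is one possible interpretation under stated assumptions, not the gotcha pack's actual implementation. The function names, the finding structure, and the `region` segment field are hypothetical; nulls are assumed to arrive as `None`.

```python
from collections import defaultdict
from decimal import Decimal


def null_aware_sum(values):
    """Sum amounts. Null behavior: None values are skipped and counted, never treated as 0."""
    present = [Decimal(str(v)) for v in values if v is not None]
    return sum(present, Decimal("0")), len(values) - len(present)


def segment_deltas(left_rows, right_rows, segment_field, amount_field):
    """Per-segment (left minus right) deltas; only non-zero segments are returned."""
    deltas = defaultdict(lambda: Decimal("0"))
    for sign, rows in ((1, left_rows), (-1, right_rows)):
        for row in rows:
            if row[amount_field] is not None:  # same null rule as null_aware_sum
                deltas[row[segment_field]] += sign * Decimal(str(row[amount_field]))
    return {seg: d for seg, d in deltas.items() if d != 0}


def check_totals(left_rows, right_rows, segment_field, amount_field):
    # The same summing function runs on both sides, so null treatment matches (G003).
    left_total, left_nulls = null_aware_sum([r[amount_field] for r in left_rows])
    right_total, right_nulls = null_aware_sum([r[amount_field] for r in right_rows])

    findings = []
    offsets = segment_deltas(left_rows, right_rows, segment_field, amount_field)
    if left_total == right_total and offsets:
        # Totals agree, but segments offset each other: not clean yet (G015).
        findings.append({"id": "G015", "severity": "MEDIUM", "detail": offsets})

    return {
        "variance": left_total - right_total,
        "nulls_skipped": {"left": left_nulls, "right": right_nulls},
        "findings": findings,
    }


def output_blocked(findings):
    """HIGH findings block output; MEDIUM findings are reported but do not block."""
    return any(f["severity"] == "HIGH" for f in findings)


# Hypothetical sample: totals match (150 vs 150) while regions offset each other.
left_rows = [
    {"region": "EU", "amount": "100"},
    {"region": "US", "amount": "50"},
    {"region": "US", "amount": None},
]
right_rows = [
    {"region": "EU", "amount": "90"},
    {"region": "US", "amount": "60"},
]
report = check_totals(left_rows, right_rows, "region", "amount")
```

In this sample the overall variance is zero, yet the breakdown surfaces a +10 delta in one region and a -10 delta in the other, which is the case G015 exists to catch. G007 and G012 concern how the data is sourced and filtered, so they are not covered by this sketch.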