Banking Dispute Data Warehouse & Analytics Platform
From fragmented dispute records to a governed Snowflake warehouse with dimensional modeling and live Power BI insights.
The Challenge
A mid-sized bank needed a reliable way to analyze card-payment disputes across customers, cards, merchants, and transactions. Source data arrived from multiple systems with inconsistent schemas, duplicate IDs, mixed data types, and missing lineage. Reporting teams struggled to reconcile numbers and meet audit demands, and dashboards routinely required manual data fixes.
The requirement was clear: centralize raw feeds, enforce data quality and governance, and deliver an analytics-ready model with full traceability from dashboard back to source.
Our Approach
We implemented an ELT pipeline on Snowflake using a Medallion architecture and automated SQL procedures for cleaning, conformance, and dimensional modeling. The end state: a governed Gold layer powering self-serve analytics in Power BI.
- ELT on Snowflake: Land raw files first, then transform inside the warehouse for scale and auditability.
- Medallion Layers: Bronze (raw), Silver (validated), Gold (dimensional star schema).
- Automated Cleaning: Stored procedures for type casting, deduplication, null handling, ZIP normalization, and card expiry validation.
- Star Schema: Fact tables for transactions and dispute events linked to conformed dimensions (Customer, Card, Merchant, Date, Stage, Reason).
- Power BI: Direct model against Gold for dispute trends, resolution SLAs, and merchant performance.
Architecture on Snowflake
Bronze Layer
Raw ingestion of Customers, Cards, Merchants, Transactions, and Disputes. Immutable, timestamped data retained for lineage and reprocessing.
Silver Layer
Cleaning & standardization via SQL procedures: key normalization, safe type conversion, duplicate removal, missing value handling, card expiry parsing (YYYY-MM), and merchant ZIP validation.
Gold Layer
Dimensional model with surrogate keys: FACT_TRANSACTION, FACT_DISPUTE, and dimensions for Customer, Card, Merchant, Date, Dispute Stage, and Dispute Reason. Includes a time-aware table function for “as-of” snapshots.
Dimensional Model (Gold)
The fact tables capture business events; dimensions provide conformed context for fast, consistent analytics.
- FACT_TRANSACTION: One row per cleaned transaction; links to Customer, Card, Merchant, Date; attributes include channel, amount, error code, MCC at time of sale.
- FACT_DISPUTE: Event-grain table with two lifecycle rows per case (Pending, Resolved), stage-keyed with derived days-to-resolution and disputed amount.
- DIM_CUSTOMER / DIM_CARD / DIM_MERCHANT: Conformed entities with surrogate keys; standardized attributes for demographics, product, and MCC/location.
- DIM_DATE: Calendar hierarchy (day, week, month, quarter, year, weekend flag).
- DIM_DISPUTE_STAGE & DIM_DISPUTE_REASON: Standardized lifecycle and reason codes for consistent reporting.
Before vs After
Before

Siloed systems, inconsistent keys and types, duplicate IDs, unclear lineage, manual report fixes.
After

Governed Snowflake ELT, dimensional model with surrogate keys, audit lineage, and live Power BI dashboards.
Data Quality & Governance
- Deterministic deduplication using window functions on business keys (keep latest record).
- Type safety via TRY_TO_NUMBER / TO_DATE; consistent casing and trimmed keys.
- Card expiry validation: normalized YYYY-MM; invalid formats set to NULL for integrity.
- Merchant ZIP normalization: numeric-only, length checks, and leading-zero preservation.
- Temporal analysis: table function
TF_DISPUTE_ASOF(date)to reconstruct backlog “as of” any date. - Audit lineage: load timestamps and schema-qualified paths across Bronze → Silver → Gold.
The Results
- Trustworthy analytics: one conformed model replaces ad-hoc data prep and conflicting numbers.
- Operational visibility: dispute volumes, resolution SLAs, and merchant performance tracked in real time.
- Compliance-ready: audit trail from dashboard to source with reproducible SQL procedures.
- Team efficiency: automated refresh; analysts spend time on insights, not data cleaning.
Impact on the Business
The warehouse provides a single source of truth for dispute analytics. Leaders can evaluate backlog, spot bottlenecks in the resolution process, and compare performance by merchant, card product, or customer segment—all backed by governed data.
Why It Matters
Dispute operations carry financial, customer-experience, and regulatory risk. A dimensional model on Snowflake, fed by automated cleaning and lineage, gives teams the speed and confidence to act on accurate numbers—without manual reconciliation.
Future Readiness
The platform is designed to scale: new feeds (fraud, chargebacks), additional facts/dimensions, near-real-time ingestion via streams/tasks, semantic models, and expanded KPI suites. The Gold layer already supports direct connections from Power BI, with minimal modeling needed in the BI tool.
Want a Governed Dispute Analytics Platform?
We build Snowflake ELT pipelines with dimensional models and audit lineage—ready for live dashboards and regulatory reporting.