Reusable, pipeline-agnostic data quality framework built on PySpark. Plug into any Databricks notebook, AWS Glue job, or dbt post-hook. All thresholds are driven by YAML config — zero hardcoded values ...
SELECT 'STEP 1 COMPLETE: DEMO TABLES CREATED' AS step_title; STEP 2: INITIAL LOAD (BUILD SCD2 DIM + FACT) FROM OLTP SELECT 'STEP 2: INITIAL LOAD FROM OLTP INTO DEMO OLAP TABLES' AS step_title; ...