What you get from a BayesIQ audit
Real artifacts from a real audit — not a PDF of recommendations. The Audit Kit produces scored findings, column-level profiles, data contracts, metric specs, a deployable dbt project, and interactive dashboards.
Pipeline artifacts
Every audit produces these files. They land in your repo or shared drive — no proprietary portal required.
- `audit_report.md`: Scored findings with severity, root cause, evidence, and fix recommendations. Every issue is tied to a specific event, column, or query.
- `dataset_profile.json`: Column-level profiling for every table, covering data types, null rates, cardinality, top values, and distribution summaries.
- `quality_checks.json`: Machine-readable findings for integration into CI pipelines, alerting systems, or internal dashboards.
- `ASSUMPTIONS.md`: Data contracts documenting schema assumptions, quality expectations, temporal patterns, and entity relationships. Your team signs off before we build.
- `METRICS.md`: Metric definitions with exact formulas, source events, dimensions, granularity, and validation rules.
- dbt project: Complete project with staging models, mart models, schema tests, and source definitions. Ready to deploy to your warehouse.
- Streamlit dashboard: Interactive app with sidebar filters, time series charts, dimension breakdowns, and a data quality summary. Usable from day one.
- `canonicalization_mapping.json`: Naming inconsistencies across platforms and pipelines mapped to canonical forms. Feed it into your dbt project or ETL layer.
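Because `quality_checks.json` is machine-readable, it can be wired straight into a CI gate. A minimal sketch, assuming a findings list with `id`, `severity`, and `summary` fields; the inline JSON and field names are illustrative, not the published schema:

```python
import json

# Hypothetical shape for quality_checks.json: a list of findings with a
# "severity" field. The real schema may differ.
findings = json.loads("""
[
  {"id": "F-001", "severity": "critical", "summary": "checkout_completed fires on payment attempt"},
  {"id": "F-002", "severity": "medium",   "summary": "device_type values inconsistent"}
]
""")

def blocking_findings(findings, fail_on=("critical", "high")):
    """Return the findings severe enough to fail a CI run."""
    return [f for f in findings if f["severity"] in fail_on]

for f in blocking_findings(findings):
    print(f"[{f['severity'].upper()}] {f['id']}: {f['summary']}")
```

A CI job would exit nonzero when the list is non-empty, blocking the pipeline until critical findings are resolved.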
Example findings from audit_report.md
Anonymized excerpt from an Audit Kit run on a B2B SaaS product (~50M events/month). Finding IDs, event names, and property names have been changed.
**`checkout_completed` fires on payment attempt, not payment confirmation — 23% funnel inflation.**
**Root Cause:** Client-side event triggered before async confirmation callback resolves.
**Recommended Fix:** Move event dispatch into confirmation callback; backfill last 90 days using server-side order records.
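The backfill half of that fix amounts to treating server-side order records as ground truth and dropping client events that never confirmed. A sketch with invented identifiers:

```python
# Client-side events include payment attempts that never confirmed.
client_events = [
    {"order_id": "o1", "event": "checkout_completed"},
    {"order_id": "o2", "event": "checkout_completed"},  # attempt only; payment failed
    {"order_id": "o3", "event": "checkout_completed"},
]
# Server-side order records are the source of truth for confirmation.
confirmed_orders = {"o1", "o3"}

# The backfilled history keeps only events backed by a confirmed order.
backfilled = [e for e in client_events if e["order_id"] in confirmed_orders]

inflation = (len(client_events) - len(backfilled)) / len(backfilled)
print(f"funnel inflation in this sample: {inflation:.0%}")
```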
**`user_id` null in 18% of mobile web `page_view` events.**
**Root Cause:** Anonymous session handling does not wait for identity resolution before firing the event.
**Recommended Fix:** Delay event dispatch by 300 ms post-load or use a queue that flushes after identity resolves.
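The queue approach can be sketched as a small buffer that holds events until identity resolves, then flushes with `user_id` attached. This is an illustrative pattern, not a specific SDK's API:

```python
class EventQueue:
    """Buffer events fired before identity resolution; flush on identify."""

    def __init__(self):
        self.user_id = None
        self.pending = []
        self.sent = []

    def track(self, event, properties=None):
        record = {"event": event, "properties": properties or {}}
        if self.user_id is None:
            self.pending.append(record)   # hold until identity resolves
        else:
            self._send(record)

    def identify(self, user_id):
        self.user_id = user_id
        for record in self.pending:       # flush the buffer in order
            self._send(record)
        self.pending.clear()

    def _send(self, record):
        record["user_id"] = self.user_id
        self.sent.append(record)          # stand-in for a network call

q = EventQueue()
q.track("page_view", {"path": "/pricing"})  # fires before identity is known
q.identify("u_123")                         # resolves identity; queue flushes
```

Events tracked before `identify` now reach the pipeline with a populated `user_id` instead of a null.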
**`revenue_daily` excludes late refunds processed after midnight UTC. Net revenue overstated by ~4.2% month-over-month.**
**Root Cause:** JOIN condition uses `transaction_date` instead of `event_date` for the refund table, silently dropping late refunds.
**Recommended Fix:** Update JOIN key to `refund_issued_date`; re-run historical aggregation for the trailing 12 months.
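The effect of the wrong date key can be reproduced with a single made-up refund row, using plain Python as a stand-in for the warehouse SQL:

```python
from collections import defaultdict
from datetime import date

# A refund transacted on Mar 1 but processed just after midnight UTC,
# so it was issued on Mar 2. Rows and field names are illustrative.
refunds = [
    {"transaction_date": date(2024, 3, 1),
     "refund_issued_date": date(2024, 3, 2),
     "amount": 20.0},
]

def refunds_by_day(refunds, date_key):
    """Aggregate refund amounts per day under a chosen date key."""
    totals = defaultdict(float)
    for r in refunds:
        totals[r[date_key]] += r["amount"]
    return dict(totals)

# Keyed by transaction_date, the refund is attributed to Mar 1, a day
# whose revenue_daily row was finalized before the refund existed, so an
# incremental rebuild never picks it up.
print(refunds_by_day(refunds, "transaction_date"))
# Keyed by refund_issued_date, it lands on Mar 2 and is captured.
print(refunds_by_day(refunds, "refund_issued_date"))
```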
**`activation_rate` query doesn't match current definition — stale WHERE clause counts any `feature_used` event instead of three distinct features within 7 days.**
**Root Cause:** Metric query was written before the activation definition was finalized and was never updated.
**Recommended Fix:** Rewrite metric query to match current definition; add a test that checks the query against the spec document.
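The current definition is small enough to encode and test directly, which is essentially what the recommended spec check does. The event shape below is assumed:

```python
from datetime import datetime, timedelta

def is_activated(signup_at, feature_events, window_days=7, required=3):
    """Activation per the current definition: at least `required` distinct
    features used within `window_days` of signup."""
    cutoff = signup_at + timedelta(days=window_days)
    distinct = {e["feature"] for e in feature_events
                if signup_at <= e["timestamp"] <= cutoff}
    return len(distinct) >= required

signup = datetime(2024, 3, 1)
events = [
    {"feature": "export", "timestamp": datetime(2024, 3, 2)},
    {"feature": "export", "timestamp": datetime(2024, 3, 3)},  # repeat use, not distinct
    {"feature": "share",  "timestamp": datetime(2024, 3, 4)},
    {"feature": "invite", "timestamp": datetime(2024, 3, 5)},
]
# The stale query would count any feature_used event (4 here); the current
# definition requires 3 distinct features, which this user meets.
print(is_activated(signup, events))  # True
```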
**`experiment_viewed` deduplicates by session instead of timestamp. Impression counts understated.**
**Root Cause:** Deduplication logic uses session ID instead of a (session_id, timestamp) composite key.
**Recommended Fix:** Update deduplication key; note that historical impression data cannot be corrected.
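The composite-key fix is a one-line change to the dedup key. A minimal sketch with invented rows:

```python
def dedupe(events, key_fields):
    """Keep the first event seen for each unique key tuple."""
    seen, kept = set(), []
    for e in events:
        key = tuple(e[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(e)
    return kept

impressions = [
    {"session_id": "s1", "timestamp": 100, "event": "experiment_viewed"},
    {"session_id": "s1", "timestamp": 200, "event": "experiment_viewed"},  # real second view
]
# Session-only key collapses two real impressions into one (the bug).
print(len(dedupe(impressions, ["session_id"])))               # 1
# Composite key keeps both, deduping only true duplicates.
print(len(dedupe(impressions, ["session_id", "timestamp"])))  # 2
```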
**`device_type` inconsistent across platforms — iOS sends `"iPhone"`, Android sends `"ios"`, web sends `"iOS"`.**
**Root Cause:** Inconsistent client library versions across platforms.
**Recommended Fix:** Standardize on enumerated values; add schema validation rule to catch raw user-agent strings.
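A standardize-plus-validate step might look like the following; the enum and mapping are hypothetical stand-ins for the audit's `canonicalization_mapping.json`:

```python
DEVICE_TYPE_ENUM = {"ios", "android", "web"}

# Hypothetical canonicalization map; the real one comes from the audit.
CANONICAL = {
    "iphone": "ios",
    "ios": "ios",
    "android": "android",
    "web": "web",
}

def normalize_device_type(raw):
    """Map a raw device_type value onto the enum, rejecting anything
    unmapped (e.g. a raw user-agent string slipping through)."""
    canonical = CANONICAL.get(raw.strip().lower())
    if canonical not in DEVICE_TYPE_ENUM:
        raise ValueError(f"unmapped device_type: {raw!r}")
    return canonical

print(normalize_device_type("iPhone"))  # ios
print(normalize_device_type("iOS"))     # ios
```

Raising on unmapped values, rather than passing them through, is what turns a silent naming drift into a caught schema violation.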
Scoring rubric (0–100)
Every audit produces an overall health score. The score reflects the count, severity, and blast radius of confirmed issues.
| Score | Rating | What it means |
|---|---|---|
| 90–100 | Strong | Minor issues only. Data infrastructure is well-maintained and trustworthy. |
| 70–89 | Needs Work | Significant issues requiring attention. Key metrics may be directionally correct but unreliable for precise decisions. |
| 0–69 | At Risk | Critical issues affecting key metrics. Decisions based on this data are likely incorrect. |
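For intuition only, here is one way such a score could be computed. The weights and formula are invented for illustration; they are not the actual BayesIQ rubric:

```python
# Invented severity weights; not the real rubric.
SEVERITY_WEIGHT = {"critical": 15, "high": 8, "medium": 3, "low": 1}

def health_score(findings):
    """Deduct points per finding, scaled up by blast radius (the number
    of downstream metrics or reports the finding affects)."""
    penalty = sum(
        SEVERITY_WEIGHT[f["severity"]] * (1 + 0.1 * f["blast_radius"])
        for f in findings
    )
    return max(0, round(100 - penalty))

findings = [
    {"severity": "critical", "blast_radius": 4},  # e.g. a broken funnel event
    {"severity": "medium", "blast_radius": 1},
]
print(health_score(findings))  # 76, in the "Needs Work" band
```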
Severity definitions
Every finding is ranked by business impact and blast radius: the number of downstream metrics or reports the issue affects.
| Severity | Definition | Typical action |
|---|---|---|
| Critical | Metric is systematically wrong. Decisions made on this data are likely incorrect. | Fix before next reporting cycle. |
| High | Significant inaccuracy in a key metric. Risk of misleading product or business decisions. | Fix in 2–4 weeks. |
| Medium | Partial data loss or inconsistency. Metric is directionally correct but unreliable for precise decisions. | Schedule in next sprint. |
| Low | Minor discrepancy or edge-case gap. Negligible business impact at current scale. | Address opportunistically. |
Engagement timeline — 6 weeks
A full engagement runs 6 weeks from kickoff to validated dashboards. Diagnostic sprints deliver findings in 1 week.
Ingest + Automated Pipeline + Expert Review
Weeks 1–2: Architecture review, access setup, logging spec collection. Automated pipeline profiles every table and column, flags anomalies, and generates scored findings. Data scientists review results, eliminate false positives, and assess root causes.
Assumptions Sign-off + Metric Specification
Weeks 3–4: `ASSUMPTIONS.md` and `METRICS.md` delivered. Your team reviews data contracts and metric definitions — this is the alignment gate. Nothing gets built until both sides agree on what the data should look like.
dbt Build + Dashboards + Training
Weeks 5–6: Auto-generated dbt project with staging/mart models and schema tests. Interactive Streamlit dashboards built on validated metrics. Handoff session with your team covering the dbt project, dashboard usage, and ongoing monitoring.
See it on your data
Drop a CSV in the playground for instant profiling, or book a diagnostic sprint.
Get in Touch