Case Study · BI Engineering · March 2026

Power BI Automated Measure Testing with PBIP

semantic-model validation · pbip · testing
A portfolio-safe framing of the validation workflow

Summary

What changed, in short

Semantic-model changes became reviewable code: readable diffs, explicit DAX risk flags, and a deployment path that states what is being promoted and why. Running on 10+ production datasets and public for inspection.

Outcome-first metrics

Outcome signal: Public repo + Mar 2026 talk; reusable TMDL/PBIR inspection and DAX risk detection workflow for ongoing production maintenance

Focus: Production maintenance validation workflow

Industry: BI Engineering

My role: Sole designer and maintainer of the validation workflow

Tools: PBIP, TMDL, PBIR, Python

Problem

What needed to change

Maintaining 10+ production Power BI datasets and reports alone made measure-level regressions easy to miss. Changes were reviewed manually, if at all, and confidence in deployment dropped as the models grew. The underlying issue: a `.pbix` opened and saved is a review black box — nothing compares the model before and after.

Context / Constraints

What shaped the work

Industry context: BI Engineering. Primary focus: Semantic Model Engineering. The scope and sequencing were shaped by concrete delivery constraints.

  • Solo maintenance of 10+ production datasets and reports without a structured validation step

Approach

How the work was handled

Moved the dataset and report assets to PBIP format so TMDL (model) and PBIR (report) are plain-text files Git can track. Added a Python inspection layer that surfaces structural diffs and runs pattern-based risk checks against DAX before deployment. AI-assisted drafting generates candidate test scenarios; review gates keep anything automated from auto-approving itself into production.
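
On disk, a PBIP project is a folder of plain-text assets. A simplified layout (file and folder names here are illustrative and the exact contents vary by Power BI Desktop version):

MyReport.pbip
MyReport.SemanticModel/
    definition/
        model.tmdl
        tables/
            Sales.tmdl
MyReport.Report/
    definition/
        report.json
        pages/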

Outcome

What changed in practice

A reusable workflow that turns semantic-model changes into reviewable code: every change produces a diff a reviewer can read, every risky DAX pattern raises an explicit flag, and the deployment path is explicit about what is being promoted and why. Running on 10+ datasets and public for inspection.

  • Built the workflow to support solo maintenance of 10+ production datasets and reports.
  • Implemented TMDL and PBIR inspection for semantic model structure checks (see the sketch after this list).
  • Added DAX risk detection before deployment.
  • Generated draft validation scenarios while preserving review control.
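
The structural-check half of the inspection is conceptually small: parse the measures out of the TMDL files on two versions of the model, then compare the maps. A minimal sketch, assuming the parsing step has already produced {measure name: DAX expression} dicts (function and variable names are hypothetical):

def diff_measures(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report added, removed, and changed measures between two model versions."""
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            name for name in before.keys() & after.keys()
            if before[name] != after[name]
        ),
    }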

My Role

Where I contributed most

Sole designer and maintainer of the validation workflow

Trade-offs / Lessons

Choices, constraints, and what mattered

  • Built the workflow around PBIP, Python, and AI-assisted tooling.
  • Kept AI-assisted output review-controlled rather than auto-approved.
  • Turned an internal maintenance need into a public repo and Mar 2026 speaker topic.

Additional Notes

Extra implementation detail

What the workflow looks like

The core idea: treat every Power BI change as a code change with a reviewable diff.

TMDL is plain text, so a measure change shows up as a Git diff a reviewer can read line by line (simplified):

measure 'Revenue (YTD)' =
    CALCULATE(
-       SUM('Sales'[Amount]),
+       SUM('Sales'[NetAmount]),
        DATESYTD('Date'[Date])
    )
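
Because the model lives in the repo as plain text, producing that diff needs no special tooling; scoping a review to model changes can be as simple as (branch name illustrative):

git diff main -- '*.tmdl'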

The Python inspection layer parses TMDL and PBIR files and runs pattern-based risk checks against DAX changes. A simplified risk rule looks like this:

from __future__ import annotations

from dataclasses import dataclass

@dataclass
class RiskFlag:
    level: str    # "warn" blocks promotion review; "info" is advisory
    message: str

# Helper predicates (pattern matchers over the DAX text) are defined elsewhere.
def check_measure(expression: str, model: Model) -> list[RiskFlag]:
    """Run pattern-based risk checks against one DAX measure expression."""
    flags: list[RiskFlag] = []
    # Leaning on whichever relationship is active silently produces wrong numbers.
    if uses_related_without_userelationship(expression):
        flags.append(RiskFlag(
            level="warn",
            message="Measure relies on the active relationship; "
                    "pin with USERELATIONSHIP or document the assumption.",
        ))
    # Calculated columns are materialized at refresh, so new dependencies add cost.
    if references_calculated_column(expression, model):
        flags.append(RiskFlag(
            level="info",
            message="Calculated column dependency — confirm refresh cost.",
        ))
    return flags
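
A sketch of how those flags could gate a pre-deployment check; the loop and the exit-code convention below are an assumed harness, not the repo's actual CLI:

def gate(measures: dict[str, str], model: Model) -> int:
    """Exit non-zero when any measure raises a warn-level flag."""
    warn_count = 0
    for name, expression in measures.items():
        for flag in check_measure(expression, model):
            print(f"[{flag.level}] {name}: {flag.message}")
            if flag.level == "warn":
                warn_count += 1
    return 1 if warn_count else 0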

AI-assisted drafting produces candidate test scenarios — YoY boundary dates, blank-slicer combinations, measure-interaction edge cases. A reviewer approves, edits, or rejects each one. The AI never deploys.
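
One plausible shape for a drafted scenario as it waits for review, purely illustrative rather than the repo's actual schema:

scenario = {
    "measure": "Revenue (YTD)",
    "case": "Jan 1 boundary: DATESYTD window with no prior-year rows",
    "expected": "BLANK(), not 0",
    "status": "pending_review",  # a reviewer flips this to approved / edited / rejected
}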

Why PBIP matters here

Without PBIP, reviewing a Power BI change means opening two files in Desktop and squinting. With PBIP, the change is text; diffs, linting, and automated checks become possible without leaving Git. The workflow above is downstream of that one format decision.

Highlights

  • Built from the need to maintain 10+ production datasets and reports with stronger validation.
  • TMDL / PBIR inspection turns Power BI changes into reviewable diffs, not screenshots.
  • Pattern-based DAX risk detection catches known-bad constructs before deploy.
  • AI-assisted scenario drafting preserves the human review step — useful, not autonomous.
  • Public repo and Mar 2026 speaker session for peer inspection.

Contact

Need similar help with reporting, model quality, or BI delivery?

Start with the current constraint, what needs to change, and where delivery risk is showing up now.