feat(powerbi): walk IfExpression branches and expand NativeQuery platform support #16757

Open
askumar27 wants to merge 3 commits into master from feat/powerbi-pattern-handlers

askumar27 commented Mar 24, 2026

📋 Summary

Two targeted fixes to the PowerBI M-Query lineage layer that unblock previously-silent lineage extractions:

  1. IfExpression support — the M-Query resolver now walks both branches of if/then/else expressions, so conditional data sources (e.g. dev/prod environment switching) emit lineage from both environments instead of silently producing nothing.
  2. NativeQuery platform expansion — Value.NativeQuery wrapping Sql.Database (MSSql) or PostgreSQL.Database now proceeds to SQL parsing instead of bailing out with "unsupported platform", enabling table-level lineage extraction from inline SQL strings on those platforms.

🎯 Motivation

After replacing the Lark parser with the Microsoft powerquery-parser bridge (#16685), the resolver became accurate — but the pattern handler layer had two gaps that silently dropped lineage for common real-world patterns:

  1. Conditional data sources — M-Query expressions that switch between environments (dev/prod) using if/then/else produced zero lineage because _walk() had no handler for IfExpression nodes.
  2. MSSql and Postgres Value.NativeQuery — inline SQL wrapped in Value.NativeQuery(Sql.Database(...), "SELECT ...", null) silently bailed out because Sql.Database and PostgreSQL.Database were not in the NativeQueryLineage platform map.

🔧 Changes Overview

Change 1 — IfExpression support in the resolver (resolver.py)

Before: Any M-Query with a conditional data source hit the unhandled fallthrough and logged a debug message. Zero lineage emitted.

// Before: produces NO lineage
let
    Source = if IsProduction
        then Sql.Database("prod-server", "prod_db")
        else Sql.Database("dev-server", "dev_db"),
    mytable = Source{[Schema="dbo", Item="orders"]}[Data]
in
    mytable

After: Both branches are walked and lineage is captured from both environments.

// After: produces lineage for BOTH branches
urn:li:dataset:(urn:li:dataPlatform:mssql,prod_db.dbo.orders,PROD)
urn:li:dataset:(urn:li:dataPlatform:mssql,dev_db.dbo.orders,PROD)

DataHub captures lineage from both branches rather than guessing which is active — consistent with how ListExpression (used by Table.Combine) is handled.

The implementation uses seen.copy() for the false branch, matching the ListExpression pattern, so sibling paths don't incorrectly trigger circular-reference detection.
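
The branch-walking idea can be sketched as below. This is a minimal illustration, not the code in resolver.py: the `Call`, `Ref`, and `IfExpr` node classes and the `walk` signature are hypothetical stand-ins for the real AST and `_walk()`. Because this sketch mutates `seen` in place, it takes the snapshot for the false branch before walking the true branch; the actual resolver may arrange this differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Union


@dataclass
class Call:
    """A data-access call such as Sql.Database("server", "db") (hypothetical node)."""
    function: str
    args: List[str]


@dataclass
class Ref:
    """A reference to another `let` variable (hypothetical node)."""
    name: str


@dataclass
class IfExpr:
    """if <condition> then <true_branch> else <false_branch> (hypothetical node)."""
    condition: "Node"
    true_branch: "Node"
    false_branch: "Node"


Node = Union[Call, Ref, IfExpr]


def walk(node: Node, env: Dict[str, Node], seen: Set[str]) -> List[Call]:
    """Collect every data-access call reachable from `node`."""
    if isinstance(node, Call):
        return [node]
    if isinstance(node, Ref):
        if node.name in seen:
            return []  # already on this path: treat as a circular reference
        seen.add(node.name)
        if node.name not in env:
            return []  # e.g. a parameter with no `let` binding
        return walk(env[node.name], env, seen)
    if isinstance(node, IfExpr):
        # The condition only selects a branch, so it is not walked as a source.
        # Snapshot `seen` first so names visited in the true branch do not
        # look like cycles when the false branch visits them.
        false_seen = seen.copy()
        found = walk(node.true_branch, env, seen)
        found += walk(node.false_branch, env, false_seen)
        return found
    return []  # unhandled node kind: no lineage


# Mirrors the dev/prod example above, with each branch bound to a variable.
env = {
    "Prod": Call("Sql.Database", ["prod-server", "prod_db"]),
    "Dev": Call("Sql.Database", ["dev-server", "dev_db"]),
    "Source": IfExpr(Ref("IsProduction"), Ref("Prod"), Ref("Dev")),
}
print([call.args[1] for call in walk(Ref("Source"), env, set())])
# prints ['prod_db', 'dev_db']
```

If both branches referenced the same variable, a single shared `seen` set would mark it as visited after the true branch and the false branch would return nothing; the per-branch copy is what keeps sibling paths independent.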


Change 2 — NativeQuery platform expansion (pattern_handler.py)

Before: Value.NativeQuery only recognized Snowflake, Redshift, and DatabricksMultiCloud as valid inner platforms. Wrapping Sql.Database(...) or PostgreSQL.Database(...) silently returned empty lineage.

// Before: produces NO lineage — Sql.Database not in platform map
let
    Source = Value.NativeQuery(
        Sql.Database("myserver", "mydb"),
        "SELECT * FROM dbo.orders",
        null
    )
in
    Source

After: Sql.Database (MSSql) and PostgreSQL.Database are now recognized, so SQL parsing proceeds and table-level lineage is extracted from the raw query string.

// After: SQL is parsed and lineage is extracted
urn:li:dataset:(urn:li:dataPlatform:mssql,mydb.dbo.orders,PROD)
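
A rough sketch of the platform-map guard follows. Everything here is illustrative: the map name, the three pre-existing function keys, the platform strings other than `mssql`, and the function signatures are assumptions rather than the actual contents of pattern_handler.py. SQL parsing is out of scope, so the table names are passed in as if already extracted from the query string.

```python
from typing import Dict, List

# Hypothetical map: inner M data-access function -> DataHub platform name.
# The last two entries stand in for the ones this PR adds.
NATIVE_QUERY_PLATFORMS: Dict[str, str] = {
    "Snowflake.Databases": "snowflake",
    "AmazonRedshift.Database": "redshift",
    "DatabricksMultiCloud.Catalogs": "databricks",
    "Sql.Database": "mssql",
    "PostgreSQL.Database": "postgres",
}


def is_native_parsing_supported(function_name: str) -> bool:
    """Return True when the inner function maps to a known platform."""
    return function_name in NATIVE_QUERY_PLATFORMS


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Format a dataset URN in the shape shown in the examples above."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def native_query_lineage(
    function_name: str, database: str, tables: List[str]
) -> List[str]:
    """Build upstream URNs, or return nothing for an unsupported platform."""
    if not is_native_parsing_supported(function_name):
        return []  # the old behavior for Sql.Database and PostgreSQL.Database
    platform = NATIVE_QUERY_PLATFORMS[function_name]
    return [dataset_urn(platform, f"{database}.{table}") for table in tables]


print(native_query_lineage("Sql.Database", "mydb", ["dbo.orders"]))
# prints ['urn:li:dataset:(urn:li:dataPlatform:mssql,mydb.dbo.orders,PROD)']
print(native_query_lineage("Excel.Workbook", "mydb", ["dbo.orders"]))
# prints []
```

Keeping the supported platforms in a single lookup means adding a platform is a one-line change, which matches the description of this fix as extending the map rather than touching the handler classes.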

🏗️ Architecture/Design Notes

  • IfExpression follows the same open/closed pattern as the other _walk() node handlers — early-return per kind, with seen.copy() for independent execution paths.
  • No changes to the pattern handler classes themselves — NativeQueryLineage already had the is_native_parsing_supported() guard and current_data_platform field; only the platform map needed extending.
  • Sql.Databases (plural / three-level navigation) is handled in a separate PR (feat(ingest/powerbi): support 'Sql.Databases' M-Query data access function #16616) and intentionally excluded here.

🧪 Testing

  • test_if_expression_walks_both_branches — integration test using a real M-Query with if/then/else; asserts both branch URNs are present in lineage output.
  • test_native_query_mssql_and_postgres_supported — asserts Sql.Database and PostgreSQL.Database are recognized; asserts unknown platforms (e.g. Excel.Workbook) are not.

📊 Impact Assessment

  • Affected components: m_query/resolver.py, m_query/pattern_handler.py
  • Breaking changes: None — purely additive; previously-unhandled nodes now emit lineage instead of silently returning empty.
  • Risk level: Low — no changes to existing node handlers or pattern classes; new code paths only activate for IfExpression nodes and the two new platform map entries.

🔗 References

…form support

- Add IfExpression handler to _walk() in resolver.py so conditional data
  sources (e.g. dev/prod switching) produce lineage from both branches;
  uses seen.copy() for the false branch matching the ListExpression pattern
- Expand NativeQueryLineage.SUPPORTED_NATIVE_QUERY_DATA_PLATFORM to include
  Sql.Database (MSSql) and PostgreSQL.Database, enabling Value.NativeQuery
  inline-SQL lineage extraction for those platforms
@github-actions

Linear: ING-2052

github-actions bot added the ingestion label (PR or Issue related to the ingestion of metadata) on Mar 24, 2026

codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

askumar27 changed the title from "feat(ingest/powerbi): walk IfExpression branches and expand NativeQuery platform support" to "feat(powerbi): walk IfExpression branches and expand NativeQuery platform support" on Mar 24, 2026
@datahub-connector-tests

Connector Tests Results

All connector tests passed for commit d7c1af3

To skip connector tests, add the skip-connector-tests label (org members only).

Autogenerated by the connector-tests CI pipeline.

maggiehays added the needs-review label (Label for PRs that need review from a maintainer) on Mar 25, 2026