Releases: databrickslabs/lakebridge

v0.11.3

29 Dec 21:54
b6a901f

Analyzer

  • Optimized SAS Analyzer performance by consolidating regex operations, delivering roughly a 7x speed improvement for large-scale SAS analysis workloads.
  • Added support for the new SSIS components Microsoft.Pivot, Microsoft.UnPivot, and ExtensibleFileTask, broadening coverage for SSIS package migration analysis.
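
Consolidating regex operations, as in the SAS Analyzer change above, generally means replacing one scan per pattern with a single compiled alternation. The following is a minimal sketch of that general technique only. The pattern names and SAS constructs are illustrative assumptions and are not taken from the Lakebridge Analyzer source, and the sketch makes no claim about the speed-up reported above.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; not the analyzer's real rule set.
PATTERNS = {
    "data_step": r"\bdata\s+\w+",
    "proc_sql": r"\bproc\s+sql\b",
    "macro_def": r"%macro\s+\w+",
}

# One compiled alternation with named groups replaces one scan per pattern.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in PATTERNS.items()),
    re.IGNORECASE,
)


def count_constructs(source: str) -> Counter:
    """Count occurrences of each construct in a single pass over the text."""
    counts: Counter = Counter()
    for match in COMBINED.finditer(source):
        # lastgroup is the name of the alternative that matched.
        counts[match.lastgroup] += 1
    return counts


sample = "data work.a; run;\nproc sql; quit;\n%macro load; %mend;\nDATA b; run;"
print(count_constructs(sample))
```

A combined pattern reports non-overlapping matches only, so it is equivalent to separate scans only when the individual patterns cannot match overlapping text.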

Converters – Morpheus

  • Core

    • Significantly improved ANTLR parsing performance by merging grammars, refactoring ambiguous rules, and updating the Scala integration and build pipeline for the new grammar workflow.
    • Allowed the STREAMS token to be used as an identifier so patterns like SELECT * FROM streams.foo.bar now parse correctly in Snowflake-oriented SQL.
    • Updated error reporting to use the following levels:
      • Info: no error, the input was fully translated
      • Hint: the input was fully translated but some irrelevant bits have been elided
      • Warning: the input was translated but with unsupported bits
      • Error: the input couldn't be translated
  • MSSQL / T-SQL / SQL Server

    • Added full support for SQL Server T-SQL CREATE INDEX and table-level index directives, parsing them into a new index IR and translating to CLUSTER BY AUTO in Databricks SQL so index statements are no longer rejected.

    • Extended grammar and parsing to handle T-SQL computed columns, QUOTENAME calls, GROUP options in query hints, DROP INDEX statements, and additional keywords like PARAMETERS, STREAMS, PROCEDURES, and VIEWS, improving coverage of real-world T-SQL workloads.

    • Improved DML parsing so INSERT targets use proper dot identifiers instead of expression-like forms, preventing misinterpretation as function calls and preserving case sensitivity where required.

    • Re-enabled and migrated T-SQL functional tests to a YAML-based format, expanding automated coverage and keeping still-failing cases isolated for follow-up.
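
The four error-reporting levels listed under Core (Info, Hint, Warning, Error) can be modelled as an ordered enumeration when post-processing conversion results. This is a hedged sketch: the numeric ordering, the `overall` aggregation, and the `is_usable` rule are assumptions made for illustration, not part of Morpheus.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical ordering of the four levels; higher means worse."""
    INFO = 0     # fully translated, no problems
    HINT = 1     # fully translated, irrelevant parts elided
    WARNING = 2  # translated, but with unsupported parts
    ERROR = 3    # could not be translated


def overall(severities: list[Severity]) -> Severity:
    """Report the worst level seen; an empty list counts as INFO."""
    return max(severities, default=Severity.INFO)


def is_usable(severity: Severity) -> bool:
    """Treat anything below ERROR as having produced translated output."""
    return severity < Severity.ERROR


print(overall([Severity.INFO, Severity.WARNING, Severity.HINT]).name)
```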

Converters – BladeBridge

  • MSSQL / SSIS / T-SQL

    • Resolved issues with column names containing single quotes and standardized DATEADD and DATEDIFF function patterns to improve compatibility across target SQL dialects.
  • DataStage

    • Implemented mapping for the JulianDayFromDate function with corresponding tests, extending DataStage function coverage in the converter.

    • Enhanced DataStage Spark and workflow handling by adding Databricks cluster sections, improving widget default handling, and mapping TransformStringToDate and spark.sqltemplate attributes for smoother Spark migrations.

Reconcile

  • Improved reconciliation hash query generation to guarantee consistent column ordering across SQL dialects, preventing false hash mismatches when column names are substrings of each other.

  • Reverted the Oracle reconcile implementation to use MD5 via DBMS_CRYPTO.HASH with RAWTOHEX, restoring compatibility with Oracle 11 while keeping the updated QueryBuilder engine handling.
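
Why column ordering matters for hash-based reconciliation can be shown with a small sketch. A row hash is computed over concatenated values, so both sides must concatenate in the same order or identical rows hash differently. The sorting rule, the `|` delimiter, and the column names below are assumptions for illustration; they do not describe how Reconcile builds its queries.

```python
import hashlib


def row_hash(row: dict[str, object], columns: list[str]) -> str:
    """Hash a row by concatenating values in one fixed column order."""
    ordered = sorted(columns, key=str.lower)  # assumed ordering rule
    payload = "|".join(str(row[name]) for name in ordered)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


# Column names that are substrings of one another, as in the fix above.
source_row = {"id": 1, "order_id": 42, "order_id_src": "a"}
target_row = {"order_id_src": "a", "order_id": 42, "id": 1}

# The two sides list the columns in different orders, yet the hashes agree.
same = (row_hash(source_row, ["id", "order_id", "order_id_src"])
        == row_hash(target_row, ["order_id_src", "id", "order_id"]))
print(same)
```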

Documentation

  • Added practical details about how to extend BladeBridge configurations

Dependency updates:

  • Bump actions/checkout from 5 to 6 (#2158).

Contributors: @asnare, @sundarshankar89, @dependabot[bot], @m-abulazm, @BesikiML

v0.11.2

11 Dec 22:53
4190672

Analyzer

  • Normalized complexity categories in the analyzer from “COMPLEX/VERY_COMPLEX” to “HIGH/VERY_HIGH” for clearer reports.

Converters

Morpheus

Snowflake

  • Implemented full support for DECLARE, LET, and assignment statements to better handle procedural Snowflake scripts.
  • Added support for DROP PROCEDURE statements, improving Snowflake DDL coverage.

TSQL/Synapse

  • Cleaned up grammar by removing duplicate and unsupported rules for TSQL special functions, reducing ambiguity and improving parser stability.
  • Implemented full support for DECLARE, LET, and assignment statements in TSQL, enabling richer stored procedure conversion.
  • Added support for TSQL DROP PROCEDURE statements to improve parity with source DDL.
  • Updated handling of options such as ANSI_NULLS and QUOTED_IDENTIFIER to emit informative comments instead of errors when they do not apply to Databricks SQL.
  • Enhanced handling of SET NOCOUNT by emitting comments explaining its behavior in Databricks SQL and warning when NOCOUNT OFF is used.
  • Allowed PRECISION to be used as an identifier (for example, c.precision), fixing parsing issues with such column names.
  • Improved handling of EXEC statements by detecting well‑known stored procedures like sp_executesql and issuing more specific diagnostics.
  • Added translation of OBJECT_ID() checks into EXISTS queries against catalog metadata to preserve control flow in procedural TSQL.
  • Added warnings for unsupported PRINT statements by generating explanatory comments rather than hard errors.
  • Added parsing support for the Synapse RENAME OBJECT syntax, currently surfaced as an unsupported but recognized construct.

Generic Morpheus engine

  • Enabled attaching comments and error markers to empty code blocks so that diagnostics are preserved in rendered SQL.
  • Prevented semicolons from being printed after empty statements to keep output formatting consistent.
  • Bundled multiple column-level primary keys into composite table constraints to produce more correct DDL.
  • Allowed the identifier PRECISION in general parsing contexts, improving compatibility with more schemas.
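
Bundling column-level primary keys into one composite table constraint can be illustrated as follows. This is a minimal sketch with a hypothetical `Column` model and renderer, not the Morpheus IR or its code generator, and it ignores details such as nullability requirements on key columns.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical column model, not the Morpheus IR."""
    name: str
    data_type: str
    primary_key: bool = False


def render_create_table(table: str, columns: list[Column]) -> str:
    """Render DDL, hoisting column-level PRIMARY KEY flags into one constraint."""
    lines = [f"  {c.name} {c.data_type}" for c in columns]
    key_columns = [c.name for c in columns if c.primary_key]
    if key_columns:
        # A single table-level constraint covers every flagged column.
        lines.append(f"  PRIMARY KEY ({', '.join(key_columns)})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n)"


ddl = render_create_table(
    "order_lines",
    [
        Column("order_id", "INT", primary_key=True),
        Column("line_no", "INT", primary_key=True),
        Column("sku", "STRING"),
    ],
)
print(ddl)
```

Emitting `PRIMARY KEY` once per column would declare several competing keys, whereas a table allows only one primary key, which is why a single composite constraint is the more correct rendering.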

BladeBridge

MSSQL / TSQL

  • Improved handling of MERGE statements, including insertion of semicolons before MERGE in statement breaking and correct ordering of MATCHED and NOT MATCHED clauses.
  • Fixed issues when converting updates on temporary tables into MERGE statements and added tests to guard the behavior.
  • Improved statement categorization by stripping comments before categorization and simplifying legacy comment-key handling.
  • Added a new handler for nested static strings and inline comments, improving function substitution and parser robustness.

Generic BladeBridge engine

  • Enhanced logging configuration to produce clearer diagnostics while keeping noise manageable.

Reconcile

  • Added support for specifying a catalog for Databricks sources in Reconcile and prompting for the source catalog when necessary.
  • Removed redundant Reconcile configuration parameters to simplify setup.

General

  • Improved handling of output from LSP servers by safely chunking very long stderr lines and logging critical processing errors, preventing hangs and unbounded memory use.
  • Adjusted JDBC handling to accept usernames and passwords via Spark options instead of embedding credentials in the JDBC URL, improving support for special characters in passwords.
  • Consolidated the automated test suite to keep only unit and integration scopes, simplifying test configuration.
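
Passing credentials as Spark options rather than inside the JDBC URL, as described above, can be sketched like this. `url`, `dbtable`, `user`, and `password` are standard Spark JDBC data source options; the helper function, host name, and credentials are placeholders invented for the example.

```python
def jdbc_options(url: str, user: str, password: str, table: str) -> dict[str, str]:
    """Build Spark JDBC reader options with credentials kept out of the URL."""
    return {
        "url": url,            # no user/password embedded here
        "dbtable": table,
        "user": user,          # passed as separate options, so characters
        "password": password,  # such as '@', ';' or '&' need no URL escaping
    }


options = jdbc_options(
    url="jdbc:sqlserver://example-host:1433;databaseName=sales",  # placeholder
    user="recon_user",
    password="p@ss;word&1",
    table="dbo.orders",
)
print(sorted(options))
# With a SparkSession named `spark`, this could be used as:
# df = spark.read.format("jdbc").options(**options).load()
```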

Dependency Updates

  • Dependencies: update documentation (yarn) packages by @asnare in #2178

Full Changelog: v0.11.1...v0.11.2

Contributors: @m-abulazm, @asnare, @sundarshankar89

v0.11.1

26 Nov 23:26
338e93c

Analyzer

No updates in this release.

Converters

General

  • Improved end-to-end migration behavior through tighter integration with the centralized Morpheus function mapping layer and expanded cross-dialect coverage

Morpheus

Snowflake

  • Centralized SQL function mappings and expanded cross-dialect coverage, improving Snowflake-to-Databricks SQL conversions and reducing noisy, non-actionable warnings.
  • Added full translation support for Snowflake exception blocks, enabling richer error-handling logic to be preserved when converting to Databricks SQL.

TSQL / SQL Server

  • Reworked SQL function handling so most mappings are centralized, making TSQL-to-Databricks SQL conversions more accurate and easier to extend for future Lakebridge-based migrations.
  • Implemented full support for TSQL TRY/CATCH constructs, including THROW/RAISERROR-style logic and helper-based error handling, improving the fidelity of translated control-flow and error semantics.

BladeBridge

TSQL / SQL Server

  • Fixed handling of T-SQL column alias syntax in SELECT statements so aliases are no longer mistaken for variable assignments, and removed a deprecated alias-normalization method to improve translation accuracy.
  • Resolved failures caused by nested comments, improved post-conversion handling for shell and Python wrapper scripts, and ensured labeled UPDATE/DELETE statements that translate to MERGE remain correctly embedded in SQL.
  • Corrected processing of SELECT statements without a FROM clause when assigning to variables, so expressions like variable increments and severity mappings are handled reliably during migration.
  • Improved “delete by source” MERGE translations so separators and DELETE placement are preserved, and fixed static string handling so T-SQL patterns that use square brackets are not misinterpreted as identifier quoting or ranges.

Reconcile

No updates in this release

Documentation

  • Clarified that Python 3.14 is not yet supported and updated macOS instructions to recommend Python 3.13 as the latest supported version
  • Expanded installation prerequisites with detailed Databricks workspace requirements, authentication options, network and repository access expectations, and a comprehensive pre-installation checklist aimed at enterprise and security-restricted environments

General

  • Increased the maximum stderr line size accepted from LSP servers during transpilation to prevent crashes or hangs when converters emit very large log lines
  • Reduced noise from LSP integrations by lowering stderr mirroring from INFO to DEBUG level, ensuring detailed logs remain available for troubleshooting without cluttering normal operation logs
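
Both this release and v0.11.2 deal with very long stderr lines emitted by LSP servers. One way to keep memory bounded while reading such output is to cap how much is read per call. The sketch below shows that idea under stated assumptions: the function name and the 64 KiB default are hypothetical and are not the values or code used by Lakebridge.

```python
import io
from typing import BinaryIO, Iterator

MAX_CHUNK = 64 * 1024  # assumed limit, not the value Lakebridge uses


def iter_stderr_chunks(stream: BinaryIO, limit: int = MAX_CHUNK) -> Iterator[str]:
    """Yield stderr text line by line, splitting lines longer than `limit` bytes."""
    while True:
        # readline(limit) returns at most `limit` bytes, so memory stays bounded
        # even if the child process never emits a newline.
        chunk = stream.readline(limit)
        if not chunk:
            break
        # A split may land inside a multi-byte character; replace rather than fail.
        yield chunk.decode("utf-8", errors="replace").rstrip("\r\n")


fake_stderr = io.BytesIO(b"short\n" + b"x" * 25 + b"\nlast\n")
for piece in iter_stderr_chunks(fake_stderr, limit=10):
    print(len(piece), piece)
```

Each yielded piece can then be logged at the desired level (DEBUG in this release) without buffering the whole line first.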

Contributors: @asnare, @andresgarciaf

v0.11.0

07 Nov 22:05
04f1df6

🎉 New Features

This release introduces two exciting new capabilities to Lakebridge:

Synapse Profiler

A powerful new Synapse Profiler feature is now available to help you analyze and profile your Synapse data. Refer to the documentation for usage details and examples.

Switch LLM Converter

Introducing the new Switch LLM converter, expanding Lakebridge's conversion capabilities. Refer to the documentation for usage details and examples.


Other updates

Converters

General

Conversion Output Fix
Fixed a bug where files nested 2 or more directories deep within the input directory could fail to be written out after conversion when the directory structure wasn't already in place.

Morpheus

Code Formatting Improvements
Refactored code formatting logic by introducing a tree-like structure in CodeBlock and a new CodeBlockRenderer to handle whitespace, comments, and error positioning, making the formatting system more maintainable and accurate.

TSQL

Added support for translating TSQL join hints (like REPLICATE and MERGE) to their Databricks SQL equivalents by transforming them into special /*+ ... */ comments after the SELECT keyword, while unsupported hints are flagged as annotated errors.

BladeBridge

SQL Server

  • Fixed SELECT INTO real table syntax, corrected LIKE pattern handling, and mapped unsupported FUNC_ROW_NUMBER function while removing ANON_NOLOCK.
  • Resolved an issue where CASE WHEN expressions as the last statement in a file generated incorrect semicolon placement in SQL scripts.
  • Added fragment breaker before GO keyword and removed unsupported COMMIT TRANSACTION and CREATE INDEX constraints.
  • Fixed T-SQL UPDATE statements that were not correctly converted to MERGE operations in specific cases.
  • Corrected fragment handling around SELECT and UNION statements, and fixed issues with IF condition blocks and error handling blocks being mixed up.
  • Removed SET IDENTITY_INSERT and BEGIN/COMMIT TRANSACTION statements, and changed INT GENERATED ALWAYS AS IDENTITY to BIGINT GENERATED ALWAYS AS IDENTITY.
  • Added validation check for converted MERGE statements, implemented global variable reset in init_hook subroutine, and performed code refactoring.
  • Fixed T-SQL DELETE statements that were not correctly converted to MERGE operations and added corresponding test cases.

Reconcile

Oracle

Improved Oracle support with the following enhancements:

  • Fixed Oracle JDBC URL by moving credentials out of URL into options and correcting thin syntax
  • Updated hashing/expression pipeline to replace RAWTOHEX(...), 2 with UTL_I18N.STRING_TO_RAW(...,'AL32UTF8'), 4 (SHA-256)
  • Fixed schema comparison for Oracle
  • Tweaked datatype parsing in default transformations for Oracle compatibility
  • Added Oracle jars in setup script
  • Extended integration scaffolding and added end-to-end tests

Snowflake

  • Fixed schema comparison for Snowflake
  • Adjusted log levels by demoting noisy warnings to debug/info
  • Added Snowflake jars in setup script
  • Extended integration scaffolding

Documentation

Added documentation for deploying reconciliation dashboards and updated documentation notebooks.

Full Changelog: v0.10.13...v0.11.0

Contributors: @goodwillpunning, @hiroyukinakazato-db, @sundarshankar89, @asnare, @m-abulazm, @dependabot[bot], @bishwajit-db

v0.10.13

28 Oct 04:02
51b2a05

Analyzer

  • Added defensive code so the DataStage analyzer no longer crashes when it encounters empty array references

Converters

Morpheus

General

  • Enhanced name representation consistency - Major refactoring that replaces String representations with Expression types for table names, column names, and constraints across IR nodes, improving SQL/PySpark code generation accuracy

  • Fixed DBT parsing issues - Resolved template parsing problems by changing template markers to !#Jinja0001#! format and improving whitespace handling for proper tokenization

TSQL (Synapse/SQL Server)

  • Support for dual OUTPUT clauses in TSQL INSERT/DELETE/UPDATE statements - Enhanced T-SQL parser to handle complex statements with multiple OUTPUT clauses (OUTPUT ... INTO ... OUTPUT ...) with comprehensive test coverage

  • Fixed TSQL DECLARE statement handling - Refactored DECLARE statement processing by moving logic to dedicated visitor methods and properly marking unsupported statements for future implementation

  • Improved BLOCK structure parsing for BEGIN and BEGIN TRY statements - Updated parser grammar to support flexible scripting blocks and transaction handling, allowing zero or more statements in control flow constructs

  • Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation

Snowflake

  • Fixed Snowflake connection tests - Internal improvements for database connection test reliability

  • Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation

BladeBridge

General

  • Automatically creates and cleans up temporary folders for embedded SQL conversion in wrapper scripts - Improves workflow management by implicitly creating temp folders and cleaning them up once conversion is complete

MSSQL (SQL Server)

  • Enhanced table variable and temporary table conversion - Added support for table variable conversion to temporary tables and improved string handling with logic to convert double single quotes to double quotes

  • Fixed semicolon placement in nested select statements - Resolved issue where semicolons appeared before comments in nested select statements

  • Improved MS SQL procedure handling - Added LIMIT 1 for Set in select statements, enhanced function mappings, fixed string concatenation, and removed unsupported constraints

Reconcile

No updates in this release.

Documentation

No updates in this release.

Contributors: @gueniai, @sundarshankar89

v0.10.12

16 Oct 21:11
2bf8292

Analyzer

  • New installation verification command - Introduced a new command to verify successful installation of the Lakebridge Analyzer, displaying usage and available flags for report file paths, source directories, and source technologies

Converters

General

  • Enhanced transpile command - Updated transpile command to support --overrides-path and --target-technology arguments for greater flexibility and customization

  • Improved error handling - Enhanced handling of parsing errors during code transpilation to output transpiled code instead of original input, providing clearer outcomes when issues arise

  • Refactored naming conventions - Renamed transpiler product_name to transpiler_id throughout the codebase for improved consistency and clarity

Morpheus

TSQL

  • Enhanced TSQL support - Added support for DENY statements, EXEC statement syntax improvements, COLLATION in CREATE TABLE column definitions, and WINDOW clause functionality

  • Improved ALTER DATABASE support - Enhanced support for all options on ALTER DATABASE SET statements and multiple LOG file specifications in ALTER DATABASE ADD LOG

  • Better JOIN functionality - Added support for all join hints (MERGE, HASH, LOOP, REDUCE, REPLICATE, REDISTRIBUTE) in JOIN constructs

  • Enhanced COPY INTO support - Fixed syntax for COPY INTO commands and added extended column definitions support in TSQL mode

  • Improved DELETE operations - Added transformation rule to translate IN to EXISTS when needed in DELETE statement WHERE clauses

Snowflake

  • COPY INTO improvements - Refactored and standardized grammar rules for COPY INTO commands, consolidating stage location handling

  • UPDATE FROM enhancements - Added tests for UPDATE FROM statements to verify correct transpilation to MERGE INTO statements

General

  • Enhanced permission handling - Added support for column-specific privileges and improved handling of column-specific permissions

  • Improved parser functionality - Allowed SCHEMAS keyword to be used as identifier and clarified warning messages for unrecognized functions

BladeBridge

MSSQL

  • Fixed update_to_merge functionality - Improved WITH clause handling and script variable ordering for MSSQL dialects

  • Table variable support - Implemented table variable conversion support for MSSQL dialects

  • DDL operation fixes - Fixed and removed unsupported DDL operations including alter index, switch partitions, and drop constraints

Informatica

  • Power Center improvements - Fixed hanging issue on Linux for Informatica PC conversion by improving block_subst patterns and output flushing

  • Dataframe implementation fixes - Fixed dataframe implementation for pulling data from flat file unconnected lookups in Informatica Power Center

DataStage

  • TRUNCATE TABLE support - Added spark.sql_template to resolve TRUNCATE TABLE statement generation when TRUNCATE flag is enabled in DataStage

Reconcile

  • Enhanced Databricks schema queries - Fixed Databricks schema query to improve accuracy and reliability of schema reconciliation, with better column name consistency and filtering

Documentation

  • Updated CLI documentation - Refreshed documentation to reflect latest changes in Command Line Interface menus, including new commands and flags such as transpile, reconcile, and install-transpile subcommands

  • Enhanced command documentation - Added detailed documentation for transpile command usage and flags, including optional flags for catalog name, error file path, and source dialect

  • Updated installation guides - Modified installation documentation to include verification examples and updated help flags for new command options

Dependency updates:

  • Updated cryptography requirement from <45.1.0,>=44.0.2 to >=44.0.2,<46.1.0 (#2028).

  • Bump databrickslabs/sandbox/acceptance@acceptance/v0.4.2 from 0.4.2 to 0.4.4 (#1833).

Contributors: @asnare, @sundarshankar89, @m-abulazm, @dependabot[bot], @gueniai

v0.10.11

03 Oct 21:53
b8259a7

Analyzer

No updates in this release

Converters

General

  • Fixed special character handling in filenames by introducing from_uri() helper function for safer URI handling
  • Ensured SQL converter returns UTF-8 encoded files for proper character encoding
  • Fixed filename to correctly output databricks_conversion_supplements.py supplemental file
  • Fixed broken splitter URL by updating directory naming conventions from "Downloads" to "downloads"
  • Improved handling of encoding-related errors by catching UnicodeDecodeError and LookupError exceptions during file processing, creating TranspileError with specific encoding-error codes instead of stopping
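
The encoding-error handling in the last item can be sketched as follows: both exception types are caught and turned into an error record so that processing of other files can continue. `ConversionError` and its error codes are hypothetical stand-ins invented for this example; they are not Lakebridge's `TranspileError` or its actual codes.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Union


@dataclass
class ConversionError:
    """Hypothetical stand-in for an error record; not Lakebridge's TranspileError."""
    code: str
    path: Path
    message: str


def read_source(path: Path, encoding: str = "utf-8") -> Union[str, ConversionError]:
    """Read a source file, turning encoding problems into error records."""
    try:
        return path.read_text(encoding=encoding)
    except UnicodeDecodeError as exc:
        # The bytes on disk are not valid in the requested encoding.
        return ConversionError("decode-error", path, str(exc))
    except LookupError as exc:
        # The encoding name itself is unknown to Python.
        return ConversionError("unknown-encoding", path, str(exc))
```

A caller can check `isinstance(result, ConversionError)` and record the problem for that file while continuing with the remaining inputs.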

Morpheus

Snowflake

  • Added support for TRUNCATE TABLE statements with proper IR and translation support
  • Correctly support $IDENTITY and $ROWGUID system variables
  • Refactored and extended grammar and AST support for SQL procedure creation with improved handling of raw string literals
  • Enhanced schema reconciliation functionality to support Snowflake arrays, addressing the corner case where Databricks arrays are typed and Snowflake arrays are untyped

TSQL

  • Added support for TRUNCATE TABLE statements with proper IR and translation support
  • Support full CREATE and ALTER INDEX statements in TSQL parsing, rejecting INDEX CREATE/ALTER statements sensibly instead of raising syntax errors
  • Fixed implementation of IF scripting blocks with improvements to SQL parser, grammar enhancements, and enhanced scripting grammar for more robust handling of block statements and conditional branches
  • Allow CLUSTERED to be an identifier to improve CREATE TABLE syntax as a CONSTRAINT qualifier
  • Support percentage expressions in TSQL options (e.g., OPT = 42%) instead of raising parsing errors
  • Added support for REVOKE statements, similar to existing GRANT statement implementation
  • Ensure that ROWS and OBJECTS can be used as identifiers even with Jinja templates
  • Correctly support $IDENTITY and $ROWGUID system variables

General (Multiple Dialects)

  • Support comments on column declarations when generating SQL and renamed legacy builders for consistency
  • Refactored IR around CREATE FUNCTION and CREATE PROCEDURE, unifying all ways to create stored procedures under a single CreateStoredProcedure IR node and all ways to create user defined functions under a single CreateUDF IR node
  • Implemented grammar and IR placeholders for named windows, introducing initial support for the SQL standard WINDOW clause in parser grammar

BladeBridge

Oracle

  • Removed unsupported Oracle DDL constraints (add/create constraint unique) and extraneous TBLPROPERTIES from converted output

MSSQL

  • Added handle_xml_nodes function for MS SQL processing
  • Fixed multiple MSSQL issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, and parameter data types

Synapse

  • Fixed multiple Synapse issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, parameter data types, error handling in stored procedures, and Synapse-specific features (e.g., table distribution)

Teradata

  • Added Teradata function mappings including ZEROIFNULL, TEMPORAL_TIMESTAMP, TRYCAST, ANY, FIRST, NULLIFZERO, DECODE with different parameter counts, and HASHAMP
  • Removed collect statistics and lock table statements

DataStage

  • Implemented DataStage Checksum component translation to its SparkSQL equivalent and fixed the PySpark checksum translation to use MD5() instead of SHA2()

Reconcile

  • Added handling for special characters in reconcile aggregate, enhancing the library to handle special characters in column names by properly delimiting identifiers in SQL queries
  • Fixed deploy reconcile jobs by updating wheel file handling, simplifying deployment process to use single wheel path, and fixing broken documentation links

Documentation

  • Fixed download link in docs (reconcile automation) by replacing broken markdown link with JSX link utilizing useBaseUrl hook

General

  • Implemented new describe-transpile CLI subcommand that describes installed transpilers, including their versions, configuration paths, and supported source dialects
  • Switched from urllib to requests library for making HTTP calls to PyPI and Maven Central, with default 60-second timeout and improved error handling
  • Work around DATABRICKS_HOST normalization issue during install and uninstall by introducing new Lakebridge subclass with appropriate workspace client

Special thanks to @BrianDeacon for his contribution to fix #1858

Contributors: @asnare, @m-abulazm, @ihor-ki, @goodwillpunning, @sundarshankar89, @dependabot[bot]

v0.10.10

25 Sep 03:56
3678301

Analyzer

  • Large XML file chunking optimization: The analyzer can now handle large XML files (up to 1 TB in size)

Converters

General

  • Non-interactive transpiler installation: Introduced support for a non-interactive installation mode, with a new interactive option that can detect the environment context, enabling automated installations without user input while preserving existing configurations. Resolves #2013

Morpheus

  • Enhanced GRANT statement support: Implemented comprehensive GRANT statement support by creating dedicated permission.g4 grammar file with IR definitions and translation rules for permission-related statements

  • Improved error handling: Rewrote print function to properly handle newlines and added extensive unit tests for error annotation, including block and FIXME comments. Resolves #2030

  • Enhanced LSP server behavior: Improved LSP server to append original text to error messages when transpilation fails, eliminating need for client-side response manipulation

  • Standardized dialect options: Aligned dialect options to present synapse and mssql to users for consistency with bladebridge

  • Fixed Lateral Column Alias handling: Enhanced dealiasing for Lateral Column Aliases (LCAs) in WHERE clauses under CASE...WHEN expressions. Resolves #1767

  • Enhanced GROUP BY/aggregation function dealiasing: Implemented dealiasing for Lateral Column Aliases in GROUP BY clauses and aggregation functions where LCA references are unsupported. Resolves #956 and #954

  • Optimized Snowflake transformations: Reordered transformation rules to ensure TransformWithinGroup processes all cases before the call mapper. Resolves #1231

BladeBridge

  • Enhanced merge statement handlers: Improved merge statement processing to fix backtick handling, update operations without WHERE clauses, procedure conversions, IF-THEN-SET blocks, and various delimiter and mapping issues

  • Fixed view creation with WITH clauses: Corrected CREATE VIEW functionality to properly handle WITH clause statements

  • Oracle script improvements: Resolved variable declaration issues in Oracle scripts containing exception handling blocks

  • SQL Server function mapping: Added function mappings for Microsoft SQL Server functions including GETUTCDATE, IS_MEMBER, SERVERPROPERTY variants, and QUOTENAME with one or two arguments

Reconcile

  • Improved logging for aggregate reconciliation: Enhanced logging functionality to provide more accurate messages by replacing warning logs with informational messages when aggregate details rules are empty, indicating successful reconciliation with no details to store. Resolves #2040

  • Refactored aggregate query building: Simplified code using AggregateQueryBuilder class to generate queries for both source and target in a more concise and efficient manner

Documentation

No updates in this release

Dependency updates:

  • Bump actions/setup-python from 5 to 6 (#1988).

Contributors: @m-abulazm, @asnare, @dependabot[bot], @sundarshankar89

v0.10.9

12 Sep 20:58
9332501

Analyzer

  • Fixed bug where Analyzer would crash with large DDL files
  • Adjusted calculation of complexity for TSQL queries to make it more accurate

Transpilers

Morpheus

  • T-SQL Updates

    • Advanced Statement Support: Added parsing for CREATE CERTIFICATE, CREATE LOGIN, and PRINT commands, and EXECUTE AS LOGIN statements
    • SET Command Enhancements: Support for complex assignment operators (+=, -=, *=, /=, %=, &=, ^=, |=) commonly used in T-SQL scripts
    • CREATE EXTERNAL TABLE: Improved parsing with flexible syntax for external table definitions and location specifications
    • GRANT/REVOKE Statements: Comprehensive support for T-SQL security statements with clear Unity Catalog migration guidance
    • DROP Commands: Enhanced handling of DROP SENSITIVITY and other specialized DROP variants
    • Improved Error Reporting: SQL output now includes FIXME comments with detailed error messages for unsupported constructs
  • Snowflake Updates

    • Analytics Functions: Full parsing support for MATCH_RECOGNIZE clause with pattern analysis capabilities for complex analytical queries
    • Time Travel Queries: Enhanced handling of CHANGES, AT, and BEFORE clauses for historical data access patterns
    • REGEXP_INSTR Function: Complete implementation supporting all 7 parameters (vs Databricks' 2), providing accurate behavioral translation
    • Table-Valued Functions: Support for parsing inline table-valued functions commonly used in Snowflake
    • GRANT/REVOKE Statements: Full support for Snowflake's complex privilege management syntax including roles and shares
    • DROP Commands: Enhanced parsing for DROP SENSITIVITY and related data governance statements
    • Improved Error Reporting: SQL output now includes FIXME comments with detailed error messages for unsupported constructs

Dependency updates:

  • Bump actions/checkout from 4 to 5 (#1928).
  • Bump actions/upload-pages-artifact from 3 to 4 (#1964).
  • Bump mermaid from 11.6.0 to 11.10.1 in /docs/lakebridge (#1956).

Contributors: @dependabot[bot], @asnare, @m-abulazm

v0.10.8

09 Sep 03:54
4c377cd

Transpilers

General

  • SQL Validation Enhancement: Improved SQL validator to check only SQL outputs with enhanced error handling and support for various transpile results (#1949)
  • Error Handling Improvements: Added static error lookups for specific cases like unresolved routines and columns, with more readable exception messages
  • MIME Support: New functionality to support both MIME and non-MIME transpile results, including validation and output file management
  • LSP Server Integration: Log level now passed to Language Server Protocol (LSP) server via environment variable for greater flexibility (#1967)
  • Transpiler Auto-Upgrade: Enhanced installer to automatically upgrade existing Lakebridge transpilers during CLI upgrade process (#1978)
  • Source Dialect Handling: Fixed missing transpile source dialect handling to ensure correct assignment in configuration objects (#1985)

Morpheus

  • Enhanced Snowflake Conversion support:

    • Support for parsing ILIKE, EXCLUDE, REPLACE, RENAME with * LHS
    • Full support for EXCLUDE and RENAME clauses and all combinations
    • Fixed REPLACE function with optional third argument
    • Enhanced OBJECT_DELETE to accept 2 or more arguments
    • Accurate translation of Snowflake's REGEXP_REPLACE
  • Parser Improvements:

    • Allow lists of generic options with optional commas
    • EXTERNAL can now be used as an ID despite being documented as reserved
    • Support for DROP RULE syntax in TSQL
    • Allow DBT Jinja macros within JSON literals
    • Fixed bugs around DBT elseif and comment nodes
  • Error Handling: Upgraded SimpleError with support status and simplified user-facing parse error messages

  • Integration Alignment: Updated error handling to align with BladeBridge, now returning UNRESOLVED_ROUTINE errors consistently (#1998)

BladeBridge

  • XML Source Processing:

    • Automatic detection of XML sources with proper encoding preservation
    • Maintains UTF-8 encoding while respecting XML-specific encoding declarations
    • Prevents XML parser failures from encoding mismatches
  • SQL Scripting Enhancements:

    • Fixed nested comment handling in SQL scripts
    • Improved custom configuration handling for first-match processing
    • Removed unnecessary begin/end enclosures in pre/post SQL blocks
  • Teradata Updates: Enhanced convert_update_to_merge functionality

  • Oracle Updates:

    • Replaced list partitioning with CLUSTER BY statements
    • Removed unsupported CREATE INDEX and ALTER INDEX statements
    • Fixed CREATE PROCEDURE signature generation with proper exception handling
  • DataStage Updates:

    • Added support for TRUNCATE TABLE specifications (#1903)
    • Fixed column name handling when dataframe columns match job parameters
    • Enabled single-pass processing of shared containers
    • Resolved dataset component path issues for proper PySpark code generation

Reconcile

  • Schema Normalization: Added feature flag for identifier normalization with optional normalize parameter in get_schema method for flexible handling of different data source configurations (#1953)

Enhanced Connection Support

  • Snowflake Security: Added support for encrypted PEM private keys with pem_private_key_password field for secure authentication (#1869)
  • JDBC URL Handling: Improved JDBC URL arguments handling with enhanced error handling and logging
  • Connection Properties: Enhanced SecretsMixin class with new _get_secret_or_none method for better secret value retrieval
  • Error Handling: Introduced new exceptions like InvalidSnowflakePemPrivateKey for better error management

Documentation

Comprehensive Documentation Updates

  • MS SQL and Synapse: Enhanced documentation for reconcile connections including default secret naming conventions and required connection properties (#1954)
  • Connection Configuration: Added clear YAML format examples for MS SQL connection properties covering user, password, host, port, database, encryption, and trust server certificate
  • BladeBridge Updates: Minor naming correction from "Microsoft MS SQL Server" to "Microsoft SQL Server" while maintaining support for Oracle, Teradata, Netezza, Informatica, and DataStage
  • SQL Splitter: Updated documentation to remove RCT references, relocated to main menu with revised terminology using "Lakebridge" consistently (#1952)
  • Transpiler Discovery: Updated documentation for pluggable transpiler discovery and execution, introducing Morpheus and BladeBridge as Databricks-provided transpilers
  • Installation Process: Updated installation processes from Maven Central and PyPI with new directory structure for manual installations

General

Installation and Maintenance Improvements

  • Automated Upgrades: Streamlined installation process with automatic transpiler upgrades during CLI upgrade, eliminating need for separate upgrade commands
  • Plugin Management: Improved installation process for plugins like BladeBridge and Morpheus
  • Testing Enhancement: Added comprehensive test functions to validate SQL file transpilation with various scenarios including table creation and error handling

Contributors: @m-abulazm, @asnare, @sundarshankar89, @goodwillpunning, @gueniai