Releases: databrickslabs/lakebridge
v0.11.3
Analyzer
- Optimized SAS Analyzer performance by consolidating regex operations, delivering roughly a 7x speed improvement for large-scale SAS analysis workloads.
- Added support for the new SSIS components Microsoft.Pivot, Microsoft.UnPivot, and ExtensibleFileTask, broadening coverage for SSIS package migration analysis.
Converters – Morpheus
Core
- Significantly improved ANTLR parsing performance by merging grammars, refactoring ambiguous rules, and updating the Scala integration and build pipeline for the new grammar workflow.
- Allowed the STREAMS token to be used as an identifier so patterns like SELECT * FROM streams.foo.bar now parse correctly in Snowflake-oriented SQL.
- Updated error reporting to use the following severity levels:
  - Info: no error, the input was fully translated
  - Hint: the input was fully translated but some irrelevant bits have been elided
  - Warning: the input was translated but with unsupported bits
  - Error: the input couldn't be translated
MSSQL / T-SQL / SQL Server
- Added full support for SQL Server T-SQL CREATE INDEX and table-level index directives, parsing them into a new index IR and translating to CLUSTER BY AUTO in Databricks SQL so index statements are no longer rejected.
- Extended grammar and parsing to handle T-SQL computed columns, QUOTENAME calls, GROUP options in query hints, DROP INDEX statements, and additional keywords like PARAMETERS, STREAMS, PROCEDURES, and VIEWS, improving coverage of real-world T-SQL workloads.
- Improved DML parsing so INSERT targets use proper dot identifiers instead of expression-like forms, preventing misinterpretation as function calls and preserving case sensitivity where required.
- Re-enabled and migrated T-SQL functional tests to a YAML-based format, expanding automated coverage and keeping still-failing cases isolated for follow-up.
Converters – BladeBridge
MSSQL / SSIS / T-SQL
- Resolved issues with column names containing single quotes and standardized DATEADD and DATEDIFF function patterns to improve compatibility across target SQL dialects.
DataStage
- Implemented mapping for the JulianDayFromDate function with corresponding tests, extending DataStage function coverage in the converter.
- Enhanced DataStage Spark and workflow handling by adding Databricks cluster sections, improving widget default handling, and mapping TransformStringToDate and spark.sqltemplate attributes for smoother Spark migrations.
Reconcile
- Improved reconciliation hash query generation to guarantee consistent column ordering across SQL dialects, preventing false hash mismatches when column names are substrings of each other.
- Reverted the Oracle reconcile implementation to use MD5 via DBMS_CRYPTO.HASH with RAWTOHEX, restoring compatibility with Oracle 11 while keeping the updated QueryBuilder engine handling.
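The column-ordering fix above can be illustrated with a minimal Python sketch. This is not Lakebridge's QueryBuilder code: the function name `build_hash_input`, the null token, and the Databricks-style `CAST(... AS STRING)` are assumptions chosen for illustration. It only shows the general idea that both sides of a reconciliation should derive the hash input from the same deterministic column order.

```python
from typing import Iterable


def build_hash_input(columns: Iterable[str], null_token: str = "_null_") -> str:
    """Return a SQL expression that concatenates columns in a deterministic order.

    Hypothetical helper: sorting by (lowercased name, name) means the result
    does not depend on the order in which the caller supplies the columns.
    """
    ordered = sorted(set(columns), key=lambda c: (c.lower(), c))
    # Each column is cast and null-guarded individually, then concatenated.
    parts = [f"COALESCE(CAST({c} AS STRING), '{null_token}')" for c in ordered]
    return "CONCAT(" + ", ".join(parts) + ")"


# The same columns supplied in two different orders (one name is a substring of another).
source_expr = build_hash_input(["id", "order_id", "name"])
target_expr = build_hash_input(["order_id", "name", "id"])
print(source_expr == target_expr)
```

Because the order depends only on the column names, the source and target queries hash the same concatenation no matter how each dialect reports its columns.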
Documentation
- Added practical details about how to extend BladeBridge configurations
Dependency updates:
- Bump actions/checkout from 5 to 6 (#2158).
Contributors: @asnare, @sundarshankar89, @dependabot[bot], @m-abulazm, @BesikiML
v0.11.2
Analyzer
- Normalized complexity categories in the analyzer from “COMPLEX/VERY_COMPLEX” to “HIGH/VERY_HIGH” for clearer reports.
Converters
Morpheus
Snowflake
- Implemented full support for DECLARE, LET, and assignment statements to better handle procedural Snowflake scripts.
- Added support for DROP PROCEDURE statements, improving Snowflake DDL coverage.
TSQL/Synapse
- Cleaned up grammar by removing duplicate and unsupported rules for TSQL special functions, reducing ambiguity and improving parser stability.
- Implemented full support for DECLARE, LET, and assignment statements in TSQL, enabling richer stored procedure conversion.
- Added support for TSQL DROP PROCEDURE statements to improve parity with source DDL.
- Updated handling of options such as ANSI_NULLS and QUOTED_IDENTIFIER to emit informative comments instead of errors when they do not apply to Databricks SQL.
- Enhanced handling of SET NOCOUNT by emitting comments explaining its behavior in Databricks SQL and warning when NOCOUNT OFF is used.
- Allowed PRECISION to be used as an identifier (for example, c.precision), fixing parsing issues with such column names.
- Improved handling of EXEC statements by detecting well-known stored procedures like sp_executesql and issuing more specific diagnostics.
- Added translation of OBJECT_ID() checks into EXISTS queries against catalog metadata to preserve control flow in procedural TSQL.
- Added warnings for unsupported PRINT statements by generating explanatory comments rather than hard errors.
- Added parsing support for the Synapse RENAME OBJECT syntax, currently surfaced as an unsupported but recognized construct.
Generic Morpheus engine
- Enabled attaching comments and error markers to empty code blocks so that diagnostics are preserved in rendered SQL.
- Prevented semicolons from being printed after empty statements to keep output formatting consistent.
- Bundled multiple column-level primary keys into composite table constraints to produce more correct DDL.
- Allowed the identifier PRECISION in general parsing contexts, improving compatibility with more schemas.
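The bundling of column-level primary keys into one composite table constraint, mentioned in the list above, can be sketched as follows. This is a hedged illustration and not Morpheus code (Morpheus is implemented in Scala): the `Column` dataclass and `render_create_table` function are hypothetical names, and the emitted DDL is simplified.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Column:
    """Hypothetical column description used only for this sketch."""
    name: str
    data_type: str
    primary_key: bool = False


def render_create_table(table: str, columns: List[Column]) -> str:
    """Render CREATE TABLE with at most one table-level PRIMARY KEY clause."""
    col_defs = [f"{c.name} {c.data_type}" for c in columns]
    # Collect every column flagged as a key and emit one composite constraint
    # instead of repeating PRIMARY KEY on each column definition.
    pk_cols = [c.name for c in columns if c.primary_key]
    if pk_cols:
        col_defs.append(f"PRIMARY KEY ({', '.join(pk_cols)})")
    return f"CREATE TABLE {table} (\n  " + ",\n  ".join(col_defs) + "\n)"


print(render_create_table(
    "orders",
    [Column("order_id", "INT", True), Column("line_no", "INT", True), Column("sku", "STRING")],
))
```

Emitting a single table-level clause avoids DDL that declares more than one primary key on the same table.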
BladeBridge
MSSQL / TSQL
- Improved handling of MERGE statements, including insertion of semicolons before MERGE in statement breaking and correct ordering of MATCHED and NOT MATCHED clauses.
- Fixed issues when converting updates on temporary tables into MERGE statements and added tests to guard the behavior.
- Improved statement categorization by stripping comments before categorization and simplifying legacy comment-key handling.
- Added a new handler for nested static strings and inline comments, improving function substitution and parser robustness.
Generic BladeBridge engine
- Enhanced logging configuration to produce clearer diagnostics while keeping noise manageable.
Reconcile
- Added support for specifying a catalog for Databricks sources in Reconcile and prompting for the source catalog when necessary.
- Removed redundant Reconcile configuration parameters to simplify setup.
General
- Improved handling of output from LSP servers by safely chunking very long stderr lines and logging critical processing errors, preventing hangs and unbounded memory use.
- Adjusted JDBC handling to accept usernames and passwords via Spark options instead of embedding credentials in the JDBC URL, improving support for special characters in passwords.
- Consolidated the automated test suite to keep only unit and integration scopes, simplifying test configuration.
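The stderr chunking mentioned in the list above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the Lakebridge implementation: the function name `chunk_line` and the 8192-character default are hypothetical, and the real code reads from a subprocess stream instead of an in-memory string.

```python
from typing import Iterator


def chunk_line(line: str, max_len: int = 8192) -> Iterator[str]:
    """Yield pieces of `line` that are at most `max_len` characters long."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not line:
        # Preserve empty lines so callers still see one record per input line.
        yield line
        return
    for start in range(0, len(line), max_len):
        yield line[start:start + max_len]


# A very long diagnostic line is split into bounded pieces before logging.
for piece in chunk_line("x" * 20, max_len=8):
    print(len(piece))
```

Bounding each piece keeps a single oversized log line from forcing the reader to buffer the whole line in memory.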
Dependency Updates
Full Changelog: v0.11.1...v0.11.2
Contributors: @m-abulazm, @asnare, @sundarshankar89
v0.11.1
Analyzer
No updates in this release.
Converters
General
- Improved end-to-end migration behavior through tighter integration with the centralized Morpheus function mapping layer and expanded cross-dialect coverage
Morpheus
Snowflake
- Centralized SQL function mappings and expanded cross-dialect coverage, improving Snowflake-to-Databricks SQL conversions and reducing noisy, non-actionable warnings.
- Added full translation support for Snowflake exception blocks, enabling richer error-handling logic to be preserved when converting to Databricks SQL.
TSQL / SQL Server
- Reworked SQL function handling so most mappings are centralized, making TSQL-to-Databricks SQL conversions more accurate and easier to extend for future Lakebridge-based migrations.
- Implemented full support for TSQL TRY/CATCH constructs, including THROW/RAISERROR-style logic and helper-based error handling, improving the fidelity of translated control-flow and error semantics.
BladeBridge
TSQL / SQL Server
- Fixed handling of T-SQL column alias syntax in SELECT statements so aliases are no longer mistaken for variable assignments, and removed a deprecated alias-normalization method to improve translation accuracy.
- Resolved failures caused by nested comments, improved post-conversion handling for shell and Python wrapper scripts, and ensured labeled UPDATE/DELETE statements that translate to MERGE remain correctly embedded in SQL.
- Corrected processing of SELECT statements without a FROM clause when assigning to variables, so expressions like variable increments and severity mappings are handled reliably during migration.
- Improved “delete by source” MERGE translations so separators and DELETE placement are preserved, and fixed static string handling so T-SQL patterns that use square brackets are not misinterpreted as identifier quoting or ranges.
Reconcile
No updates in this release
Documentation
- Clarified that Python 3.14 is not yet supported and updated macOS instructions to recommend Python 3.13 as the latest supported version
- Expanded installation prerequisites with detailed Databricks workspace requirements, authentication options, network and repository access expectations, and a comprehensive pre-installation checklist aimed at enterprise and security-restricted environments
General
- Increased the maximum stderr line size accepted from LSP servers during transpilation to prevent crashes or hangs when converters emit very large log lines
- Reduced noise from LSP integrations by lowering stderr mirroring from INFO to DEBUG level, ensuring detailed logs remain available for troubleshooting without cluttering normal operation logs
Contributors: @asnare, @andresgarciaf
v0.11.0
🎉 New Features
This release introduces two exciting new capabilities to Lakebridge:
Synapse Profiler
A powerful new Synapse Profiler feature is now available to help you analyze and profile your Synapse data. Refer to the documentation for usage details and examples.
Switch LLM Converter
Introducing the new Switch LLM converter, expanding Lakebridge's conversion capabilities. Refer to the documentation for usage details and examples.
Other updates
Converters
General
Conversion Output Fix
Fixed a bug where files nested 2 or more directories deep within the input directory could fail to be written out after conversion when the directory structure wasn't already in place.
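The kind of failure described above typically occurs when a file is written before its parent directories exist. A minimal sketch of the usual remedy follows; the function name `write_output` and its parameters are hypothetical and do not reflect Lakebridge's actual code.

```python
from pathlib import Path


def write_output(output_root: Path, relative_path: Path, content: str) -> Path:
    """Write `content` under `output_root`, creating any missing parent folders."""
    target = output_root / relative_path
    # parents=True creates every missing intermediate directory;
    # exist_ok=True makes the call safe when the structure is already in place.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target
```

For example, `write_output(Path("out"), Path("a/b/c/query.sql"), "SELECT 1")` succeeds even when `out/a/b/c` does not exist yet.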
Morpheus
Code Formatting Improvements
Refactored code formatting logic by introducing a tree-like structure in CodeBlock and a new CodeBlockRenderer to handle whitespace, comments, and error positioning, making the formatting system more maintainable and accurate.
TSQL
Added support for translating TSQL join hints (like REPLICATE and MERGE) to their Databricks SQL equivalents by transforming them into special /*+ ... */ comments after the SELECT keyword, while unsupported hints are flagged as annotated errors.
BladeBridge
SQL Server
- Fixed SELECT INTO real table syntax, corrected LIKE pattern handling, and mapped unsupported FUNC_ROW_NUMBER function while removing ANON_NOLOCK.
- Resolved an issue where CASE WHEN expressions as the last statement in a file generated incorrect semicolon placement in SQL scripts.
- Added fragment breaker before GO keyword and removed unsupported COMMIT TRANSACTION and CREATE INDEX constraints.
- Fixed T-SQL UPDATE statements that were not correctly converted to MERGE operations in specific cases.
- Corrected fragment handling around SELECT and UNION statements, and fixed issues with IF condition blocks and error handling blocks being mixed up.
- Removed SET IDENTITY_INSERT and BEGIN/COMMIT TRANSACTION statements, and changed INT GENERATED ALWAYS AS IDENTITY to BIGINT GENERATED ALWAYS AS IDENTITY.
- Added validation check for converted MERGE statements, implemented global variable reset in init_hook subroutine, and performed code refactoring.
- Fixed T-SQL DELETE statements that were not correctly converted to MERGE operations and added corresponding test cases.
Reconcile
Oracle
Improved Oracle support with the following enhancements:
- Fixed Oracle JDBC URL by moving credentials out of URL into options and correcting thin syntax
- Updated hashing/expression pipeline to replace RAWTOHEX(...), 2 with UTL_I18N.STRING_TO_RAW(...,'AL32UTF8'), 4 (SHA-256)
- Fixed schema comparison for Oracle
- Tweaked datatype parsing in default transformations for Oracle compatibility
- Added Oracle jars in setup script
- Extended integration scaffolding and added end-to-end tests
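The first item in the Oracle list above, moving credentials out of the JDBC URL and into options, can be sketched as below. The helper name `oracle_jdbc_options` and its parameters are hypothetical; the option keys (`url`, `user`, `password`, `driver`) are standard Spark JDBC data source options, and the URL follows Oracle's thin-driver service-name form.

```python
from typing import Dict


def oracle_jdbc_options(host: str, port: int, service: str, user: str, password: str) -> Dict[str, str]:
    """Build Spark JDBC options with credentials kept out of the URL."""
    return {
        # Thin-driver URL with a service name; no user or password embedded.
        "url": f"jdbc:oracle:thin:@//{host}:{port}/{service}",
        "user": user,
        "password": password,
        "driver": "oracle.jdbc.OracleDriver",
    }


opts = oracle_jdbc_options("db.example.com", 1521, "ORCLPDB1", "scott", "p@ss/word:1")
print(opts["url"])

# With an active SparkSession (not created here), the options could be used as:
# df = spark.read.format("jdbc").options(**opts).option("dbtable", "HR.EMPLOYEES").load()
```

Passing the password as a separate option means characters such as `@`, `/`, or `:` need no URL escaping.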
Snowflake
- Fixed schema comparison for Snowflake
- Adjusted log levels by demoting noisy warnings to debug/info
- Added Snowflake jars in setup script
- Extended integration scaffolding
Documentation
Added documentation for deploying reconciliation dashboards and updated documentation notebooks.
Dependency updates:
- Bump actions/setup-node from 5 to 6 by @dependabot[bot] in #2094
New Contributors
- @hiroyukinakazato-db made their first contribution in #2066
Full Changelog: v0.10.13...v0.11.0
Contributors: @goodwillpunning, @hiroyukinakazato-db, @sundarshankar89, @asnare, @m-abulazm, @dependabot[bot], @bishwajit-db
v0.10.13
Analyzer
- Added defensive code to prevent the DataStage analyzer from crashing on files that contain empty array references
Converters
Morpheus
General
- Enhanced name representation consistency - Major refactoring that replaces String representations with Expression types for table names, column names, and constraints across IR nodes, improving SQL/PySpark code generation accuracy
- Fixed DBT parsing issues - Resolved template parsing problems by changing template markers to !#Jinja0001#! format and improving whitespace handling for proper tokenization
TSQL (Synapse/SQL Server)
- Support for dual OUTPUT clauses in TSQL INSERT/DELETE/UPDATE statements - Enhanced T-SQL parser to handle complex statements with multiple OUTPUT clauses (OUTPUT ... INTO ... OUTPUT ...) with comprehensive test coverage
- Fixed TSQL DECLARE statement handling - Refactored DECLARE statement processing by moving logic to dedicated visitor methods and properly marking unsupported statements for future implementation
- Improved BLOCK structure parsing for BEGIN and BEGIN TRY statements - Updated parser grammar to support flexible scripting blocks and transaction handling, allowing zero or more statements in control flow constructs
- Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation
Snowflake
- Fixed Snowflake connection tests - Internal improvements for database connection test reliability
- Added comprehensive USE statement support - Introduced new IR representations (UseCatalog, UseSchema) with dialect-specific AST building logic and proper SQL generation
BladeBridge
General
- Automatically creates and cleans up temporary folders for embedded SQL conversion in wrapper scripts - Improves workflow management by implicitly creating temp folders and cleaning them up once conversion is complete
MSSQL (SQL Server)
- Enhanced table variable and temporary table conversion - Added support for table variable conversion to temporary tables and improved string handling with logic to convert double single quotes to double quotes
- Fixed semicolon placement in nested select statements - Resolved issue where semicolons appeared before comments in nested select statements
- Improved MS SQL procedure handling - Added LIMIT 1 for Set in select statements, enhanced function mappings, fixed string concatenation, and removed unsupported constraints
Reconcile
No updates in this release.
Documentation
No updates in this release.
Contributors: @gueniai, @sundarshankar89
v0.10.12
Analyzer
- New installation verification command - Introduced a new command to verify successful installation of the Lakebridge Analyzer, displaying usage and available flags for report file paths, source directories, and source technologies
Converters
General
- Enhanced transpile command - Updated transpile command to support --overrides-path and --target-technology arguments for greater flexibility and customization
- Improved error handling - Enhanced handling of parsing errors during code transpilation to output transpiled code instead of original input, providing clearer outcomes when issues arise
- Refactored naming conventions - Renamed transpiler product_name to transpiler_id throughout the codebase for improved consistency and clarity
Morpheus
TSQL
- Enhanced TSQL support - Added support for DENY statements, EXEC statement syntax improvements, COLLATION in CREATE TABLE column definitions, and WINDOW clause functionality
- Improved ALTER DATABASE support - Enhanced support for all options on ALTER DATABASE SET statements and multiple LOG file specifications in ALTER DATABASE ADD LOG
- Better JOIN functionality - Added support for all join hints (MERGE, HASH, LOOP, REDUCE, REPLICATE, REDISTRIBUTE) in JOIN constructs
- Enhanced COPY INTO support - Fixed syntax for COPY INTO commands and added extended column definitions support in TSQL mode
- Improved DELETE operations - Added transformation rule to translate IN to EXISTS when needed in DELETE statement WHERE clauses
Snowflake
- COPY INTO improvements - Refactored and standardized grammar rules for COPY INTO commands, consolidating stage location handling
- UPDATE FROM enhancements - Added tests for UPDATE FROM statements to verify correct transpilation to MERGE INTO statements
General
- Enhanced permission handling - Added support for column-specific privileges and improved handling of column-specific permissions
- Improved parser functionality - Allowed SCHEMAS keyword to be used as identifier and clarified warning messages for unrecognized functions
BladeBridge
MSSQL
- Fixed update_to_merge functionality - Improved WITH clause handling and script variable ordering for MSSQL dialects
- Table variable support - Implemented table variable conversion support for MSSQL dialects
- DDL operation fixes - Fixed and removed unsupported DDL operations including alter index, switch partitions, and drop constraints
Informatica
- Power Center improvements - Fixed hanging issue on Linux for Informatica PC conversion by improving block_subst patterns and output flushing
- Dataframe implementation fixes - Fixed dataframe implementation for pulling data from flat file unconnected lookups in Informatica Power Center
DataStage
- TRUNCATE TABLE support - Added spark.sql_template to resolve TRUNCATE TABLE statement generation when TRUNCATE flag is enabled in DataStage
Reconcile
- Enhanced Databricks schema queries - Fixed Databricks schema query to improve accuracy and reliability of schema reconciliation, with better column name consistency and filtering
Documentation
- Updated CLI documentation - Refreshed documentation to reflect latest changes in Command Line Interface menus, including new commands and flags such as transpile, reconcile, and install-transpile subcommands
- Enhanced command documentation - Added detailed documentation for transpile command usage and flags, including optional flags for catalog name, error file path, and source dialect
- Updated installation guides - Modified installation documentation to include verification examples and updated help flags for new command options
Dependency updates:
- Updated cryptography requirement from <45.1.0,>=44.0.2 to >=44.0.2,<46.1.0 (#2028).
- Bump databrickslabs/sandbox/acceptance@acceptance/v0.4.2 from 0.4.2 to 0.4.4 (#1833).
Contributors: @asnare, @sundarshankar89, @m-abulazm, @dependabot[bot], @gueniai
v0.10.11
Analyzer
No updates in this release
Converters
General
- Fixed special character handling in filenames by introducing from_uri() helper function for safer URI handling
- Ensured SQL converter returns UTF-8 encoded files for proper character encoding
- Fixed filename to correctly output databricks_conversion_supplements.py supplemental file
- Fixed broken splitter URL by updating directory naming conventions from "Downloads" to "downloads"
- Improved handling of encoding-related errors by catching UnicodeDecodeError and LookupError exceptions during file processing, creating TranspileError with specific encoding-error codes instead of stopping
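The last item above, turning encoding failures into reported errors instead of aborting, can be sketched as follows. This is only an illustration under stated assumptions: `EncodingIssue` is a hypothetical stand-in for Lakebridge's TranspileError, and the codes `decode-error` and `unknown-encoding` are invented for the example.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass
class EncodingIssue:
    """Hypothetical stand-in for a transpile error record."""
    code: str
    path: Path
    message: str


def read_source(path: Path, encoding: str = "utf-8") -> Tuple[Optional[str], Optional[EncodingIssue]]:
    """Read a source file, returning an issue instead of raising on encoding problems."""
    try:
        return path.read_text(encoding=encoding), None
    except UnicodeDecodeError as e:
        # The bytes in the file are not valid for the requested encoding.
        return None, EncodingIssue("decode-error", path, str(e))
    except LookupError as e:
        # The requested encoding name is not known to Python.
        return None, EncodingIssue("unknown-encoding", path, str(e))
```

A caller can collect the returned issues and continue with the remaining files, so a single badly encoded file does not stop the whole run.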
Morpheus
Snowflake
- Added support for TRUNCATE TABLE statements with proper IR and translation support
- Correctly support $IDENTITY and $ROWGUID system variables
- Refactored and extended grammar and AST support for SQL procedure creation with improved handling of raw string literals
- Enhanced schema reconciliation functionality to support Snowflake arrays, addressing the corner case where Databricks arrays are typed and Snowflake arrays are untyped
TSQL
- Added support for TRUNCATE TABLE statements with proper IR and translation support
- Support full CREATE and ALTER INDEX statements in TSQL parsing, rejecting INDEX CREATE/ALTER statements sensibly instead of raising syntax errors
- Fixed implementation of IF scripting blocks with improvements to SQL parser, grammar enhancements, and enhanced scripting grammar for more robust handling of block statements and conditional branches
- Allow CLUSTERED to be an identifier to improve CREATE TABLE syntax as a CONSTRAINT qualifier
- Support percentage expressions in TSQL options (e.g., OPT = 42%) instead of raising parsing errors
- Added support for REVOKE statements, similar to existing GRANT statement implementation
- Ensure that ROWS and OBJECTS can be used as identifiers even with Jinja templates
- Correctly support $IDENTITY and $ROWGUID system variables
General (Multiple Dialects)
- Support comments on column declarations when generating SQL and renamed legacy builders for consistency
- Refactored IR around CREATE FUNCTION and CREATE PROCEDURE, unifying all ways to create stored procedures under a single CreateStoredProcedure IR node and all ways to create user defined functions under a single CreateUDF IR node
- Implemented grammar and IR placeholders for named windows, introducing initial support for the SQL standard WINDOW clause in parser grammar
BladeBridge
Oracle
- Removed unsupported Oracle DDL constraints (add/create constraint unique) and extraneous TBLPROPERTIES from converted output
MSSQL
- Added handle_xml_nodes function for MS SQL processing
- Fixed multiple MSSQL issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, and parameter data types
Synapse
- Fixed multiple Synapse issues including CTEs in views/stored procedures, ADD CONSTRAINT problems, DEFAULT value handling, parameter data types, error handling in stored procedures, and Synapse-specific features (e.g., table distribution)
Teradata
- Added Teradata function mappings including ZEROIFNULL, TEMPORAL_TIMESTAMP, TRYCAST, ANY, FIRST, NULLIFZERO, DECODE with different parameter counts, and HASHAMP
- Removed collect statistics and lock table statements
DataStage
- Implemented DataStage Checksum component translation to its SparkSQL equivalent and fixed the PySpark checksum translation to use MD5() instead of SHA2()
Reconcile
- Added handling for special characters in reconcile aggregate, enhancing the library to handle special characters in column names by properly delimiting identifiers in SQL queries
- Fixed deploy reconcile jobs by updating wheel file handling, simplifying deployment process to use single wheel path, and fixing broken documentation links
Documentation
- Fixed download link in docs (reconcile automation) by replacing broken markdown link with JSX link utilizing useBaseUrl hook
General
- Implemented new describe-transpile CLI subcommand that describes installed transpilers, including their versions, configuration paths, and supported source dialects
- Switched from urllib to requests library for making HTTP calls to PyPI and Maven Central, with default 60-second timeout and improved error handling
- Work around DATABRICKS_HOST normalization issue during install and uninstall by introducing new Lakebridge subclass with appropriate workspace client
Dependency updates
- Bump Databricks SDK Version to 0.67.0 by @goodwillpunning in #2062
- Bump sigstore/gh-action-sigstore-python from 3.0.0 to 3.0.1 by @dependabot[bot] in #1753
Special thanks to @BrianDeacon for his contribution to fix #1858
Contributors: @asnare, @m-abulazm, @ihor-ki, @goodwillpunning, @sundarshankar89, @dependabot[bot]
v0.10.10
Analyzer
- Large XML file chunking optimization: Now the analyzer is able to handle large XML files (up to 1TB in size)
Converters
General
- Non-interactive transpiler installation: Introduced support for non-interactive installation mode with new interactive option that can detect environment context, enabling automated installations without user input while preserving existing configurations. Resolves #2013
Morpheus
- Enhanced GRANT statement support: Implemented comprehensive GRANT statement support by creating dedicated permission.g4 grammar file with IR definitions and translation rules for permission-related statements
- Improved error handling: Rewrote print function to properly handle newlines and added extensive unit tests for error annotation, including block and FIXME comments. Resolves #2030
- Enhanced LSP server behavior: Improved LSP server to append original text to error messages when transpilation fails, eliminating need for client-side response manipulation
- Standardized dialect options: Aligned dialect options to present synapse and mssql to users for consistency with bladebridge
- Fixed Lateral Column Alias handling: Enhanced dealiasing for Lateral Column Aliases (LCAs) in WHERE clauses under CASE...WHEN expressions. Resolves #1767
- Enhanced GROUP BY/aggregation function dealiasing: Implemented dealiasing for Lateral Column Aliases in GROUP BY clauses and aggregation functions where LCA references are unsupported. Resolves (#956) and (#954)
- Optimized Snowflake transformations: Reordered transformation rules to ensure TransformWithinGroup processes all cases before the call mapper. Resolves #1231
BladeBridge
- Enhanced merge statement handlers: Improved merge statement processing to fix backtick handling, update operations without WHERE clauses, procedure conversions, IF-THEN-SET blocks, and various delimiter and mapping issues
- Fixed view creation with WITH clauses: Corrected CREATE VIEW functionality to properly handle WITH clause statements
- Oracle script improvements: Resolved variable declaration issues in Oracle scripts containing exception handling blocks
- SQL Server function mapping: Added function mappings for Microsoft SQL Server functions including GETUTCDATE, IS_MEMBER, SERVERPROPERTY variants, and QUOTENAME with one or two arguments
- Fixed variable declarations: Resolved variable declaration issues in Oracle scripts that contain exception handling blocks
- MSSQL Server Enhanced function mappings: Added comprehensive function mappings including GETUTCDATE, IS_MEMBER, SERVERPROPERTY variants, and QUOTENAME with one or two arguments
Reconcile
- Improved logging for aggregate reconciliation: Enhanced logging functionality to provide more accurate messages by replacing warning logs with informational messages when aggregate details rules are empty, indicating successful reconciliation with no details to store. Resolves #2040
- Refactored aggregate query building: Simplified code using AggregateQueryBuilder class to generate queries for both source and target in a more concise and efficient manner
Documentation
No updates in this release
Dependency updates:
- Bump actions/setup-python from 5 to 6 (#1988).
Contributors: @m-abulazm, @asnare, @dependabot[bot], @sundarshankar89
v0.10.9
Analyzer
- Fixed bug where Analyzer would crash with large DDL files
- Adjusted calculation of complexity for TSQL queries to make it more accurate
Transpilers
Morpheus
- T-SQL Updates
  - Advanced Statement Support: Added parsing for CREATE CERTIFICATE, CREATE LOGIN, PRINT commands, and EXECUTE AS LOGIN statements
  - SET Command Enhancements: Support for complex assignment operators (+=, -=, *=, /=, %=, &=, ^=, |=) commonly used in T-SQL scripts
  - CREATE EXTERNAL TABLE: Improved parsing with flexible syntax for external table definitions and location specifications
  - GRANT/REVOKE Statements: Comprehensive support for T-SQL security statements with clear Unity Catalog migration guidance
  - DROP Commands: Enhanced handling of DROP SENSITIVITY and other specialized DROP variants
  - Improved Error Reporting: SQL output now includes FIXME comments with detailed error messages for unsupported constructs
- Snowflake Updates
  - Analytics Functions: Full parsing support for MATCH_RECOGNIZE clause with pattern analysis capabilities for complex analytical queries
  - Time Travel Queries: Enhanced handling of CHANGES, AT, and BEFORE clauses for historical data access patterns
  - REGEXP_INSTR Function: Complete implementation supporting all 7 parameters (vs Databricks' 2), providing accurate behavioral translation
  - Table-Valued Functions: Support for parsing inline table-valued functions commonly used in Snowflake
  - GRANT/REVOKE Statements: Full support for Snowflake's complex privilege management syntax including roles and shares
  - DROP Commands: Enhanced parsing for DROP SENSITIVITY and related data governance statements
  - Improved Error Reporting: SQL output now includes FIXME comments with detailed error messages for unsupported constructs
Dependency updates:
- Bump actions/checkout from 4 to 5 (#1928).
- Bump actions/upload-pages-artifact from 3 to 4 (#1964).
- Bump mermaid from 11.6.0 to 11.10.1 in /docs/lakebridge (#1956).
Contributors: @dependabot[bot], @asnare, @m-abulazm
v0.10.8
Transpilers
General
- SQL Validation Enhancement: Improved SQL validator to check only SQL outputs with enhanced error handling and support for various transpile results (#1949)
- Error Handling Improvements: Added static error lookups for specific cases like unresolved routines and columns, with more readable exception messages
- MIME Support: New functionality to support both MIME and non-MIME transpile results, including validation and output file management
- LSP Server Integration: Log level now passed to Language Server Protocol (LSP) server via environment variable for greater flexibility (#1967)
- Transpiler Auto-Upgrade: Enhanced installer to automatically upgrade existing Lakebridge transpilers during CLI upgrade process (#1978)
- Source Dialect Handling: Fixed missing transpile source dialect handling to ensure correct assignment in configuration objects (#1985)
Morpheus
- Enhanced Snowflake Conversion support:
  - Support for parsing ILIKE, EXCLUDE, REPLACE, RENAME with * LHS
  - Full support for EXCLUDE and RENAME clauses and all combinations
  - Fixed REPLACE function with optional third argument
  - Enhanced OBJECT_DELETE to accept 2 or more arguments
  - Accurate translation of Snowflake's REGEXP_REPLACE
- Parser Improvements:
  - Allow lists of generic options with optional commas
  - EXTERNAL can now be used as an ID despite being documented as reserved
  - Support for DROP RULE syntax in TSQL
  - Allow DBT Jinja macros within JSON literals
  - Fixed bugs around DBT elseif and comment nodes
- Error Handling: Upgraded SimpleError with support status and simplified user-facing parse error messages
- Integration Alignment: Updated error handling to align with BladeBridge, now returning UNRESOLVED_ROUTINE errors consistently (#1998)
BladeBridge
- XML Source Processing:
  - Automatic detection of XML sources with proper encoding preservation
  - Maintains UTF-8 encoding while respecting XML-specific encoding declarations
  - Prevents XML parser failures from encoding mismatches
- SQL Scripting Enhancements:
  - Fixed nested comment handling in SQL scripts
  - Improved custom configuration handling for first-match processing
  - Removed unnecessary begin/end enclosures in pre/post SQL blocks
- Teradata Updates: Enhanced convert_update_to_merge functionality
- Oracle Updates:
  - Replaced list partitioning with CLUSTER BY statements
  - Removed unsupported CREATE INDEX and ALTER INDEX statements
  - Fixed CREATE PROCEDURE signature generation with proper exception handling
- DataStage Updates:
  - Added support for TRUNCATE TABLE specifications (#1903)
  - Fixed column name handling when dataframe columns match job parameters
  - Enabled single-pass processing of shared containers
  - Resolved dataset component path issues for proper PySpark code generation
Reconcile
- Schema Normalization: Added feature flag for identifier normalization with optional normalize parameter in get_schema method for flexible handling of different data source configurations (#1953)
Enhanced Connection Support
- Snowflake Security: Added support for encrypted PEM private keys with pem_private_key_password field for secure authentication (#1869)
- JDBC URL Handling: Improved JDBC URL arguments handling with enhanced error handling and logging
- Connection Properties: Enhanced SecretsMixin class with new _get_secret_or_none method for better secret value retrieval
- Error Handling: Introduced new exceptions like InvalidSnowflakePemPrivateKey for better error management
Documentation
Comprehensive Documentation Updates
- MS SQL and Synapse: Enhanced documentation for reconcile connections including default secret naming conventions and required connection properties (#1954)
- Connection Configuration: Added clear YAML format examples for MS SQL connection properties covering user, password, host, port, database, encryption, and trust server certificate
- BladeBridge Updates: Minor naming correction from "Microsoft MS SQL Server" to "Microsoft SQL Server" while maintaining support for Oracle, Teradata, Netezza, Informatica, and DataStage
- SQL Splitter: Updated documentation to remove RCT references, relocated to main menu with revised terminology using "Lakebridge" consistently (#1952)
- Transpiler Discovery: Updated documentation for pluggable transpiler discovery and execution, introducing Morpheus and BladeBridge as Databricks-provided transpilers
- Installation Process: Updated installation processes from Maven Central and PyPI with new directory structure for manual installations
General
Installation and Maintenance Improvements
- Automated Upgrades: Streamlined installation process with automatic transpiler upgrades during CLI upgrade, eliminating need for separate upgrade commands
- Plugin Management: Improved installation process for plugins like BladeBridge and Morpheus
- Testing Enhancement: Added comprehensive test functions to validate SQL file transpilation with various scenarios including table creation and error handling
Contributors: @m-abulazm, @asnare, @sundarshankar89, @goodwillpunning, @gueniai