Description
Is there an existing issue for this?
- I have searched the existing issues
Category of Bug / Issue
Application crashed
Current Behavior
Hello Lakebridge Team,
I have been trying to use Lakebridge's Reconciliation in our SQL Server to Databricks platform migration, and Lakebridge is failing to determine the data types of the target (Databricks) columns.
Here is a sample of the output in lakebridge_metadata.reconciliation.details:
You can see in the image that the source columns' data types are determined, but the target columns' data types are not.
Additionally, Lakebridge does not correctly determine mismatched rows, as I found when validating its output of mismatched rows against the source and target tables. I suspected this was caused by the data type issue (if the data types cannot be matched, the values will not match when source and target are compared row by row). However, the mismatched rows error persists even when I run Lakebridge with ReconcileConfig.report_type = "data".
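To illustrate the suspicion above, here is a minimal sketch of how a type difference alone can make identical rows look mismatched in a hash-based row comparison. It is not Lakebridge's actual implementation; `row_hash`, `normalize_key`, and the row values are hypothetical and exist only for this example.

```python
import hashlib
from decimal import Decimal


def row_hash(values):
    """Hash a row by joining the string form of each value."""
    joined = "|".join("" if v is None else str(v) for v in values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def normalize_key(value):
    """Coerce a numeric key to a scale-0 decimal, as DECIMAL(15,0) would hold it."""
    return Decimal(str(value)).quantize(Decimal(1))


# Hypothetical values: the same logical row, with the key typed differently per side.
source_row = (Decimal("1001"), "ACME", 0)
target_row = (1001.0, "ACME", 0)  # key read back as a float

# Comparing the raw rows: "1001" and "1001.0" render differently, so the hashes differ.
raw_match = row_hash(source_row) == row_hash(target_row)

# Comparing after casting the key to a common type on both sides.
normalized_source = (normalize_key(source_row[0]),) + source_row[1:]
normalized_target = (normalize_key(target_row[0]),) + target_row[1:]
normalized_match = row_hash(normalized_source) == row_hash(normalized_target)

print(raw_match, normalized_match)
```

In this sketch the raw comparison reports a mismatch and the normalized one does not, which is the behavior I would expect if the target column types cannot be resolved before the comparison.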
Additionally, in "Steps To Reproduce" I have attached the .ipynb file for the Databricks notebook I am running. Note that the TableRecon.tables attribute has been set; if this attribute is not set, the columns are not determined in the source.
For further information, please contact me through my email: [email protected]
Expected Behavior
I expect the column data types of Databricks Delta tables to be determinable. This would allow the column data types to be matched correctly and the values to be converted correctly from SQL Server to Databricks.
Steps To Reproduce
Source Table Details in SQL Server:
"GL_Accounts"

Target Table Details in Databricks:
"GL_Accounts"
CREATE STREAMING TABLE <catalog>.<schema>.`gl_accounts` (
GLAcct_Key DECIMAL(15,0),
GLAcct_GLAcctMaj_Key DECIMAL(15,0),
GLAcct_BusEnt_Acct STRING,
GLAcct_Site_Acct STRING,
GLAcct_PCtr_Acct STRING,
GLAcct_Disabled SMALLINT,
GLAcct_Disable_Date TIMESTAMP,
GLAcct_Disable_User STRING,
ts BINARY) TBLPROPERTIES (
'__cdc_last_validated_schema_version' = '0',
'__cdc_reactivated_columns_since_schema_version' = '{}',
'__ingestion_connector_inactive_columns' = '[]',
'delta.columnMapping.mode' = 'name',
'delta.minReaderVersion' = '2',
'delta.minWriterVersion' = '5') AS
Note: table catalog and schema have been blurred.
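As a baseline for what the target types should be, the declared types can be read directly from the DDL above and compared against whatever appears in lakebridge_metadata.reconciliation.details. The following is a rough sketch only: `parse_columns` is a hypothetical helper, and it assumes one column per line, as in the statement above.

```python
DDL = """CREATE STREAMING TABLE <catalog>.<schema>.`gl_accounts` (
GLAcct_Key DECIMAL(15,0),
GLAcct_GLAcctMaj_Key DECIMAL(15,0),
GLAcct_BusEnt_Acct STRING,
GLAcct_Site_Acct STRING,
GLAcct_PCtr_Acct STRING,
GLAcct_Disabled SMALLINT,
GLAcct_Disable_Date TIMESTAMP,
GLAcct_Disable_User STRING,
ts BINARY) TBLPROPERTIES ("""


def parse_columns(ddl):
    """Return {column_name: declared_type} from a one-column-per-line DDL."""
    # The column list sits between the first "(" and ") TBLPROPERTIES".
    body = ddl[ddl.index("(") + 1 : ddl.index(") TBLPROPERTIES")]
    columns = {}
    for line in body.splitlines():
        line = line.strip().rstrip(",")
        if not line:
            continue
        # Split on the first run of whitespace: name, then the full type.
        name, dtype = line.split(None, 1)
        columns[name] = dtype
    return columns


expected_types = parse_columns(DDL)
for name, dtype in expected_types.items():
    print(f"{name}: {dtype}")
```

Any column whose type is listed here but missing on the target side of the reconciliation details is one where the type was not determined.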
How Lakebridge is being installed, initialized, and run (Note: the table catalog and schema have been blurred):
Lakebridge Testing - Using JDBC Connection - ATOMIC_PDIENT_CND (1).ipynb
Relevant log output or Exception details
Logs Confirmation
- I ran the command line with --debug
- I have attached the lsp-server.log under USER_HOME/.databricks/labs/remorph-transpilers/<converter_name>/lib/lsp-server.log
Sample Query
Operating System
Windows
Version
latest via Databricks CLI