I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
A dbt-spark project using the session connection method is unable to read Iceberg tables from the Glue catalog, failing with a pyspark.sql.utils.AnalysisException whose desc is "SHOW TABLE EXTENDED is not supported for v2 tables".
I did some digging, and I think the issue is related to the exception_handler in connections.py.
In particular, this block:
except Exception as exc:
    logger.debug("Error while running:\n{}".format(sql))
    logger.debug(exc)
    if len(exc.args) == 0:
        raise
I've verified that my job is hitting the len(exc.args) == 0 condition, probably because I'm using the session connection method, though I haven't confirmed that part.
I was able to work around this error in my local environment by raising a DbtRuntimeError with the desc from the original exception, instead of just re-raising the original exception itself.
Is there any reason this method should ever re-raise the original error instead of a DbtRuntimeError?
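For illustration, here is a minimal, self-contained sketch of the workaround I applied. DbtRuntimeError and AnalysisException are stood in by local classes, so this only demonstrates the proposed control flow, not the actual dbt or pyspark code (the real handler lives in dbt-spark's connections.py):

```python
# Sketch of the proposed exception_handler change, outside dbt itself.
# DbtRuntimeError and AnalysisException below are local stand-ins for
# dbt.exceptions.DbtRuntimeError and pyspark.sql.utils.AnalysisException.
from contextlib import contextmanager


class DbtRuntimeError(Exception):
    """Stand-in for dbt.exceptions.DbtRuntimeError."""


class AnalysisException(Exception):
    """Stand-in for pyspark's AnalysisException, which carries its
    message in a `desc` attribute and can have empty `args`."""

    def __init__(self, desc):
        super().__init__()  # args stays empty, as observed with the session method
        self.desc = desc


@contextmanager
def exception_handler(sql):
    try:
        yield
    except Exception as exc:
        if len(exc.args) == 0:
            # Previously: bare `raise`, which re-raised the original
            # exception unchanged. Proposed: wrap it, so downstream logic
            # that matches on DbtRuntimeError messages (e.g. the Iceberg
            # v2-table check) can fire.
            raise DbtRuntimeError(getattr(exc, "desc", str(exc))) from exc
        raise DbtRuntimeError(str(exc)) from exc


# Demonstration: an AnalysisException with empty args is now wrapped.
try:
    with exception_handler("SHOW TABLE EXTENDED ..."):
        raise AnalysisException("SHOW TABLE EXTENDED is not supported for v2 tables")
except DbtRuntimeError as e:
    print(type(e).__name__, e)  # DbtRuntimeError SHOW TABLE EXTENDED is not supported for v2 tables
```

Chaining with `from exc` preserves the original traceback for debugging while still giving downstream callers a DbtRuntimeError message to match on.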
Expected Behavior
The pyspark.sql.utils.AnalysisException should have been wrapped in a DbtRuntimeError, and thus handled by the existing logic that checks for this specific error message to deal with Iceberg table metadata properly.
Steps To Reproduce
Run dbt-spark in a project configured with the session connection method
Run a model that reads an Iceberg table from Glue
Observe that the run fails due to a pyspark.sql.utils.AnalysisException
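For completeness, a hypothetical profiles.yml sketch for step 1. The profile and schema names are placeholders; the relevant setting is method: session:

```yaml
# Hypothetical profile sketch; spark_profile and dbt_iceberg_db are
# placeholder names, not values from this issue.
spark_profile:
  target: dev
  outputs:
    dev:
      type: spark
      method: session         # the connection method this issue is about
      schema: dbt_iceberg_db  # the Glue database holding the Iceberg tables
      host: NA                # not used by the session method, but some versions require the key
```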
Relevant log output
20:04:36.994919 [error] [MainThread]: Encountered an error:
SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#21, tableName#22, isTemporary#23, information#24]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@50b0402d, [dbt_iceberg_db]
20:04:37.002637 [error] [MainThread]: Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/dbt/cli/requires.py", line 87, in wrapper
result, success = func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/cli/requires.py", line 72, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/cli/requires.py", line 143, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/cli/requires.py", line 172, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/cli/requires.py", line 219, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/cli/requires.py", line 259, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/cli/main.py", line 278, in docs_generate
results = task.run()
File "/opt/conda/lib/python3.10/site-packages/dbt/task/generate.py", line 206, in run
compile_results = CompileTask.run(self)
File "/opt/conda/lib/python3.10/site-packages/dbt/task/runnable.py", line 468, in run
result = self.execute_with_hooks(selected_uids)
File "/opt/conda/lib/python3.10/site-packages/dbt/task/runnable.py", line 428, in execute_with_hooks
self.before_run(adapter, selected_uids)
File "/opt/conda/lib/python3.10/site-packages/dbt/task/runnable.py", line 415, in before_run
self.populate_adapter_cache(adapter)
File "/opt/conda/lib/python3.10/site-packages/dbt/task/runnable.py", line 406, in populate_adapter_cache
adapter.set_relations_cache(self.manifest)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/base/impl.py", line 473, in set_relations_cache
self._relations_cache_for_schemas(manifest, required_schemas)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/base/impl.py", line 450, in _relations_cache_for_schemas
for relation in future.result():
File "/opt/conda/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/opt/conda/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/opt/conda/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/utils.py", line 465, in connected
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/spark/impl.py", line 213, in list_relations_without_caching
show_table_extended_rows = self.execute_macro(LIST_RELATIONS_MACRO_NAME, kwargs=kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/base/impl.py", line 1054, in execute_macro
result = macro_function(**kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/clients/jinja.py", line 330, in __call__
return self.call_macro(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/clients/jinja.py", line 257, in call_macro
return macro(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 763, in __call__
return self._invoke(arguments, autoescape)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 777, in _invoke
rv = self._func(*arguments)
File "<template>", line 21, in macro
File "/opt/conda/lib/python3.10/site-packages/jinja2/sandbox.py", line 393, in call
return __context.call(__obj, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 298, in call
return __obj(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/clients/jinja.py", line 330, in __call__
return self.call_macro(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/clients/jinja.py", line 257, in call_macro
return macro(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 763, in __call__
return self._invoke(arguments, autoescape)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 777, in _invoke
rv = self._func(*arguments)
File "<template>", line 33, in macro
File "/opt/conda/lib/python3.10/site-packages/jinja2/sandbox.py", line 393, in call
return __context.call(__obj, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 298, in call
return __obj(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/clients/jinja.py", line 330, in __call__
return self.call_macro(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/clients/jinja.py", line 257, in call_macro
return macro(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 763, in __call__
return self._invoke(arguments, autoescape)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 777, in _invoke
rv = self._func(*arguments)
File "<template>", line 52, in macro
File "/opt/conda/lib/python3.10/site-packages/jinja2/sandbox.py", line 393, in call
return __context.call(__obj, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/jinja2/runtime.py", line 298, in call
return __obj(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/base/impl.py", line 290, in execute
return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch, limit=limit)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/sql/connections.py", line 146, in execute
_, cursor = self.add_query(sql, auto_begin)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/sql/connections.py", line 80, in add_query
cursor.execute(sql, bindings)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/spark/session.py", line 208, in execute
self._cursor.execute(sql)
File "/opt/conda/lib/python3.10/site-packages/dbt/adapters/spark/session.py", line 110, in execute
self._df = spark_session.sql(sql)
File "/opt/conda/lib/python3.10/site-packages/pyspark/sql/session.py", line 1034, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery), self)
File "/opt/conda/lib/python3.10/site-packages/py4j/java_gateway.py", line 1321, in __call__
return_value = get_return_value(
File "/opt/conda/lib/python3.10/site-packages/pyspark/sql/utils.py", line 196, in deco
raise converted from None
pyspark.sql.utils.AnalysisException: SHOW TABLE EXTENDED is not supported for v2 tables.;
ShowTableExtended *, [namespace#21, tableName#22, isTemporary#23, information#24]
+- ResolvedNamespace org.apache.iceberg.spark.SparkCatalog@50b0402d, [dbt_iceberg_db]
github-actions bot changed the title from "[Bug] Unable to read Iceberg tables when using session connection" to "[ADAP-802] [Bug] Unable to read Iceberg tables when using session connection" on Aug 14, 2023.
@joleyjol, so the issue you raised is about wrapping the exception in a DbtRuntimeError, not about the "SHOW TABLE EXTENDED is not supported for v2 tables" error itself, right?
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.