load_table_from_dataframe support JSON column dtype #1966
Comments
I have also tried uploading the data as dtype STRING in hopes that it would be converted to JSON server-side, but that also results in an error.
I was able to upload my data with the following code, but JSON support should be added for pandas DataFrames.
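What follows is a minimal sketch of a workaround along those lines, not necessarily the author's exact code: it bypasses load_table_from_dataframe and loads the rows as newline-delimited JSON with an explicit schema. The table and column names are hypothetical.

```python
import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # hypothetical destination table

df = pd.DataFrame({
    "id": [1, 2],
    "data": [{"a": 1}, {"b": [2, 3]}],  # nested values destined for a JSON column
})

# Declare the column as JSON explicitly; newline-delimited JSON is a load
# format that does accept the JSON field type.
job_config = bigquery.LoadJobConfig(
    schema=[
        bigquery.SchemaField("id", "INTEGER"),
        bigquery.SchemaField("data", "JSON"),
    ],
)

# load_table_from_json takes an iterable of row dicts and uploads them as
# newline-delimited JSON, sidestepping the Parquet serialization path.
job = client.load_table_from_json(
    df.to_dict(orient="records"), table_id, job_config=job_config
)
job.result()  # wait for the load job to finish
```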
Thanks @jlynchMicron for providing a workaround. I think there are a few problems we'll need to work through, one of which is that the BigQuery backend doesn't support JSON in load jobs from Parquet files: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-parquet#type_conversions. Please file a Feature Request at https://cloud.google.com/support/docs/issue-trackers (specifically, "Create new BigQuery issue"). Another possible workaround is to use CSV as the source format; a sketch of that approach follows.
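A sketch of that CSV-based workaround, assuming the JSON values are serialized to strings up front and an explicit schema is supplied (table and column names are again hypothetical):

```python
import json

import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # hypothetical destination table

df = pd.DataFrame({
    "id": [1, 2],
    "data": [{"a": 1}, {"b": [2, 3]}],
})
# CSV cells are plain text, so serialize the JSON column to strings first.
df["data"] = df["data"].apply(json.dumps)

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # serialize via CSV, not Parquet
    schema=[
        bigquery.SchemaField("id", "INTEGER"),
        bigquery.SchemaField("data", "JSON"),
    ],
)

job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
job.result()
```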
For Googlers watching this issue: I have proposed a design, go/bf-json, which adds a JSONDtype to https://github.com/googleapis/python-db-dtypes-pandas and would allow us to autodetect when JSON is used in a DataFrame. Before that, though, we could choose the appropriate serialization format depending on the provided schema. For example, Parquet must be used with STRUCT/ARRAY columns, but CSV must be used for JSON.
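A rough sketch of that idea as a hypothetical helper (pick_source_format is not part of the library; it only illustrates the selection logic described above):

```python
from typing import Sequence

from google.cloud import bigquery


def pick_source_format(schema: Sequence[bigquery.SchemaField]) -> str:
    """Choose a load format from the schema: CSV for JSON columns, Parquet otherwise."""
    has_json = any(field.field_type == "JSON" for field in schema)
    has_nested = any(
        field.field_type in ("RECORD", "STRUCT") or field.mode == "REPEATED"
        for field in schema
    )
    # Parquet load jobs reject JSON columns, while CSV cannot carry
    # STRUCT/ARRAY columns, so the two requirements are mutually exclusive.
    if has_json and has_nested:
        raise ValueError(
            "No single load format supports both JSON and STRUCT/ARRAY columns"
        )
    return bigquery.SourceFormat.CSV if has_json else bigquery.SourceFormat.PARQUET
```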
googleapis/python-db-dtypes-pandas#284 can potentially fulfill this feature request.
Is your feature request related to a problem? Please describe.
python-bigquery does not currently appear to support uploading DataFrames where one of the columns in the destination table has the JSON dtype.
Quick partial code example:
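A minimal sketch of the failing pattern (table and column names are assumptions):

```python
import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # destination table with a JSON column

df = pd.DataFrame({"data": [{"a": 1}, {"b": 2}]})

job_config = bigquery.LoadJobConfig(
    schema=[bigquery.SchemaField("data", "JSON")],
)

# load_table_from_dataframe serializes to Parquet by default, and Parquet
# load jobs do not accept the JSON field type.
job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
job.result()
```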
Result:
google.api_core.exceptions.BadRequest: 400 Unsupported field type: JSON; reason: invalid, message: Unsupported field type: JSON
Describe the solution you'd like
Implement support for loading DataFrames that contain JSON columns into BigQuery.
Additional context
Related issues:
googleapis/python-bigquery-sqlalchemy#399
googleapis/python-bigquery-pandas#698
googleapis/python-bigquery-dataframes#816