
feat(BigQuery): Programmatic retries for a continuous query #13201

Open. Wants to merge 19 commits into base: main
Conversation

@norlove norlove commented Mar 4, 2025

Description

Fixes #

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

@norlove norlove requested review from chalmerlowe and a team as code owners March 4, 2025 07:12
@product-auto-label product-auto-label bot added api: bigquery Issues related to the BigQuery API. samples Issues that are directly related to samples. labels Mar 4, 2025
Contributor

@gemini-code-assist (bot) left a comment

Hello @norlove, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces programmatic retries for continuous queries in BigQuery. It adds a new Python script (programmatic-retries.py) that defines a Cloud Function to retry failed BigQuery queries. The function parses Pub/Sub messages containing log entries, extracts the failed SQL query, updates the timestamp in the query, and resubmits the query with a new job ID. Additionally, the pull request updates the requirements.txt and requirements-test.txt files to include necessary dependencies like functions-framework, google-cloud-bigquery, google-auth, and pytest.
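The parsing-and-extraction step described above might look like the following sketch. The function name and the log-entry key path are illustrative, taken from the fragments quoted later in this review, not the exact sample code:

```python
import base64
import json


def extract_failed_query(event):
    """Decode a Pub/Sub log-sink message and pull out the fields needed
    to retry a failed continuous query, or return None if any are missing."""
    log_entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    try:
        job = log_entry["protoPayload"]["metadata"]["jobChange"]["job"]
        sql_query = job["jobConfig"]["queryConfig"]["query"]
        failed_job_id = job["jobName"]
        end_timestamp = job["jobStats"]["endTime"]
    except KeyError:
        return None  # Not a job-change log entry this function can retry
    return sql_query, failed_job_id, end_timestamp
```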

Highlights

  • New Feature: Implements programmatic retries for BigQuery continuous queries using a Cloud Function triggered by Pub/Sub messages.
  • Timestamp Handling: Updates the timestamp in the failed SQL query to ensure the retried query processes data from the point of failure.
  • Dependency Management: Adds necessary dependencies to requirements.txt and requirements-test.txt for the Cloud Function to run correctly.

Changelog

  • bigquery/continuous-queries/programmatic-retries.py
    • Introduces a new Cloud Function retry_bigquery_query to handle retries.
    • Parses Pub/Sub messages to extract the failed SQL query and job ID.
    • Updates the timestamp in the SQL query based on the previous job's end time.
    • Resubmits the query with a new job ID using the BigQuery API.
    • Includes error handling and logging for debugging.
  • bigquery/continuous-queries/requirements-test.txt
    • Adds pytest==8.2.0 as a testing dependency.
  • bigquery/continuous-queries/requirements.txt
    • Adds functions-framework==3.*, google-cloud-bigquery, and google-auth as dependencies for the Cloud Function.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Trivia time!

What is the purpose of UUIDs (Universally Unique Identifiers) as used in the code?

Answer:
UUIDs are used to generate unique job IDs for the retried queries, preventing conflicts with previous job IDs and ensuring each retry is treated as a distinct job.
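A minimal illustration of the UUID-based job IDs described in the answer above. The prefix value here is a placeholder; the sample asks you to configure your own CUSTOM_JOB_ID_PREFIX:

```python
import uuid

# Placeholder prefix; the real sample expects you to provide your own.
CUSTOM_JOB_ID_PREFIX = "retry_"


def new_job_id(prefix=CUSTOM_JOB_ID_PREFIX):
    # uuid4() is random, so each retry gets a job ID that cannot
    # collide with the failed job's ID.
    return f"{prefix}{uuid.uuid4()}"
```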

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist (bot) left a comment

Code Review

The code introduces a mechanism to retry failed BigQuery continuous queries by extracting information from log entries and resubmitting the query with an updated timestamp. This is a valuable addition to improve the resilience of continuous queries. The implementation appears well-structured, but there are a few areas that could benefit from refinement.

Summary of Findings

  • Error Handling and Logging: The error handling could be improved by providing more context-specific error messages and potentially implementing retry logic with exponential backoff to avoid overwhelming the system.
  • Timestamp Handling: The logic for updating timestamps in the SQL query could be made more robust to handle various timestamp formats and edge cases.
  • Security Considerations: Storing the service account directly in the code is not recommended. It should be managed more securely, potentially through environment variables or a secrets management system.
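A minimal sketch of two of the findings above, under my own assumptions: the service account is read from an environment variable rather than hard-coded (SERVICE_ACCOUNT and MAX_RETRIES are placeholder names), and retries back off exponentially up to a cap:

```python
import os
import time

# Read the service account from the environment instead of the source code.
SERVICE_ACCOUNT = os.environ.get("SERVICE_ACCOUNT", "")
MAX_RETRIES = 5  # Assumed retry budget for illustration


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def wait_before_retry(attempt):
    # Stop retrying once the budget is exhausted, instead of looping forever.
    if attempt >= MAX_RETRIES:
        raise RuntimeError("Retry budget exhausted; giving up.")
    time.sleep(backoff_delay(attempt))
```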

Merge Readiness

The code provides a functional solution for retrying failed BigQuery continuous queries. However, before merging, it's recommended to address the error handling, timestamp handling, and security considerations outlined in the review comments. I am unable to directly approve this pull request, and recommend that others review and approve this code before merging.

@telpirion telpirion self-assigned this Mar 4, 2025
Contributor

@telpirion left a comment

This is a good start. There are some blocking issues that need to be addressed before we can merge.

# - How to handle limiting retries or using exponential backoff.

# Make sure you provide your SERVICE_ACCOUNT and CUSTOM_JOB_ID_PREFIX.


Author

Addressed via updated code.

# Check if 'protoPayload' exists
if 'protoPayload' in log_entry:
    # Extract the SQL query
    sql_query = log_entry['protoPayload']['metadata']['jobChange']['job']['jobConfig']['queryConfig']['query']
Contributor

issue: break this into multiple lines. This line is pretty long; I recommend putting the log_entry part of the expression between parentheses and moving it onto a subsequent line.
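The reviewer's suggestion could look like this, using a stub log_entry purely for illustration:

```python
# Stub log entry for illustration only.
log_entry = {
    "protoPayload": {
        "metadata": {
            "jobChange": {
                "job": {"jobConfig": {"queryConfig": {"query": "SELECT 1"}}}
            }
        }
    }
}

# The long lookup, wrapped in parentheses so it can span several lines.
sql_query = (
    log_entry["protoPayload"]["metadata"]["jobChange"]
    ["job"]["jobConfig"]["queryConfig"]["query"]
)
```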

Contributor

question: are we sure that this series of keys will always be present in the dict? It seems like there's a danger of a KeyError being raised.
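One defensive way to address the KeyError concern is a small helper that walks the nested keys safely; this is a hypothetical illustration, not part of the sample:

```python
def safe_get(entry, *keys):
    """Walk nested dict keys, returning None instead of raising KeyError."""
    for key in keys:
        if not isinstance(entry, dict):
            return None
        entry = entry.get(key)
        if entry is None:
            return None
    return entry
```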

Author

Addressed via updated code.

sql_query = log_entry['protoPayload']['metadata']['jobChange']['job']['jobConfig']['queryConfig']['query']

# Record Job ID that failed and will attempt to be restarted
failed_job_id = log_entry['protoPayload']['metadata']['jobChange']['job']['jobName']
Contributor

See previous about a potential KeyError.

Author

Addressed via updated code.

print(f"Retrying failed job: {failed_job_id}")

# Extract the endTime from the log entry
end_timestamp = log_entry['protoPayload']['metadata']['jobChange']['job']['jobStats']['endTime']
Contributor

See previous about a potential KeyError.

Author

Addressed via updated code.

log_entry = json.loads(base64.b64decode(event['data']).decode('utf-8'))

# Check if 'protoPayload' exists
if 'protoPayload' in log_entry:
Contributor

issue: reduce cyclomatic complexity. There is a series of nested ifs in this code; the more code paths there are, the harder the code is to read and to test.

Consider using the "Return Early Pattern" to reduce the number of branches.

See https://googlecloudplatform.github.io/samples-style-guide/#complexity.

I recommend switching to the
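The Return Early Pattern the reviewer recommends can be sketched as follows; this is a hypothetical reworking with illustrative key paths, not the sample's actual code:

```python
import base64
import json


def retry_handler(event):
    """Early returns keep the happy path flat instead of deeply nested."""
    if "data" not in event:
        return None  # Nothing to decode

    log_entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    if "protoPayload" not in log_entry:
        return None  # Not a log entry we care about

    query = (
        log_entry["protoPayload"].get("metadata", {})
        .get("jobChange", {}).get("job", {})
        .get("jobConfig", {}).get("queryConfig", {}).get("query")
    )
    if query is None:
        return None  # Required field missing

    return query
```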

Author

Addressed via updated code.

access_token = credentials.token

# API endpoint
url = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project}/jobs"

Author

Unfortunately the continuous queries flag isn't available yet in the BigQuery client. So right now this needs to be an API call. But we can change this in the future once the client has been updated.
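Since the flag is not yet in the Python client, the sample calls the REST API at the jobs endpoint shown above. A sketch of assembling that jobs.insert request; note that "continuous" is my assumption for the flag's REST field name and should be verified against the current BigQuery API reference:

```python
import json


def build_retry_request(project, job_id, sql, access_token):
    """Assemble a jobs.insert REST request for the retried query."""
    url = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project}/jobs"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    body = {
        "jobReference": {"projectId": project, "jobId": job_id},
        "configuration": {
            # "continuous" is an assumed field name; check the API docs.
            "query": {"query": sql, "useLegacySql": False, "continuous": True}
        },
    }
    return url, headers, json.dumps(body)
```

The returned triple can then be posted with any HTTP client; keeping request construction separate from the network call also makes it unit-testable.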

@glasnt glasnt added the waiting-response Waiting for the author's response. label Mar 5, 2025
snippet-bot commented Mar 5, 2025

Here is the summary of changes.

You are about to add 6 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.

@norlove norlove requested a review from telpirion March 5, 2025 06:55
Author

@norlove commented Mar 5, 2025

@telpirion I've gone ahead and updated the code. Can you please do another pass?

Contributor

@telpirion left a comment

Two last items, then this is good to go:

  1. Ensure that you include imports inside the sample region_tags.
  2. Provide an integration test for this sample. Ping me if you have questions about how to best do this.

# [START functions_bigquery_retry_decode]
# Decode and parse the Pub/Sub message data
log_entry = json.loads(base64.b64decode(event['data']).decode('utf-8'))
# [END functions_bigquery_retry_decode]
Contributor

issue: include the import statements in the code sample.

See: https://googlecloudplatform.github.io/samples-style-guide/#imports
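Per the style-guide rule linked above, imports belong inside the region tags so the published snippet is copy-paste runnable. A sketch using the region tag name visible in this thread (the helper function name is illustrative):

```python
# [START functions_bigquery_retry_decode]
import base64
import json


def decode_event(event):
    # Decode and parse the Pub/Sub message data.
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))
# [END functions_bigquery_retry_decode]
```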

Author

Done.

# [END functions_bigquery_retry_extract_query]

# Check if required fields are missing
if not all([sql_query, failed_job_id, end_timestamp]):
Contributor

praise: nice use of the Return Early Pattern! Thanks.

Author

You can thank Gemini! LOL

Author

@norlove commented Mar 7, 2025

@telpirion I've made the requested changes and added an integration test which I validated works successfully.

Back to you!

@norlove norlove requested a review from telpirion March 7, 2025 05:29
Contributor

@telpirion left a comment

Looks good! Thank you for your flexibility and responsiveness.

Contributor

@telpirion commented
Looks like all that's needed is to run black on this and it's ready to submit!
