@klei22 klei22 commented Oct 19, 2025

This pull request introduces a new feature to the get_parquet_dataset.py utility, allowing users to download and convert parquet files to JSON without emitting the results to a text file. This is achieved by adding a --skip_emit command-line option and updating the processing logic to honor this flag.

New feature: selective emission of JSON contents

  • Added a --skip_emit argument to the CLI, which lets users skip emitting JSON contents to the output text file, enabling download and conversion only.
  • Updated the signature of the main function and its invocation to accept the new skip_emit parameter.
  • Modified logic so that if skip_emit is set, the output text file is not cleared and JSON contents are not emitted during processing.
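
For reviewers who want the intended behavior at a glance, here is a minimal sketch of how a flag like --skip_emit might gate the clearing and emission steps. It is not the code from this PR: the names build_parser, process, --output_text_file, and the "text" record key are hypothetical, and the download and parquet-to-JSON steps are left out entirely.

```python
import argparse
import json
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with a boolean --skip_emit flag (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Convert dataset files to JSON and optionally emit text."
    )
    # Hypothetical option name; the real script may name its output differently.
    parser.add_argument("--output_text_file", default="input.txt")
    parser.add_argument(
        "--skip_emit",
        action="store_true",
        help="Skip emitting JSON contents to the output text file.",
    )
    return parser


def process(json_paths, output_text_file, skip_emit=False) -> int:
    """Emit the 'text' field of each JSON record unless skip_emit is set.

    Returns the number of records written to the output text file.
    """
    out_path = Path(output_text_file)

    # Only clear the output file when we intend to write to it.
    if not skip_emit:
        out_path.write_text("", encoding="utf-8")

    emitted = 0
    for json_path in json_paths:
        # Assumes each JSON file holds a list of {"text": ...} records.
        records = json.loads(Path(json_path).read_text(encoding="utf-8"))
        if skip_emit:
            # Download/convert-only mode: leave the output file untouched.
            continue
        with out_path.open("a", encoding="utf-8") as f:
            for record in records:
                f.write(record["text"] + "\n")
                emitted += 1
    return emitted
```

With this shape, calling process(paths, "input.txt", skip_emit=True) leaves any existing input.txt as it was, while the default call truncates the file and appends each record.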

@klei22 klei22 requested review from Copilot and gkielian October 19, 2025 05:23
Copilot AI left a comment

Pull Request Overview

This PR adds a new --skip_emit command-line option to the get_parquet_dataset.py utility, allowing users to download and convert parquet files to JSON without writing the JSON contents to a text output file.

  • Added --skip_emit CLI argument with appropriate help text
  • Modified the main function signature to accept the new skip_emit parameter
  • Updated processing logic to conditionally skip file clearing and JSON emission when the flag is set
