504 Gateway Timeout when uploading large dataset to Hugging Face Hub #7400

Open

hotchpotch opened this issue Feb 14, 2025 · 4 comments

@hotchpotch

Description

I encountered consistent 504 Gateway Timeout errors while attempting to upload a large dataset (approximately 500 GB) to the Hugging Face Hub. The upload fails partway through with a Gateway Timeout error.

I will continue trying to upload. While it might succeed in future attempts, I wanted to report this issue in the meantime.

Reproduction

  • I attempted the upload 3 times
  • Each attempt failed with the same 504 error partway through the upload, not at the start
  • Using the dataset.push_to_hub() method (a minimal sketch follows)
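
For reference, a minimal sketch of the failing call, reconstructed from the traceback below; the load step and local path are placeholders, and only the push_to_hub line appears in the actual script:

```python
from datasets import load_from_disk

# Placeholder load step; only the push_to_hub call below is taken from the
# traceback (upload_edu_japanese_ds.py, line 12).
ds = load_from_disk("/path/to/edu_japanese_ds")
ds.push_to_hub("hotchpotch/fineweb-2-edu-japanese", private=True)
```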

Environment Information

- huggingface_hub version: 0.28.0
- Platform: Linux-6.8.0-52-generic-x86_64-with-glibc2.39
- Python version: 3.11.10
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: /home/hotchpotch/.cache/huggingface/token
- Has saved token ?: True
- Who am I ?: hotchpotch
- Configured git credential helpers: store
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.5.1
- Jinja2: 3.1.5
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: 10.4.0
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.10.6
- aiohttp: 3.11.11
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /home/hotchpotch/.cache/huggingface/hub
- HF_ASSETS_CACHE: /home/hotchpotch/.cache/huggingface/assets
- HF_TOKEN_PATH: /home/hotchpotch/.cache/huggingface/token
- HF_STORED_TOKENS_PATH: /home/hotchpotch/.cache/huggingface/stored_tokens
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10

Full Error Traceback

Traceback (most recent call last):
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 406, in hf_raise_for_status
    response.raise_for_status()
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://huggingface.co/datasets/hotchpotch/fineweb-2-edu-japanese.git/info/lfs/objects/batch

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/create_edu_japanese_ds/upload_edu_japanese_ds.py", line 12, in <module>
    ds.push_to_hub("hotchpotch/fineweb-2-edu-japanese", private=True)
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/datasets/dataset_dict.py", line 1665, in push_to_hub
    split_additions, uploaded_size, dataset_nbytes = self[split]._push_parquet_shards_to_hub(
                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 5301, in _push_parquet_shards_to_hub
    api.preupload_lfs_files(
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 4215, in preupload_lfs_files
    _upload_lfs_files(
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/huggingface_hub/_commit_api.py", line 395, in _upload_lfs_files
    batch_actions_chunk, batch_errors_chunk = post_lfs_batch_info(
                                              ^^^^^^^^^^^^^^^^^^^^
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/huggingface_hub/lfs.py", line 168, in post_lfs_batch_info
    hf_raise_for_status(resp)
  File "/home/hotchpotch/src/github.com/hotchpotch/fineweb-2-edu-classifier-japanese/.venv/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 477, in hf_raise_for_status
    raise _format(HfHubHTTPError, str(e), response) from e
huggingface_hub.errors.HfHubHTTPError: 504 Server Error: Gateway Time-out for url: https://huggingface.co/datasets/hotchpotch/fineweb-2-edu-japanese.git/info/lfs/objects/batch
Wauplin transferred this issue from huggingface/huggingface_hub on Feb 14, 2025
@Wauplin (Contributor) commented Feb 14, 2025

I transferred this issue to the datasets repository. Is there any retry mechanism in datasets, @lhoestq?

Another option, @hotchpotch, if you want to push your dataset to the Hub in a robust way, is to save it to a local folder first and then use huggingface-cli upload-large-folder (see https://huggingface.co/docs/huggingface_hub/guides/upload#upload-a-large-folder). It has a better retry mechanism in case of failure.
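
A sketch of that two-step workflow, assuming the dataset is a DatasetDict; the input path, output folder name, shard count, and shard naming scheme are all placeholder assumptions, while shard() and to_parquet() are the datasets methods for splitting and writing parquet:

```python
# Sketch: write each split to local parquet shards, then upload the folder
# with the resumable large-folder uploader. Paths and shard count are
# illustrative assumptions.
import os
from datasets import load_from_disk

ds = load_from_disk("/path/to/edu_japanese_ds")  # placeholder path
out = "fineweb-2-edu-japanese-local"
num_shards = 512  # arbitrary; chosen so each shard stays well under a few GB

os.makedirs(f"{out}/data", exist_ok=True)
for split, dset in ds.items():
    for i in range(num_shards):
        shard = dset.shard(num_shards=num_shards, index=i)
        shard.to_parquet(f"{out}/data/{split}-{i:05d}-of-{num_shards:05d}.parquet")

# Then, from the shell:
#   huggingface-cli upload-large-folder hotchpotch/fineweb-2-edu-japanese \
#       fineweb-2-edu-japanese-local --repo-type=dataset
```

Unlike a single push_to_hub() call, the large-folder uploader tracks per-file progress locally, so an interrupted upload can be resumed by re-running the same command.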

@lhoestq (Member) commented Feb 14, 2025

There is no retry mechanism for api.preupload_lfs_files in push_to_hub(), but we can definitely add one here:

api.preupload_lfs_files(
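
Until such a retry lands in datasets, one caller-side workaround is to wrap the whole push in a retry loop. A minimal sketch, where the retry count and backoff values are arbitrary choices and HfHubHTTPError is the exception type shown in the traceback above:

```python
# Sketch: retry the whole push on transient Hub errors. Retry count and
# backoff are arbitrary; LFS objects already on the Hub are deduplicated
# by hash, so a retry should not re-upload completed shards.
import time
from huggingface_hub.errors import HfHubHTTPError

def push_with_retries(ds, repo_id, max_retries=5, **kwargs):
    for attempt in range(max_retries):
        try:
            return ds.push_to_hub(repo_id, **kwargs)
        except HfHubHTTPError as err:
            if attempt == max_retries - 1:
                raise
            wait = 10 * 2**attempt  # exponential backoff: 10s, 20s, 40s, ...
            print(f"Push failed ({err}); retrying in {wait}s")
            time.sleep(wait)

push_with_retries(ds, "hotchpotch/fineweb-2-edu-japanese", private=True)
```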

@hotchpotch (Author)

@Wauplin

Thank you! My understanding is that, for load_dataset() to later read the data from Hugging Face, I first need to save the parquet files and the markdown metadata to my local filesystem and then upload them with upload-large-folder. If you know how to do this, could you please let me know?
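
(For reference, a minimal sketch of that metadata step, assuming the shard layout from the earlier sketch; the YAML configs block in the dataset card README is the Hub's documented way to map parquet files to splits, and the glob pattern here is an assumption:)

```python
# Sketch: write a minimal README.md (dataset card) so load_dataset() can map
# the uploaded parquet shards to splits. The "configs" YAML block is the Hub
# convention for this; the path glob assumes the earlier shard naming.
readme = """---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
"""
with open("fineweb-2-edu-japanese-local/README.md", "w") as f:
    f.write(readme)
```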

@hotchpotch (Author)

@lhoestq

I see, so adding a retry mechanism there would solve it. If I continue to have issues, I'll consider implementing that kind of solution.
