
Improve modern GPU compatibility and document RTX 5090 validation #15

Open
XiaoBinGan wants to merge 1 commit into FareedKhan-dev:main from XiaoBinGan:rtx5090-validation-and-compat

@XiaoBinGan

Summary

This PR improves modern GPU usability and documents a successful community validation run on an NVIDIA GeForce RTX 5090.

Changes included

  • add automatic device selection in config/config.py

    • uses CUDA when available
    • falls back to CPU otherwise
  • add lightweight runtime diagnostics in scripts/train_transformer.py

    • PyTorch version
    • CUDA version
    • configured device
    • GPU name
    • GPU capability
    • total VRAM
    • step time
    • throughput
    • elapsed time between eval intervals
    • peak VRAM allocated / reserved
    • checkpoint metadata for device / PyTorch / CUDA version
  • improve checkpoint loading compatibility in scripts/generate_text.py

    • supports newer PyTorch versions where torch.load behavior changed
  • update README.md

    • add community-validated RTX 5090 note
    • document a successful run of the official README flow with the 13M configuration
    • record tested environment and observed training behavior
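For readers skimming the description, the automatic device selection can be sketched roughly as follows. This is an illustration of the "CUDA when available, CPU otherwise" rule, not the code in this PR's diff: `select_device` and `DEVICE` are hypothetical names, and the lazy import is only there so the sketch also runs on a machine without PyTorch.

```python
def select_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch runs even without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# Hypothetical config-level constant; the real name in config/config.py may differ.
DEVICE = select_device()
print(f"configured device: {DEVICE}")
```

Resolving the device once in the config keeps the training and generation scripts free of their own CUDA checks.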

Validation performed

Tested on:

  • GPU: NVIDIA GeForce RTX 5090
  • PyTorch: 2.11.0+cu128
  • CUDA: 12.8
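To report a comparable environment from another machine, something like the sketch below collects the same fields (PyTorch version, CUDA version, GPU name, capability, total VRAM). `describe_environment` is a hypothetical helper, not a function from this repository, and it assumes the GPU of interest is device index 0.

```python
def describe_environment() -> dict:
    """Collect basic PyTorch / CUDA / GPU facts for a validation report."""
    info = {
        "torch": None,
        "cuda": None,
        "device": "cpu",
        "gpu_name": None,
        "capability": None,
        "total_vram_gib": None,
    }
    try:
        import torch  # lazy import: the sketch degrades gracefully without PyTorch
    except ImportError:
        return info

    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None on CPU-only builds
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)  # assumption: GPU index 0
        info["device"] = "cuda"
        info["gpu_name"] = props.name
        info["capability"] = f"{props.major}.{props.minor}"
        info["total_vram_gib"] = round(props.total_memory / 1024**3, 2)
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```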

Validated successfully:

  • official dataset download flow
  • official preprocessing flow
  • README-recommended 13M training flow
  • checkpoint saving
  • checkpoint loading
  • text generation
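The checkpoint loading step is the one most likely to break across PyTorch versions. The sketch below assumes the `torch.load` change referred to above is the `weights_only` default, which became `True` in PyTorch 2.6; the description does not say so explicitly. `load_checkpoint` and its `load_fn` parameter are hypothetical and are not the code in scripts/generate_text.py.

```python
def load_checkpoint(path, device="cpu", load_fn=None):
    """Load a trusted checkpoint on both newer and older PyTorch releases.

    load_fn defaults to torch.load; it is injectable so the fallback logic
    can be exercised without PyTorch installed.
    """
    if load_fn is None:
        import torch
        load_fn = torch.load
    try:
        # weights_only=False restores full pickled objects (e.g. config dicts).
        # Only do this for checkpoints you produced or otherwise trust.
        return load_fn(path, map_location=device, weights_only=False)
    except TypeError:
        # Releases that predate the weights_only argument reject the keyword.
        return load_fn(path, map_location=device)
```

A usage call would look like `checkpoint = load_checkpoint("model.pt", device="cuda")`, where the file name is a placeholder. Catching `TypeError` is a blunt way to detect the missing keyword; a stricter variant could inspect the PyTorch version instead.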

Notes

  • The RTX 5090 validation in this PR confirms the official 13M workflow.
  • Larger configurations are still hardware/config dependent and are not claimed as fully benchmarked here.
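For anyone who wants to benchmark a larger configuration, the step time and throughput diagnostics listed under Changes reduce to a small amount of arithmetic. The sketch below is an illustration under assumptions, not the logging code in scripts/train_transformer.py: `timed_step`, `tokens_per_second`, and the batch and block sizes are hypothetical. On a GPU, pass `torch.cuda.synchronize` as `sync_fn`, because CUDA kernels run asynchronously and the wall clock would otherwise stop before the work finishes.

```python
import time


def timed_step(step_fn, sync_fn=None):
    """Run one training step and return its wall-clock duration in seconds."""
    if sync_fn is not None:
        sync_fn()  # drain pending GPU work before starting the clock
    start = time.perf_counter()
    step_fn()
    if sync_fn is not None:
        sync_fn()  # wait for this step's GPU work before stopping the clock
    return time.perf_counter() - start


def tokens_per_second(batch_size, block_size, step_seconds):
    """Throughput of one step: tokens processed divided by step time."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return batch_size * block_size / step_seconds


# Illustrative numbers only: 16 sequences of 256 tokens in a 0.5 s step.
print(tokens_per_second(16, 256, 0.5))
```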

@XiaoBinGan
Author

Tested locally on an RTX 5090 with PyTorch 2.11.0+cu128 and CUDA 12.8.

I followed the repo's official dataset download + preprocessing flow, switched to the README-recommended 13M configuration for validation, completed training successfully, saved a checkpoint, and verified checkpoint loading + text generation.
