A simple Python script for monitoring GPUs on remote computing servers. Visdom is used as the backend for visualization.
This script was written for our own use and is not designed to work in other environments in general.
- Install the required dependencies:
pip install -r requirements.txt
- Start the Visdom server:
python -m visdom.server
- Run the GPU monitor:
python gpu_list.py --env GPUs
For standalone GPU status checking:
python py_gpu_status.py --mode memory # Get memory usage
python py_gpu_status.py --mode gpu # Get GPU utilization
python py_gpu_status.py --mode all # Get both
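The three `--mode` values above could map onto `nvidia-smi` query fields roughly as follows. This is a minimal sketch, not the actual contents of `py_gpu_status.py`: the names `MODE_FIELDS`, `build_parser`, and `build_command` are hypothetical, and it assumes the status is read through `nvidia-smi --query-gpu`.

```python
import argparse

# Hypothetical mapping from --mode values to nvidia-smi query fields.
MODE_FIELDS = {
    "memory": ["memory.used", "memory.total"],
    "gpu": ["utilization.gpu"],
    "all": ["memory.used", "memory.total", "utilization.gpu"],
}


def build_parser():
    """Build a parser that accepts only the three documented modes."""
    parser = argparse.ArgumentParser(description="Standalone GPU status check (sketch)")
    parser.add_argument(
        "--mode",
        choices=sorted(MODE_FIELDS),
        default="all",  # assumed default; the real script may differ
        help="which statistics to report",
    )
    return parser


def build_command(mode):
    """Return the nvidia-smi argument list for the given mode."""
    fields = ",".join(MODE_FIELDS[mode])
    return ["nvidia-smi", f"--query-gpu={fields}", "--format=csv,noheader,nounits"]


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(["--mode", "memory"])
print(" ".join(build_command(args.mode)))
```

Restricting `--mode` with `choices` makes argparse reject an unknown mode with a usage message before any query is run.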
Fixes and improvements:
- Fixed critical scoping issues that prevented module imports
- Added proper error handling for subprocess calls and file operations
- Added division by zero protection for memory calculations
- Improved robustness when dependencies are missing
- Added input validation and fallback values
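As an illustration of the kind of protection listed above, the following sketch guards a memory query against a failing subprocess call, malformed output, and a zero total. It is not the code from this repository: the function names `memory_percent`, `parse_memory_csv`, and `query_memory` are hypothetical, and it assumes `nvidia-smi` is the data source and that 0.0 is an acceptable fallback value.

```python
import subprocess


def memory_percent(used_mib, total_mib):
    """Return used/total as a percentage, or 0.0 when total is not positive."""
    if total_mib <= 0:
        return 0.0  # division-by-zero protection
    return 100.0 * used_mib / total_mib


def parse_memory_csv(text):
    """Parse 'used, total' lines; malformed lines fall back to 0.0 percent."""
    percents = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        try:
            used, total = float(parts[0]), float(parts[1])
        except (IndexError, ValueError):
            used, total = 0.0, 0.0  # fallback for values such as "[N/A]"
        percents.append(memory_percent(used, total))
    return percents


def query_memory(timeout=10):
    """Query nvidia-smi; return an empty list if the call fails for any reason."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=timeout
        )
    except (OSError, subprocess.CalledProcessError, subprocess.TimeoutExpired):
        # nvidia-smi is missing, exited with an error, or hung.
        return []
    return parse_memory_csv(result.stdout)
```

Keeping the parsing separate from the subprocess call means the fallback behaviour can be checked on a machine without a GPU, for example with `parse_memory_csv("1024, 4096")`.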