Stand and Wave

Train a humanoid to stand up and wave using reinforcement learning with GPU acceleration.

Project Overview

This project implements a reinforcement learning solution to teach a humanoid robot two sequential tasks:

Stand up and maintain balance
Wave with one arm while maintaining a standing position

The implementation leverages:

dm_control: For the physics-based humanoid environment
stable-baselines3: For the Proximal Policy Optimization (PPO) algorithm
Gymnasium: For standardized environment interfaces
GPU Acceleration: For faster training with parallel environments

Installation

# Clone the repository
git clone https://github.com/andrewtkent/humanoid-wave-rl.git
cd humanoid-wave-rl

# Create a virtual environment called 'wave'
python -m venv wave

# Activate the virtual environment
source wave/bin/activate

# Upgrade pip
pip install --upgrade pip

# Install dependencies
pip install -r requirements.txt

Usage

Training

# Train with default parameters (16 parallel environments)
python main.py

# Train with custom settings
xvfb-run -a python main.py --total_timesteps 3000000 --num_envs 24 --wandb --device cuda

# Resume training from a saved checkpoint
xvfb-run -a python main.py --resume_from results/ppo_humanoid_20250415_162622/checkpoints/humanoid_final.zip --total_timesteps 5000000 --num_envs 325 --wandb --wandb_id gallant-energy-44 --device cpu

Evaluation

# Evaluate and record video of the trained model
python main.py --mode evaluate --model_path results/humanoid_wave_final.zip

Video

On headless systems, such as servers without a display, render under xvfb (the recommended approach):

# Generate and save video of the trained model using xvfb for headless rendering
xvfb-run -a python render_video.py --model_path results/humanoid_stand_final.zip --output_path output/humanoid_video.mp4

Note: If you encounter the following error:

ValueError: Could not find a backend to open `output/humanoid_video.mp4`` with iomode `wI`.
Based on the extension, the following plugins might add capable backends:
  FFMPEG:  pip install imageio[ffmpeg]
  pyav:  pip install imageio[pyav]

Install the required backend with:

pip install "imageio[ffmpeg]"

If a display is available, you can run the script without xvfb, though this is not recommended on most servers:

python render_video.py --model_path results/humanoid_stand_final.zip --output_path output/humanoid_video.mp4

Performance Optimization

The implementation includes several optimizations for better performance:

Vectorized Environments: Runs multiple environments in parallel to maximize throughput
GPU Acceleration: Uses CUDA for neural network computation when available
Optimized Hyperparameters: Batch size and network architecture tuned for GPU performance
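
To illustrate how the number of parallel environments interacts with batch size, the sketch below computes how many transitions PPO collects per update and how many minibatches that yields. It assumes stable-baselines3's default n_steps (2048) and batch_size (64), not this project's tuned values, which are not listed here. `rollout_stats` is a hypothetical helper, not part of the repository.

```python
def rollout_stats(num_envs: int, n_steps: int = 2048, batch_size: int = 64) -> dict:
    """Summarize one PPO update for a given number of parallel environments.

    n_steps and batch_size default to stable-baselines3's PPO defaults;
    the values this project actually uses are an assumption.
    """
    rollout_size = num_envs * n_steps  # transitions collected before each update
    full_batches, remainder = divmod(rollout_size, batch_size)
    return {
        "rollout_size": rollout_size,
        # A leftover partial minibatch still counts as one gradient step.
        "minibatches_per_epoch": full_batches + (1 if remainder else 0),
        "evenly_divisible": remainder == 0,
    }


stats = rollout_stats(num_envs=16)  # 16 is the default environment count above
print(stats)
```

Choosing a batch size that divides the rollout size evenly avoids a truncated final minibatch in each epoch.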

Approach

Environment Wrapper

The project includes a custom wrapper that adapts the dm_control humanoid environment to the Gymnasium interface that stable-baselines3 expects. The wrapper is responsible for:

Converting observations to a flat vector
Handling step and reset functions
Implementing the reward shaping for waving behavior
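
The observation-flattening step can be sketched as follows. dm_control returns observations as an ordered mapping of named arrays, while stable-baselines3 policies typically consume a single vector. The function name, keys, and shapes below are illustrative assumptions, not the repository's actual code.

```python
from collections import OrderedDict

import numpy as np


def flatten_observation(obs) -> np.ndarray:
    """Concatenate every entry of a dict observation into one float32 vector."""
    parts = [np.asarray(value, dtype=np.float32).ravel() for value in obs.values()]
    return np.concatenate(parts)


# Stand-in for a dm_control observation; keys and shapes are illustrative only.
example_obs = OrderedDict(
    joint_angles=np.zeros(21),
    head_height=np.array(1.4),  # scalars become length-1 vectors
    velocity=np.ones((3, 2)),   # multi-dimensional entries are raveled
)
flat = flatten_observation(example_obs)
print(flat.shape)  # (28,)
```

Iterating in the mapping's own order keeps each feature at a fixed index from step to step, which the policy network relies on.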

Reward Design

The reward function combines:

Standing Reward: The original reward from dm_control that focuses on height and balance
Wave Reward: A custom reward that encourages arm movement
- Tracks specific arm joint positions
- Rewards oscillatory movement when the arm is elevated
- Uses a sinusoidal pattern as the target motion
Time-Based Reward: A reward component that increases linearly with standing duration
- Encourages sustained stability over time
- Caps at a maximum value to avoid overwhelming other reward components
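
A minimal sketch of the capped time-based component and of one way the three terms might be combined. The rate, cap, and weight values are placeholders chosen for illustration, and measuring duration in environment steps is an assumption; the project's actual constants and weighting are not stated here.

```python
def time_bonus(steps_standing: int, rate: float = 0.001, cap: float = 0.5) -> float:
    """Bonus that grows linearly with consecutive standing steps, up to `cap`."""
    return min(rate * steps_standing, cap)


def total_reward(stand: float, wave: float, steps_standing: int,
                 wave_weight: float = 0.5) -> float:
    """Combine the terms as a weighted sum (the weighting is an assumption)."""
    return stand + wave_weight * wave + time_bonus(steps_standing)


print(time_bonus(100))     # still in the linear region
print(time_bonus(10_000))  # clipped at the cap
print(total_reward(stand=1.0, wave=0.8, steps_standing=10_000))
```

Because the bonus is clipped, a long episode cannot let the time term dominate the standing and wave terms.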

The wave component identifies the right arm joints by name or position, then rewards the agent for matching a sinusoidal wave motion while it stays balanced. Standing remains the primary objective; waving is a secondary task that becomes more achievable once stable standing is mastered.
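
The joint selection and sinusoidal tracking described above could look roughly like the sketch below. The joint-name pattern, amplitude, frequency, tolerance, and the Gaussian-shaped tracking score are all illustrative assumptions; the repository may select joints and score the motion differently.

```python
import math


def right_arm_indices(joint_names):
    """Return indices of joints whose names look like right-arm joints (assumed naming)."""
    return [
        i for i, name in enumerate(joint_names)
        if name.startswith("right_") and ("shoulder" in name or "elbow" in name)
    ]


def wave_reward(joint_angle, t, standing,
                amplitude=0.5, frequency=1.0, tolerance=0.25):
    """Score in (0, 1] for tracking a sinusoidal target angle; zero unless standing."""
    if not standing:
        return 0.0
    target = amplitude * math.sin(2.0 * math.pi * frequency * t)
    error = joint_angle - target
    return math.exp(-((error / tolerance) ** 2))


names = ["abdomen_z", "right_hip_x", "right_shoulder1",
         "right_shoulder2", "right_elbow", "left_elbow"]
print(right_arm_indices(names))                # [2, 3, 4]
print(wave_reward(0.0, t=0.0, standing=True))  # 1.0
```

Gating the wave term on standing keeps balance as the primary objective: the agent earns nothing for arm motion until it is upright.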
