- Generate a new Python virtual environment with Python 3.8 using
conda create -n myenv python=3.8
. - For the best performance, we recommend using NVIDIA driver version 525
sudo apt install nvidia-driver-525
. The minimal driver version supported is 515. If you're unable to install version 525, ensure that your system has at least version 515 to maintain basic functionality. - Install PyTorch 1.13 with Cuda-11.7:
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
- Install numpy-1.23 with
conda install numpy=1.23
. - Install Isaac Gym:
- Download and install Isaac Gym Preview 4 from https://developer.nvidia.com/isaac-gym.
cd isaacgym/python && pip install -e .
- Run an example with
cd examples && python 1080_balls_of_solitude.py
. - Consult
isaacgym/docs/index.html
for troubleshooting.
- Install humanoid-gym:
- Clone this repository.
pip install -e .
# Launching PPO Policy Training for 'v1' Across 4096 Environments
# This command initiates the PPO algorithm-based training for the humanoid task.
python scripts/train.py --task=DM_ppo --run_name v1 --headless --num_envs 4096
# Evaluating the Trained PPO Policy 'v1'
# This command loads the 'v1' policy for performance assessment in its environment.
# Additionally, it automatically exports a JIT model, suitable for deployment purposes.
python scripts/play.py --task=humanoid_ppo --run_name v1
# Implementing Simulation-to-Simulation Model Transformation
# This command facilitates a sim-to-sim transformation using exported 'v1' policy.
python scripts/sim2sim.py --load_model /path/to/logs/DM_ppo/exported/policies/policy_1.pt
# Run our trained policy
python scripts/sim2sim.py --load_model /path/to/logs/DM_ppo/exported/policies/policy_example.pt
-
humanoid_ppo
- Purpose: Baseline, PPO policy, Multi-frame low-level control
- Observation Space: Variable
$(47 \times H)$ dimensions, where$H$ is the number of frames $[O_{t-H} ... O_t]$ - Privileged Information:
$73$ dimensions
-
humanoid_dwl (coming soon)
- Training Command: For training the PPO policy, execute:
python humanoid/scripts/train.py --task=DM_ppo --load_run log_file_path --name run_name
- Running a Trained Policy: To deploy a trained PPO policy, use:
python humanoid/scripts/play.py --task=DM_ppo --load_run log_file_path --name run_name
- By default, the latest model of the last run from the experiment folder is loaded. However, other run iterations/models can be selected by adjusting
load_run
andcheckpoint
in the training config.
- Please note: Before initiating the sim-to-sim process, ensure that you run
play.py
to export a JIT policy. - Mujoco-based Sim2Sim Deployment: Utilize Mujoco for executing simulation-to-simulation (sim2sim) deployments with the command below:
python scripts/sim2sim.py --load_model /path/to/export/model.pt
- CPU and GPU Usage: To run simulations on the CPU, set both
--sim_device=cpu
and--rl_device=cpu
. For GPU operations, specify--sim_device=cuda:{0,1,2...}
and--rl_device={0,1,2...}
accordingly. Please note thatCUDA_VISIBLE_DEVICES
is not applicable, and it's essential to match the--sim_device
and--rl_device
settings. - Headless Operation: Include
--headless
for operations without rendering. - Rendering Control: Press 'v' to toggle rendering during training.
- Policy Location: Trained policies are saved in
humanoid/logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt
.
For RL training, please refer to humanoid/utils/helpers.py#L161
.
For the sim-to-sim process, please refer to humanoid/scripts/sim2sim.py#L169
.
- Every environment hinges on an
env
file (legged_robot.py
) and aconfiguration
file (legged_robot_config.py
). The latter houses two classes:LeggedRobotCfg
(encompassing all environmental parameters) andLeggedRobotCfgPPO
(denoting all training parameters). - Both
env
andconfig
classes use inheritance. - Non-zero reward scales specified in
cfg
contribute a function of the corresponding name to the sum-total reward. - Tasks must be registered with
task_registry.register(name, EnvClass, EnvConfig, TrainConfig)
. Registration may occur withinenvs/__init__.py
, or outside of this repository.
The base environment legged_robot
constructs a rough terrain locomotion task. The corresponding configuration does not specify a robot asset (URDF/ MJCF) and no reward scales.
- If you need to add a new environment, create a new folder in the
envs/
directory with a configuration file named<your_env>_config.py
. The new configuration should inherit from existing environment configurations. - If proposing a new robot:
- Insert the corresponding assets in the
resources/
folder. - In the
cfg
file, set the path to the asset, define body names, default_joint_positions, and PD gains. Specify the desiredtrain_cfg
and the environment's name (python class). - In the
train_cfg
, set theexperiment_name
andrun_name
.
- Insert the corresponding assets in the
- If needed, create your environment in
<your_env>.py
. Inherit from existing environments, override desired functions and/or add your reward functions. - Register your environment in
humanoid/envs/__init__.py
. - Modify or tune other parameters in your
cfg
orcfg_train
as per requirements. To remove the reward, set its scale to zero. Avoid modifying the parameters of other environments! - If you want a new robot/environment to perform sim2sim, you may need to modify
humanoid/scripts/sim2sim.py
:- Check the joint mapping of the robot between MJCF and URDF.
- Change the initial joint position of the robot according to your trained policy.
Observe the following cases:
# error
ImportError: libpython3.8.so.1.0: cannot open shared object file: No such file or directory
# solution
# set the correct path
export LD_LIBRARY_PATH="~/miniconda3/envs/your_env/lib:$LD_LIBRARY_PATH"
# OR
sudo apt install libpython3.8
# error
AttributeError: module 'distutils' has no attribute 'version'
# solution
# install pytorch 1.12.0
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
# error, results from libstdc++ version distributed with conda differing from the one used on your system to build Isaac Gym
ImportError: /home/roboterax/anaconda3/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.20` not found (required by /home/roboterax/carbgym/python/isaacgym/_bindings/linux64/gym_36.so)
# solution
mkdir ${YOUR_CONDA_ENV}/lib/_unused
mv ${YOUR_CONDA_ENV}/lib/libstdc++* ${YOUR_CONDA_ENV}/lib/_unused
Please cite the following if you use this code or parts of it:
@article{gu2024humanoid,
title={Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer},
author={Gu, Xinyang and Wang, Yen-Jen and Chen, Jianyu},
journal={arXiv preprint arXiv:2404.05695},
year={2024}
}
@inproceedings{gu2024advancing,
title={Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning},
author={Gu, Xinyang and Wang, Yen-Jen and Zhu, Xiang and Shi, Chengming and Guo, Yanjiang and Liu, Yichen and Chen, Jianyu},
booktitle={Robotics: Science and Systems},
year={2024},
url={https://enriquecoronadozu.github.io/rssproceedings2024/rss20/p058.pdf}
}
The implementation of Humanoid-Gym relies on resources from legged_gym and rsl_rl projects, created by the Robotic Systems Lab. We specifically utilize the LeggedRobot
implementation from their research to enhance our codebase.