# GA-DDPG

[[website](https://sites.google.com/view/gaddpg), [paper](https://arxiv.org/abs/2010.00824)]

### Installation

0. Setup: Ubuntu 16.04 or above, CUDA 10.0 or above, Python 2.7 / 3.6.

1. * (Required for training) Install the [OMG](https://github.com/liruiw/OMG-Planner) submodule and reuse its conda environment.
   * (Demo only) Install GA-DDPG inside a new conda environment:
   ```bash
   conda create --name gaddpg python=3.6.9
   conda activate gaddpg
   pip install -r requirements.txt
   ```
2. Install [PointNet++](https://github.com/liruiw/Pointnet2_PyTorch).

3. Download the environment data: ```bash experiments/scripts/download_data.sh```

### Pretrained Model Demo
0. Download the pretrained models: ```bash experiments/scripts/download_model.sh```
1. Run the demo test: ```bash experiments/scripts/test_demo.sh```

Example 1 | Example 2
:-------------------------:|:-------------------------:
<img src="assets/demo.gif" width="224" height="224"/> | <img src="assets/demo3.gif" width="224" height="224"/>

### Save Data and Offline Training
0. Download example offline data: ```bash experiments/scripts/download_offline_data.sh``` The .npz dataset is saved in data/offline_data and can be loaded for training (see the loading sketch after this list).
1. To free the extra GPUs otherwise used for online rollouts, use the offline training script: ```bash ./experiments/scripts/train_offline.sh bc_aux_dagger.yaml BC```
2. To save a dataset during online training: ```bash ./experiments/scripts/train_online_save_buffer.sh bc_save_data.yaml BC```
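
As a quick sanity check, the saved files can be inspected with numpy. This is a minimal sketch assuming only the standard numpy API; the array key names inside each file are not documented here and depend on how the buffer was saved:

```python
# Minimal sketch: list the arrays stored in each saved .npz file.
# Only numpy's public API is assumed; the key names are whatever the
# buffer-saving script wrote, not specified in this README.
import glob

import numpy as np

for path in sorted(glob.glob("data/offline_data/*.npz")):
    with np.load(path, allow_pickle=True) as data:
        # print each array name with its shape, e.g. observations/actions
        print(path, {k: data[k].shape for k in data.files})
```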

### Online Training and Testing
0. We use [ray](https://github.com/ray-project/ray) for parallel rollout and training (a minimal sketch of the pattern follows this list). The training scripts may need adjustment for your local machine; see ```config.py``` for notes.
1. Train online: ```bash ./experiments/scripts/train_online_visdom.sh td3_critic_aux_policy_aux.yaml DDPG```. Use visdom and tensorboard to monitor progress.
2. Test on YCB objects: ```bash ./experiments/scripts/test_ycb.sh demo_model```. Replace demo_model with your trained model. Logs and videos are saved to ```output_misc```.
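
For orientation, the parallel-rollout pattern looks roughly like the sketch below. This is not the repository's actual trainer code (see ```core/trainer``` for that); `RolloutWorker` and its method are hypothetical stand-ins, and the ray resource settings must match the local machine as noted in ```config.py```:

```python
# Hypothetical sketch of the ray actor pattern for parallel rollouts;
# the real setup lives in core/trainer and differs in detail.
import ray

ray.init()  # configure num_cpus/num_gpus for the local machine

@ray.remote
class RolloutWorker:
    """Hypothetical worker that collects episodes in its own process."""

    def rollout(self):
        # run one episode with the current policy and return transitions
        return []

workers = [RolloutWorker.remote() for _ in range(4)]
# gather one batch of episodes from all workers in parallel
episodes = ray.get([w.rollout.remote() for w in workers])
```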

### Note
0. Check out ```core/test_realworld_ros_final.py``` for an example of real-world usage (a hedged sketch of the loop follows this list).
1. Related works: [OMG](https://github.com/liruiw/OMG-Planner), [ACRONYM](https://github.com/NVlabs/acronym), [6DGraspNet](https://github.com/NVlabs/6dof-graspnet), [6DGraspNet-Pytorch](https://github.com/jsll/pytorch_6dof-graspnet), [ContactGraspNet](https://github.com/NVlabs/contact_graspnet).
2. Please use the GitHub issue tracker to report bugs. For other questions, contact [Lirui Wang](mailto:[email protected]).
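
The real-world script is ROS-based. The skeleton below only illustrates the shape of that loop; the topic name and the policy call are assumptions, so refer to ```core/test_realworld_ros_final.py``` for the actual interface:

```python
# Hedged skeleton of a real-world loop: subscribe to a depth camera's
# point cloud and query the trained policy per frame. The topic name and
# the policy hookup are assumptions; see core/test_realworld_ros_final.py.
import rospy
from sensor_msgs.msg import PointCloud2

def on_point_cloud(msg):
    # convert msg to an (N, 3) point array and feed the trained agent here
    pass

rospy.init_node("gaddpg_real_world")
rospy.Subscriber("/camera/depth/points", PointCloud2, on_point_cloud)  # assumed topic
rospy.spin()
```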

### File Structure
```
├── ...
├── GADDPG
│   ├── data                    # training data
│   │   ├── grasps              # grasps from the ACRONYM dataset
│   │   ├── objects             # object meshes, sdf, urdf, etc.
│   │   ├── robots              # robot meshes, urdf, etc.
│   │   └── gaddpg_scenes       # test scenes
│   ├── env                     # environment-related code
│   │   ├── panda_scene         # environment and task
│   │   └── panda_gripper_hand_camera  # Franka Panda with gripper and camera
│   ├── OMG                     # expert planner submodule
│   ├── experiments             # experiment scripts
│   │   ├── config              # hyperparameters for training, testing and environment
│   │   ├── scripts             # main running scripts
│   │   ├── model_spec          # network architecture specs
│   │   ├── cfgs                # experiment configs and hyperparameters
│   │   └── object_index        # object indexes
│   ├── core                    # agents and learning
│   │   ├── train_online        # online training
│   │   ├── train_test_offline  # testing and offline training
│   │   ├── network             # network architectures
│   │   ├── agent               # main agent code
│   │   ├── replay_memory       # replay buffer
│   │   ├── trainer             # ray-related training setup
│   │   └── ...
│   ├── real_world              # real-world experiment scripts
│   ├── output                  # trained models
│   ├── output_misc             # logs and videos
│   └── ...
└── ...
```

### Citation
If you find GA-DDPG useful in your research, please consider citing:
```
@inproceedings{wang2020goal,
  author    = {Lirui Wang and Yu Xiang and Wei Yang and Arsalan Mousavian and Dieter Fox},
  title     = {Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds},
  booktitle = {arXiv:2010.00824},
  year      = {2020}
}
```

## License
GA-DDPG is licensed under the [MIT License](LICENSE).