- 20 actors with 1 learner.
- TensorFlow implementation of a server-client architecture using Distributed TensorFlow.
- Recurrent Experience Replay in Distributed Reinforcement Learning (R2D2) is implemented on Breakout-Deterministic-v4 as a POMDP (the observation is not provided with 20% probability).
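The observation-dropping trick above can be sketched as a thin environment wrapper. This is a hypothetical illustration, not code from this repo; the class and parameter names are invented, and it assumes a gym-style `reset`/`step` interface:

```python
import numpy as np

class FlickerObservation:
    """POMDP sketch: with probability p, replace the observation with zeros
    so the agent must rely on its recurrent state. Illustrative only."""

    def __init__(self, env, p=0.2, seed=None):
        self.env = env
        self.p = p
        self.rng = np.random.default_rng(seed)

    def _mask(self, obs):
        # Drop the observation with probability p.
        if self.rng.random() < self.p:
            return np.zeros_like(obs)
        return obs

    def reset(self):
        return self._mask(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._mask(obs), reward, done, info
```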
opencv-python
gym[atari]
tensorboardX
tensorflow==1.14.0
- Asynchronous Methods for Deep Reinforcement Learning
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
- A3C: Asynchronous Methods for Deep Reinforcement Learning
python train_a3c.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 19
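In A3C, each actor's critic targets are discounted n-step returns bootstrapped from the value of the last state. A minimal numpy sketch (the function name is illustrative, not from this repo):

```python
import numpy as np

def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted n-step returns, computed backwards from the bootstrap
    value V(s_T) of the state following the last reward."""
    returns = np.zeros(len(rewards))
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```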
- Ape-X: Distributed Prioritized Experience Replay
python train_apex.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 19
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
python train_impala.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 19
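IMPALA corrects for the lag between the actors' behaviour policy and the learner's target policy with V-trace targets. A numpy sketch of the target computation (function and argument names are illustrative; the repo's own implementation may differ):

```python
import numpy as np

def vtrace_targets(rhos, cs, rewards, values, bootstrap_value,
                   gamma=0.99, rho_clip=1.0, c_clip=1.0):
    """V-trace value targets v_s (IMPALA): clipped importance ratios
    rho_t = min(rho_clip, pi/mu) weight the TD errors, and clipped
    trace coefficients c_t propagate them backwards."""
    values = np.asarray(values, dtype=np.float64)
    rhos = np.minimum(rho_clip, np.asarray(rhos, dtype=np.float64))
    cs = np.minimum(c_clip, np.asarray(cs, dtype=np.float64))
    next_values = np.append(values[1:], bootstrap_value)
    deltas = rhos * (np.asarray(rewards) + gamma * next_values - values)
    # Backward recursion: (v_s - V(x_s)) = delta_s + gamma*c_s*(v_{s+1} - V(x_{s+1}))
    acc = 0.0
    vs_minus_v = np.zeros_like(values)
    for t in reversed(range(len(values))):
        acc = deltas[t] + gamma * cs[t] * acc
        vs_minus_v[t] = acc
    return values + vs_minus_v
```

In the on-policy case (all ratios equal to 1), the targets reduce to ordinary n-step returns, which is a useful sanity check.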
- R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
python train_r2d2.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 39
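R2D2 replays fixed-length sequences rather than single transitions, and assigns each sequence a priority mixing the max and mean absolute TD error over the sequence: p = eta * max|delta| + (1 - eta) * mean|delta|. A one-line sketch (the function name is illustrative):

```python
import numpy as np

def r2d2_sequence_priority(td_errors, eta=0.9):
    """Sequence priority from R2D2: a blend of the max and mean
    absolute TD error so both spiky and uniformly bad sequences rank high."""
    abs_td = np.abs(np.asarray(td_errors, dtype=np.float64))
    return eta * abs_td.max() + (1.0 - eta) * abs_td.mean()
```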
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
- deepmind/scalable_agent
- google-research/seed-rl
- Asynchronous_Advantage_Actor_Critic
- Relational_Deep_Reinforcement_Learning
- Deep Recurrent Q-Learning for Partially Observable MDPs