
Commit 7d975ab

Kshitiz Sharma authored and committed
Directly use ALE Interface rather than Gym
0 parents (initial commit)

12 files changed: +838 −0 lines

.gitignore

Lines changed: 122 additions & 0 deletions
# Custom
*.bin
chkpnt-*
models/
snapshot/
chkpnts/
snapshot.lock
*.npy
test.py
pickle*
memory_*
logs/
results/
data/
model.*
events.*

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

*.pb
*.tar.gz

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

README.md

Lines changed: 72 additions & 0 deletions
# Human-Level Control through Deep Reinforcement Learning

A TensorFlow implementation of [Human-Level Control through Deep Reinforcement Learning](http://home.uchicago.edu/~arij/journalclub/papers/2015_Mnih_et_al.pdf).

This implementation contains:

1. DQN (Deep Q-Network) and DDQN (Double Deep Q-Network)
2. Experience replay memory
    - to reduce the correlations between consecutive updates
3. A target network for Q-learning targets that is held fixed between update intervals
    - to reduce the correlations between target and predicted Q-values
4. Image cropping and explicit frame skipping, as in the original paper
5. Support for both Atari and Classic Control environments from OpenAI gym

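The difference between DQN and DDQN (items 1 and 3) comes down to how the bootstrap target is computed from the online and the periodically-frozen target network. A minimal NumPy sketch, purely illustrative (the function and argument names are not from this repo):

```python
import numpy as np

def td_targets(rewards, done, q_next_online, q_next_target,
               gamma=0.99, double=False):
    """Bootstrap targets for a minibatch of transitions.

    q_next_online / q_next_target: (batch, n_actions) Q-values of the
    next states under the online network and the frozen target network.
    """
    if double:
        # DDQN: the online network selects the action,
        # the target network evaluates it.
        best = q_next_online.argmax(axis=1)
        q_eval = q_next_target[np.arange(len(best)), best]
    else:
        # DQN: the target network both selects and evaluates the action.
        q_eval = q_next_target.max(axis=1)
    # No bootstrapping past terminal states (done == 1).
    return rewards + gamma * (1.0 - done) * q_eval
```

Decoupling action selection from evaluation is what reduces the overestimation bias that plain DQN suffers from.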

## Requirements

- Python 2.7 or Python 3.3+
- PyYAML
- [gym](https://github.com/openai/gym)
- [ALEInterface](https://github.com/mgbellemare/Arcade-Learning-Environment)
- [OpenCV2](http://opencv.org/)
- [TensorFlow](https://github.com/tensorflow/tensorflow)
- TensorBoard (bundled with TensorFlow)

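OpenCV is listed because frames are cropped and downscaled before training (feature 4 above). A rough sketch of that preprocessing, written in plain NumPy here to stay self-contained; the repo's actual resize method and crop offsets may differ:

```python
import numpy as np

def resize_nn(img, h, w):
    """Nearest-neighbour resize via index selection (stand-in for cv2.resize)."""
    ys = np.arange(h) * img.shape[0] // h
    xs = np.arange(w) * img.shape[1] // w
    return img[ys][:, xs]

def preprocess(frame, out=84):
    """210x160 RGB Atari frame -> 84x84 grayscale patch (paper-style sketch)."""
    gray = frame @ np.array([0.299, 0.587, 0.114])  # RGB -> luminance
    small = resize_nn(gray, 110, out)               # downscale to 110x84
    top = 18                                        # crop roughly to the playing area
    return small[top:top + out, :].astype(np.uint8)
```

Frame skipping then repeats each chosen action for several emulator steps, so the network only sees every k-th preprocessed frame.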
## Usage

1. First, install the prerequisites:

        $ pip install gym[all]

2. Set up the Arcade-Learning-Environment (ALE):
    - Install ALE from https://github.com/mgbellemare/Arcade-Learning-Environment
    - Download ROM bin files from https://github.com/openai/atari-py/tree/master/atari_py/atari_roms, e.g.:

            $ wget https://github.com/openai/atari-py/raw/master/atari_py/atari_roms/breakout.bin

3. To train a model for Breakout (or any other Atari game), edit cfg/Atari.yml as required and run:

        $ python main.py --rom <ROM bin file>

4. To test a trained model, run:

        $ python main.py --rom <ROM bin file> --mode test

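During training, transitions observed from ALE are stored in the experience replay memory (feature 2) and sampled uniformly for minibatch updates. A minimal ring-buffer sketch; class and attribute names are illustrative, not the repo's:

```python
import numpy as np

class ReplayMemory:
    """Minimal ring-buffer replay memory (illustrative sketch)."""

    def __init__(self, capacity, state_shape):
        shape = (capacity,) + tuple(state_shape)
        self.states = np.zeros(shape, dtype=np.uint8)
        self.actions = np.zeros(capacity, dtype=np.int64)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.next_states = np.zeros(shape, dtype=np.uint8)
        self.done = np.zeros(capacity, dtype=np.float32)
        self.capacity = capacity
        self.pos = 0    # next slot to write
        self.size = 0   # number of stored transitions

    def add(self, s, a, r, s2, d):
        i = self.pos
        self.states[i] = s
        self.actions[i] = a
        self.rewards[i] = r
        self.next_states[i] = s2
        self.done[i] = d
        self.pos = (self.pos + 1) % self.capacity  # overwrite oldest when full
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive frames.
        idx = np.random.randint(0, self.size, size=batch_size)
        return (self.states[idx], self.actions[idx], self.rewards[idx],
                self.next_states[idx], self.done[idx])
```

Storing states as uint8 and converting to float only at sample time keeps the memory footprint of a million-frame buffer manageable.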

## Results


## TODOs

- [x] Implement DQN
- [x] Implement DDQN
- [ ] Adaptive Exploration Rates
- [ ] Implement DRQN
- [ ] Prioritized Experience Replay
- [ ] Add Detailed Results
- [ ] Dueling Network Architectures


## References

- [DQN TensorFlow implementation by carpedm20](https://github.com/carpedm20/deep-rl-tensorflow)
- [DQN TensorFlow implementation by devsisters](https://github.com/devsisters/DQN-tensorflow)
- [Code for Human-level control through deep reinforcement learning](https://sites.google.com/a/deepmind.com/dqn/)


## License

MIT License.
