The Customized Grid-World environment and actions
environment.py : Defines the customized Grid-World, currently configured with a 20x20 pixel window.
Expert dataset 1,2 : Example expert datasets, serialized with the pickle module.
expert_generator.py : Use this file to create expert data.
main.py : Run this file to start the program.
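As a rough illustration of how shortest-path expert data could be produced and stored with the pickle module, here is a minimal sketch. It is not the code in expert_generator.py: the grid size, the action encoding in `ACTIONS`, the `(state, action)` trajectory format, and the helper names `shortest_path_demo`, `save_demos`, and `load_demos` are all assumptions made for this example.

```python
import pickle
from collections import deque

# Hypothetical action encoding (an assumption, not taken from environment.py).
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right


def shortest_path_demo(size, start, goal):
    """Return (state, action) pairs along one BFS shortest path on an open grid."""
    parent = {start: None}  # state -> (previous state, action taken to reach it)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            break
        for action, (dr, dc) in ACTIONS.items():
            nxt = (state[0] + dr, state[1] + dc)
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    # Walk back from the goal to recover the (state, action) sequence.
    demo = []
    state = goal
    while parent[state] is not None:
        prev, action = parent[state]
        demo.append((prev, action))
        state = prev
    demo.reverse()
    return demo


def save_demos(demos, path):
    """Serialize a list of demonstrations to a pickle file."""
    with open(path, "wb") as f:
        pickle.dump(demos, f)


def load_demos(path):
    """Load demonstrations previously written by save_demos."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A call such as `save_demos([shortest_path_demo(5, (0, 0), (4, 4))], "expert_demo.pkl")` would write one demonstration to a hypothetical file name. Only load pickle files from sources you trust, since unpickling can execute arbitrary code.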
Expert data is needed to find approximately 50 shortest paths.
The screenshots below were captured from runs of an older version of the code.
150 episodes
500 episodes
Expert data is needed to find approximately 200 shortest paths.
300 episodes
500 episodes
700 episodes
900 episodes
1000 episodes
Expert data is needed to find approximately 400-500 shortest paths.
700 episodes
900 episodes
1000 episodes
- [1] J. Ho, et al., "Generative Adversarial Imitation Learning", NIPS 2016.
- [2] X. B. Peng, et al., "Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow", ICLR 2019.
RL-korea : Dongmin Lee, et al.
Jungseob Lee / js-lee-AI / [email protected]