In this project, we perform navigation in a Door & Key environment using Dynamic Programming. The agent operates in grid environments where it must reach the goal position, potentially unlocking doors by first collecting keys. The environments include known and random maps, with multiple doors and key locations.
We compute optimal policies with backward dynamic programming and extract from them the action sequences that give a minimum-cost path to the goal under the given constraints and costs.
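
To make the backward recursion concrete, here is a minimal sketch on a toy problem. It is not the code in `DP.py`: the 1-D corridor, the state layout `(pos, has_key, door_open)`, the four action names, and the unit cost per action are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical toy problem (not the project's environment): a 1-D corridor
# with a key in cell 1, a locked door in cell 3, and the goal in cell 4.
N_CELLS, KEY_CELL, DOOR_CELL, GOAL_CELL = 5, 1, 3, 4
ACTIONS = ("left", "right", "pickup", "unlock")
INF = float("inf")


def step(state, action):
    """Deterministic motion model: return the next state."""
    pos, has_key, door_open = state
    if action == "left":
        return (max(pos - 1, 0), has_key, door_open)
    if action == "right":
        nxt = min(pos + 1, N_CELLS - 1)
        if nxt == DOOR_CELL and not door_open:
            return state  # blocked by the locked door
        return (nxt, has_key, door_open)
    if action == "pickup" and pos == KEY_CELL:
        return (pos, True, door_open)
    if action == "unlock" and pos == DOOR_CELL - 1 and has_key:
        return (pos, has_key, True)
    return state  # action has no effect here


def backward_dp(horizon):
    """Run the backward recursion; return the cost-to-go at t=0 and the policy."""
    states = list(product(range(N_CELLS), (False, True), (False, True)))
    # Terminal cost: zero at the goal, infinite everywhere else.
    V = {s: (0.0 if s[0] == GOAL_CELL else INF) for s in states}
    policy = []  # policy[t][state] -> best action at time t
    for _ in range(horizon):
        V_new, pi = {}, {}
        for s in states:
            if s[0] == GOAL_CELL:
                V_new[s], pi[s] = 0.0, None  # goal is absorbing and free
                continue
            # Assumed stage cost: 1 per action.
            best = min(ACTIONS, key=lambda a: 1.0 + V[step(s, a)])
            V_new[s], pi[s] = 1.0 + V[step(s, best)], best
        policy.insert(0, pi)  # built backward, so prepend
        V = V_new
    return V, policy


def rollout(start, policy):
    """Follow the time-indexed policy forward from the start state."""
    state, actions = start, []
    for pi in policy:
        if state[0] == GOAL_CELL:
            break
        actions.append(pi[state])
        state = step(state, pi[state])
    return actions


V, policy = backward_dp(horizon=10)
plan = rollout((0, False, False), policy)
print(plan)
```

The recursion starts from the terminal cost and works back to the first time step, so each stage only needs the cost-to-go of the stage after it. The key and door flags are part of the state, which is what lets the policy decide to detour for the key before heading to the door.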
Figure 1. Agent navigating in a known environment.

Figure 2. Agent navigating in a random environment.
To set up the required environment and install the necessary packages, execute the following commands:

    conda create -n <ENV_NAME> python=3.10
    conda activate <ENV_NAME>
    pip install -r requirements.txt

The project is organized as follows:

    .
    ├── envs
    │   ├── known_envs
    │   └── random_envs
    ├── img
    ├── utils.py
    ├── DP.py
    └── main.py

All tests (known and random environments) run from a single file:

    python3 main.py

- Known and random environments are executed sequentially.
- Results (trajectories) are saved under `./results/`.

