docs: update readme with continuous system details #1050
base: develop
Love it 🔥
Lekker 🎉
The base branch was changed.
Some very minor things
- 🥑 **Implementations of MARL algorithms**: Implementations of multi-agent PPO systems that follow both the Centralised Training with Decentralised Execution (CTDE) and Decentralised Training with Decentralised Execution (DTDE) MARL paradigms.
- 🍬 **Environment Wrappers**: Example wrappers for mapping Jumanji environments to an environment that is compatible with Mava. At the moment, we support [Robotic Warehouse][jumanji_rware] and [Level-Based Foraging][jumanji_lbf] with plans to support more environments soon. We have also recently added support for the SMAX environment from [JaxMARL][jaxmarl].
- 🥑 **Implementations of MARL algorithms**: Implementations of multi-agent PPO systems that follow both the Centralised Training with Decentralised Execution (CTDE) and Decentralised Training with Decentralised Execution (DTDE) MARL paradigms with support for continuous and discrete action space environments.
- 🍬 **Environment Wrappers**: Example wrappers for mapping Jumanji environments to an environment that is compatible with Mava. At the moment, we support [Robotic Warehouse][jumanji_rware] and [Level-Based Foraging][jumanji_lbf] with plans to support more environments soon. We have also recently added support for the SMAX and MaBrax environments from [JaxMARL][jaxmarl].
While we're here, can we add connector, cleaner, matrax, CVRP and gigastep? Maybe this should become a table of the envs we support 🤔
➕ Something like this with action type:
This table outlines the environments supported by our library, etc.
| Library/Site | Environment Supported | Action Type |
|--------------|-----------------------|-------------|
| Jumanji | LBF | Discrete |
| | RWARE | Discrete |
| | CVRP | Discrete |
| JaxMARL | MaBrax | Continuous |
| | SMAX | Continuous |
If we mention the different environments, I think we need to add a note that we are still verifying the performance of Mava on them, because we can only confirm that everything works properly once we have run a full benchmark 🙏
Update: Since the connector env uses a CNN, we could add another column listing the corresponding network for each env, and mention that for the action space the corresponding ActionHead in the network.yaml file needs to be checked as well:
| Library/Site | Environment Supported | Action Space | Network |
|--------------|-----------------------|--------------|---------|
| Jumanji | LBF | Discrete | Default (MLP) |
| | RWARE | Discrete | Default |
| | Connector | Discrete | CNN |
| JaxMARL | MaBrax | Continuous | Default |
| | SMAX | Continuous | Default |
```
python mava/systems/ppo/ff_ippo.py env=rware env/scenario=tiny-4ag
```

To toggle between continuous and discrete systems, simply select the continuous action space network head. To run the same system on an `MaBrax` environment make the follow config updates from the terminal:
Just to make it clear that MaBrax is continuous
To toggle between continuous and discrete systems, simply select the continuous action space network head. To run the same system on an `MaBrax` environment make the follow config updates from the terminal:
To toggle between continuous and discrete systems, simply select the continuous action space network head. To run the same system on a continuous environment, like `MaBrax`, make the follow config updates from the terminal:
To toggle between continuous and discrete systems, simply select the continuous action space network head. To run the same system on an `MaBrax` environment make the follow config updates from the terminal:
To toggle between continuous and discrete systems, simply select the continuous action space network head. To run the same system on a `MaBrax` environment make the following config updates from the terminal:
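As a rough illustration of the sentence under review (not Mava's actual code): toggling between discrete and continuous systems comes down to which action distribution the final layer of the network parameterises. The sketch below uses only the Python standard library. The names `discrete_head`, `continuous_head`, `HEADS` and `select_head` are hypothetical, and the tanh squashing into (-1, 1) is an assumption about how a continuous head might bound its actions.

```python
import math
import random


def discrete_head(logits, rng):
    """Sample one action index from a categorical distribution over logits."""
    top = max(logits)
    # Subtract the max logit before exponentiating for numerical stability.
    weights = [math.exp(logit - top) for logit in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def continuous_head(means, log_stds, rng):
    """Sample a real-valued action vector from a diagonal Gaussian.

    Each component is squashed with tanh (an assumption, see above)
    so that the resulting action stays bounded.
    """
    return [
        math.tanh(rng.gauss(mean, math.exp(log_std)))
        for mean, log_std in zip(means, log_stds)
    ]


# Hypothetical registry: the "toggle" is just a lookup on the action space type.
HEADS = {"discrete": discrete_head, "continuous": continuous_head}


def select_head(action_space):
    """Return the sampling function for the given action space type."""
    return HEADS[action_space]


rng = random.Random(0)
discrete_action = select_head("discrete")([0.1, 2.0, -1.0], rng)
continuous_action = select_head("continuous")([0.0, 0.5], [-1.0, -1.0], rng)
```

The rest of the system (rollouts, loss, updates) can stay the same in such a design; only the head that turns network outputs into actions changes.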
@@ -194,10 +200,8 @@ Please read our [contributing docs](docs/CONTRIBUTING.md) for details on how to
We plan to iteratively expand Mava in the following increments:

- 🌴 Support for more environments.
- 🔁 More robust recurrent systems.
Can we also close the issue Edan raised around this?
➕ Sasha's suggestions
@@ -142,7 +142,7 @@ Furthermore, we illustrate the speed of Mava by showing the steps per second as

## Code Philosophy 🧘

The current code in Mava is adapted from [PureJaxRL][purejaxrl] which provides high-quality single-file implementations with research-friendly features. In turn, PureJaxRL is inspired by the code philosophy from [CleanRL][cleanrl]. Along this vein of easy-to-use and understandable RL codebases, Mava is not designed to be a modular library and is not meant to be imported. Our repository focuses on simplicity and clarity in its implementations while utilising the advantages offered by JAX such as `pmap` and `vmap`, making it an excellent resource for researchers and practitioners to build upon.
The current code in Mava is adapted from [PureJaxRL][purejaxrl] which provides high-quality single-file implementations with research-friendly features. In turn, PureJaxRL is inspired by the code philosophy from [CleanRL][cleanrl]. Along this vein of easy-to-use and understandable RL codebases, Mava is not designed to be a modular library and is not meant to be imported. Our repository focuses on simplicity and clarity in its implementations while utilising the advantages offered by JAX such as `pmap` and `vmap`, making it an excellent resource for researchers and practitioners to build upon. A notable difference between Mava and other single-file libraries is that Mava makes use of abstraction where relevant. In particular, this is done for network and environment creation.
Maybe network -> neural network, or something clearer if it exists
What?
Update the readme to mention that we now have support for continuous action space environments.