
Refactor reward logging to use running totals #124

Open · wants to merge 4 commits into develop
Conversation

nathandsouza10

  • Removed list-based reward accumulation
  • Implemented running total/count for individual and group rewards
  • Streamlined TensorBoard logging to use calculated averages (see the sketch below)

@jaredvann
Contributor

@nathandsouza10 Hi, thank you for taking the time to create this PR. We are not completely clear on what the overall benefits of these changes are. Is the goal optimisation of memory usage from not needing to store lists of reward values?

@nathandsouza10
Author

> @nathandsouza10 Hi, thank you for taking the time to create this PR. We are not completely clear on what the overall benefits of these changes are. Is the goal optimisation of memory usage from not needing to store lists of reward values?

Hi, thank you for considering my PR. Yes, the primary goal of these changes, small as they are, is to optimize memory usage. By replacing the stored lists of reward values with a running total and count for each agent, we significantly reduce the memory overhead. This is particularly beneficial in scenarios with many agents or long-running environments.

@jaredvann
Contributor

jaredvann commented Dec 11, 2023

Thanks! Please see one small comment below.

To be able to contribute to this repo and merge this PR you will need to first sign the JPMorgan Chase Contributor License Agreement (CLA). This can be found at https://github.com/jpmorganchase/.github/blob/main/jpmc-cla-20230406.md. Instructions can be found at the bottom of the file.

Please let me know if you have any questions.

@nathandsouza10
Author

> Please let me know if you have any questions.

Thank you for guiding me through the contribution process. I have already submitted the JPMorgan Chase Contributor License Agreement (CLA); it was sent to [email protected] on Friday, December 5th. Please let me know if there is anything else I need to do to finalize my contribution. I’m looking forward to actively participating in the project!

@leoardon
Contributor

@nathandsouza10 Thanks for raising the request to contribute, it might need some time to be reviewed and accepted. We will follow up on this.
Could you please make the change mentioned and make sure it works as expected?

Following review feedback, the explicit initialisation of logged_rewards[agent_id] has been removed.
@leoardon
Contributor

@nathandsouza10 Good news, I see your CLA request has gone through and you're now an official contributor! 🎉 ✨

Could you also add unit tests to make sure your changes work as expected, please? I see, for example, that logged_rewards is still initialized as a list in the __init__ method (l.89).
