[Bug]: Episode start flag is never set for off policy algorithms #2011

josndan · 2024-09-20T22:04:47Z

🐛 Bug

In _sample_action of the OffPolicyAlgorithm class, self.predict is called, but the episode_start flag is never set for any off-policy algorithm.

To Reproduce

No response

Relevant log output / Error message

No response

System Info

No response

Checklist

My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
I have checked that there is no similar issue in the repo
I have read the documentation
I have provided a minimal and working example to reproduce the bug
I've used the markdown code blocks for both code and stack traces.

araffin · 2024-09-22T18:45:00Z

Hello,
that's correct, because currently only RecurrentPPO makes use of states (LSTM states) and episode starts (to reset the states).
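To illustrate the point above, here is a minimal, self-contained sketch of why an episode start flag only matters for a policy that carries state between steps. It is not Stable-Baselines3 code: the classes FeedForwardPolicy and RecurrentPolicy are hypothetical, the "hidden state" is just a running sum per environment, and only the argument names (state, episode_start) are borrowed from the predict signature.

```python
from typing import List, Optional, Tuple


class FeedForwardPolicy:
    """Hypothetical stateless policy: state and episode_start are ignored."""

    def predict(
        self,
        obs: List[float],
        state: Optional[List[float]] = None,
        episode_start: Optional[List[bool]] = None,
    ) -> Tuple[List[float], None]:
        # The action depends only on the current observation.
        return [2.0 * o for o in obs], None


class RecurrentPolicy:
    """Hypothetical recurrent policy: the state is a running sum per env."""

    def predict(
        self,
        obs: List[float],
        state: Optional[List[float]] = None,
        episode_start: Optional[List[bool]] = None,
    ) -> Tuple[List[float], List[float]]:
        n_envs = len(obs)
        if state is None:
            state = [0.0] * n_envs
        if episode_start is None:
            episode_start = [False] * n_envs
        new_state = []
        for o, s, start in zip(obs, state, episode_start):
            if start:
                s = 0.0  # reset the carried state at an episode boundary
            new_state.append(s + o)
        # The action here is simply a copy of the updated state.
        return list(new_state), new_state


obs = [1.0, 1.0]  # one observation per environment (two envs)

# Stateless policy: the flag has no effect on the action.
ff = FeedForwardPolicy()
ff_plain, _ = ff.predict(obs)
ff_flagged, _ = ff.predict(obs, episode_start=[True, False])

# Recurrent policy: the flag resets the state of the second env only.
rec = RecurrentPolicy()
_, rec_state = rec.predict(obs)
rec_action, rec_state = rec.predict(
    obs, state=rec_state, episode_start=[False, True]
)
```

In this sketch the feed-forward policy returns the same action with or without the flag, while the recurrent policy continues accumulating for the first environment and restarts from zero for the second.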

josndan added the bug label Sep 20, 2024

araffin added the question label and removed the bug label Sep 21, 2024

araffin closed this as completed Nov 7, 2024
