Describe the bug
The order of execution between FixedUpdate and OnEpisodeBegin differs depending on how the previous episode ended and how the current one started.
In the first episode after the game starts, FixedUpdate with StepCount == 0 is called before OnEpisodeBegin, which causes an incorrect reward and possible errors due to incomplete initialization.
This would be less of an issue if it were consistent across episodes, but in an episode that follows one ended by reaching MaxStep, the order is reversed: FixedUpdate is called after OnEpisodeBegin.
I have not tested what happens with EndEpisode, but the order might be different there as well.
To Reproduce
1. Open the CrawlerAgent script
2. Add the changes to the script (described below)
3. Set MaxStep to a small number, for example 5
4. Disable all copies of the Agent except one
5. Enable "Pause" and then click "Play"
6. Click the "Step" button a couple of times, until the second episode starts
7. See in the logs: the order is not consistent, and OnEpisodeBegin already reports a reward accumulated in FixedUpdate
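Changes to the Crawler environment
public override void OnEpisodeBegin()
{
    // Added for this report: log the step count and cumulative reward when the episode begins.
    Debug.Log($"OnEpisodeBegin: step={StepCount}, reward={GetCumulativeReward()}");
    // (rest of the existing method body not shown)
}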
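The report only shows the logging added to OnEpisodeBegin. To compare the two callbacks side by side, a matching log line in FixedUpdate is needed as well. The following is a minimal sketch of such a probe, not the reporter's actual change: the class name `EpisodeOrderProbe` and the field `m_EpisodeStarted` are hypothetical, and the guard flag is just one possible way to skip reward logic until the first OnEpisodeBegin has run.

```csharp
using Unity.MLAgents;
using UnityEngine;

// Hypothetical probe agent (not part of ML-Agents or the Crawler example).
// It logs both callbacks so their relative order can be read from the console.
public class EpisodeOrderProbe : Agent
{
    // Hypothetical guard: true once OnEpisodeBegin has run at least once.
    bool m_EpisodeStarted;

    public override void OnEpisodeBegin()
    {
        m_EpisodeStarted = true;
        Debug.Log($"OnEpisodeBegin: step={StepCount}, reward={GetCumulativeReward()}");
    }

    void FixedUpdate()
    {
        Debug.Log($"FixedUpdate: step={StepCount}, reward={GetCumulativeReward()}, started={m_EpisodeStarted}");

        // Assumed workaround: skip per-step reward logic until the episode
        // has been initialized, so nothing is rewarded from stale state.
        if (!m_EpisodeStarted)
        {
            return;
        }

        // Per-step reward logic (for example AddReward calls) would go here.
    }
}
```

With this in place, a FixedUpdate line printed with `started=False` would indicate that the physics step ran before the first OnEpisodeBegin.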
Console logs / stack traces / screenshots
I waited a couple of seconds between each "Step" click, so that you can see which operations happened in the same frame.
In the first case, OnEpisodeBegin is called after the FixedUpdate with StepCount=0 (not before!)
In the second case, it is called immediately after StepCount=4 (not before StepCount=0)
Environment (please complete the following information):
Unity Version: Unity 6000.0.26f1
OS + version: Windows 11
ML-Agents version: release_22 / 3.0.0
Torch version: 2.2.2+cu121
Environment: Crawler