-
Notifications
You must be signed in to change notification settings - Fork 943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Step index of DataCollector Dataframes are not synchronized between agent and model variables #1768
Comments
It's definitely confusing for the use case when you want to merge the 2 DF's. I don't have a good design solution for this. For now you can just do But even without merging, people could confuse the index of the model DF as step number. |
Thats what I ended up doing. I see that point that there is no "easy" fix for this. To lower the confusion, what about adjusting the docstring of "get_model_vars_dataframe"? Currently it states: Maybe changing it to something like: |
@jabrell we are marking this to be resolved before the next patch release. |
hello @rht, I was looking into this issue as preparation for gsoc24. Do you think it would be a good idea to change Also i found some TODO while going through the code, given the extensive discussions going on about revamping datacollector, is it a good idea to work on the TODOs? Because if you are planning to replace the datacollector with new code anyway then working on it wouldn't make sense. |
SGTM. There shouldn't be much performance hit due to switching to dictionary. It's fine to work on the TODO's for components that are about to be deprecated, for exercise/educational purpose. For example, |
Thank you, I'll get to work right away. |
Describe the bug
Using the DataCollector indexes for the model step of the DataFrames coming from .get_model_vars_dataframe() and .get_agent_vars_dataframe() are different if the DataCollector is called after the schedule.step(). I.e., if the model's step function looks like:
The "Step" index of the dataframe returned by .get_agent_vars_dataframe() will start with 1. In contrast, the index of the dataframe from get_model_vars_dataframe() will start with 0. This makes it diificult to merge frames in result processing.
Expected behavior
The indexes referring to the model step should be consistent. In the setting above, I would expect all step index starting at 1 (since the model has made one step before the report).
To Reproduce
The behavior can be reproduce by altering the step function to the code above in the MoneyModel in the Collecting data section of the introductory tutorial
Additional context
I am a newbie to Mesa, so I can only guess on the reasons. My first guess would be that the different index are caused by the different treatment of model and agent_reporters in the collect function of the DataCollector. The model_reporter seems to be list-based and the index is implicitly derived leading to the index starting at zero independently whether the collector is called before or after execution of the step. In contrast, the agent_reporters use the Scheduelers step property which seems to be advanced by one if the collector is called after execution of the step.
The text was updated successfully, but these errors were encountered: