-
Notifications
You must be signed in to change notification settings - Fork 875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce core dependencies, split in optional dependencies #2265
Conversation
Performance benchmarks:
|
Yes, I am strongly in favor of a simple high-level API for the core functionality. Why should a novice user bother with the internal module structure of MESA?
Again, yes. The main issue is what is part of the default set of things that users might use for 80% of the models?
As far as I know, during coding dependencies are not checked so while running an import error will automatically be raised.
I'll try to look at this asap.
This PR is getting a bit big in terms of affected files. Might it be possible to split it into two: update toml, update docs? |
Thanks for the review! On point 2 and 3, the biggest issue arises from the updated __all__ = [
"Model",
"Agent",
"time",
"space",
- "DataCollector",
- "batch_run",
- "experimental",
] DataCollector, batch_run and experimental are removed because they contain dependencies that are note installed by default anymore. Not everyone might need a DataCollector, batch_run or experimental feature, so that's the idea. The thing is, ideally you don't import those three if they are not used. As far as I have found out, you have two options:
|
I quickly checked the toml file. It seems sensible not to always install, e.g., networks or Solara. I am less sure about pandas. In my view this is such a standard default library, why not always include this. TQDM is very lightweight as far as I know, so why not always also install this? My personal view on this, in general, is that this fine sliceing of dependencies is a bit odd given Python's overall batteries included philosophy. In particular, if dependencies are stable and readily available (so not requiring conda forge or so), I rarely see the point of not simply installing everything. |
I think there is a middle way feasible here. Don't underestimate how heavy pandas is though. One example is that they currently don't have Python 3.13 wheels (months after the betas, and two weeks before stable release), and building (in CI) takes a full 4 minutes. But I agree there not that many scenarios you don't need it. Including |
I'm not sure of the performance impact, but one could import |
I think we can do this quite elegantly: class DataCollector:
....
@cached_property
def _pd(self):
"""Lazy-loaded pandas module using cached_property."""
try:
import pandas as pd
return pd
except ImportError as e:
raise ImportError(
"Pandas is required for this operation. Please install it using 'pip install pandas'."
) from e
def get_model_vars_dataframe(self):
...
return self._pd.DataFrame(self.model_vars) Also, in the datacollector, Pandas is generally only used on methods that are ran after the model is done, like However, I'm also still considering making |
I was stating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EwoutH This is awesome!
I had one question on the import and thinking through latest and stable readthedocs pages. Not sure the best answer, except may be calling explicitly that there is a difference.
My thought on the discussion is and to @quaquel's comments of "batteries included" can you set it up so all
is installed by default (e.g. if no specification install all or rec) this is good for new users but for more advanced users they can specify specific dependencies they want. (e.g. which sub categories they want)
Thanks for reviewing! I answered your comment on the docs. I will try to update the introduction tutorial, but we probably have to rewrite large parts of it. I did a small update in #2315, but removing the schedulers requires completely rewriting the Agent activation parts. |
Ha just saw #2315 and went right into the code so didn't see everyone else comments, as such I updated my comment. Thanks for all this work, this is such great progress! |
Think about what to do with that
Okay, I vastly simplified this PR, by keeping Pandas and tqdm core dependencies. Now only the pyproject.toml and Readme are updated. Please review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry one last question about cookiecutter. But yes this looks fine and clear now.
Its indeed much better now, although pandas is really a mixed bag. Previously I would have said everyone who uses mesa should probably also use pandas anyway, but nowadays with polars, ibis and others this is much less true. And, as far as I remember, we only use pandas in one place, to convert the data collected data into a dataframe. I don't think that is really worth the dependency. But lets maybe do that in a separate PR, to move this one forward. |
I am inclined, as part of the broader overhoal of data collection, to no longer include any to_dataframe style helper method and leave this completely to the user. For example, as long as we store agent level data in a dict of dicts ( |
Fully agree
Also agree I'm merging, follow-up PRs can be made as everyone sees fit. |
This PR introduces a more flexible dependency management system for Mesa, allowing users to choose which additional components they need while maintaining core functionality.
Motivation
This change allows for more tailored installations, reducing unnecessary dependencies and potential conflicts, while ensuring core functionality remains intact.
Key Changes
pyproject.toml
to define new optional dependency groups:[network]
: Network-related dependencies[viz]
: Visualization dependencies[rec]
: Recommended dependencies (includes network and viz)[all]
: All dependencies, including developer dependenciesREADME.md
with new installation instructions for optional dependencies.Installation Options
Users can now customize their Mesa installation:
Impact
Points to Review
[network]
,[viz]
,[rec]
,[all]
) appropriate and sufficient?Closes #626