Reduce core dependencies, split in optional dependencies #2265

EwoutH · 2024-09-01T17:41:24Z

This PR introduces a more flexible dependency management system for Mesa, allowing users to choose which additional components they need while maintaining core functionality.

Motivation

This change allows for more tailored installations, reducing unnecessary dependencies and potential conflicts, while ensuring core functionality remains intact.

Key Changes

Updated pyproject.toml to define new optional dependency groups:
- [network]: Network-related dependencies
- [viz]: Visualization dependencies
- [rec]: Recommended dependencies (includes network and viz)
- [all]: All dependencies, including developer dependencies
- Additional groups for development, examples, and documentation
Updated README.md with new installation instructions for optional dependencies.
Core dependencies now only include essential packages (numpy, pandas, tqdm).

Installation Options

Users can now customize their Mesa installation:

# Basic installation with core dependencies
pip install -U --pre mesa

# Install with recommended dependencies
pip install -U --pre mesa[rec]

# Install with specific additional dependencies
pip install -U --pre mesa[network,viz]

# Install all dependencies (including dev dependencies)
pip install -U --pre mesa[all]

Impact

Existing projects using only core Mesa functionality should not be affected.
Projects using network or visualization features may need to update their installation to include the appropriate optional dependencies.
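
A quick way to see which optional groups an environment already satisfies is to probe for the modules without importing them. This is a minimal sketch: the mapping from group names to module names below is an assumption for illustration (the authoritative lists are in pyproject.toml), and `missing_modules` is a hypothetical helper, not part of Mesa.

```python
from importlib.util import find_spec

# Assumed, illustrative mapping from extras group to top-level module names.
# The real contents of each group are defined in pyproject.toml.
OPTIONAL_GROUPS = {
    "network": ["networkx"],
    "viz": ["solara", "matplotlib"],
}


def missing_modules(group, groups=OPTIONAL_GROUPS):
    """Return the modules of a group that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed,
    # and it does not execute the module's code.
    return [name for name in groups[group] if find_spec(name) is None]


for group in OPTIONAL_GROUPS:
    missing = missing_modules(group)
    status = "ok" if not missing else f"missing: {', '.join(missing)}"
    print(f"[{group}] {status}")
```

If a group reports missing modules, installing the matching extra (for example `pip install -U --pre mesa[network]`) should resolve it.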

Points to Review

Are the chosen dependency groups ([network], [viz], [rec], [all]) appropriate and sufficient?
Is the README clear about the new installation options?
Are there any parts of the documentation or examples that need updating to reflect these changes?

Closes #626

github-actions · 2024-09-20T12:53:44Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔵 +1.1% [+0.0%, +2.1%]	🔵 -0.9% [-1.0%, -0.7%]
BoltzmannWealth	large	🔵 -0.3% [-1.0%, +0.5%]	🟢 -10.5% [-14.6%, -7.2%]
Schelling	small	🔵 -1.5% [-2.0%, -1.0%]	🔵 -2.6% [-3.2%, -2.0%]
Schelling	large	🔵 -2.0% [-2.8%, -1.1%]	🟢 -8.9% [-10.8%, -6.8%]
WolfSheep	small	🔵 -2.3% [-2.7%, -2.0%]	🔵 -1.9% [-2.2%, -1.6%]
WolfSheep	large	🔵 -1.0% [-1.9%, -0.0%]	🔵 -3.6% [-5.1%, -2.4%]
BoidFlockers	small	🟢 -5.3% [-5.7%, -4.9%]	🟢 -4.3% [-5.0%, -3.6%]
BoidFlockers	large	🟢 -4.8% [-5.1%, -4.5%]	🔵 -2.7% [-3.3%, -2.0%]

quaquel · 2024-09-20T13:06:48Z

Do we still want to allow direct imports for model and agents (like from mesa import Agent, Model)?

Yes, I am strongly in favor of a simple high-level API for the core functionality. Why should a novice user bother with the internal module structure of MESA?

Do we still want to allow direct imports like for space and time (like from mesa import RandomActivation, SingleGrid)?

Again, yes. The main issue is what is part of the default set of things that users might use for 80% of the models?

Do we want to allow direct imports for DataCollector and batch_run, if the required dependencies are installed? If so, what do we do if that fails, throw a warning? What if it's the intended behavior for the user?

As far as I know, dependencies are not checked while writing code, so an ImportError will be raised automatically at runtime.

Any suggestions on how the dependency groups in pyproject.toml are structured?

I'll try to look at this asap.

Did I miss any documentation, tutorials, etc.

This PR is getting a bit big in terms of affected files. Might it be possible to split it into two: update toml, update docs?

EwoutH · 2024-09-20T13:31:38Z

Thanks for the review!

On points 2 and 3, the biggest issue arises from the updated __init__.py:

 __all__ = [
     "Model",
     "Agent",
     "time",
     "space",
-    "DataCollector",
-    "batch_run",
-    "experimental",
 ]

DataCollector, batch_run and experimental are removed because they rely on dependencies that are no longer installed by default. Not everyone needs a DataCollector, batch_run or an experimental feature, so that's the idea.

The thing is, ideally you don't import those three if they are not used. As far as I have found out, you have two options:

1. Always attempt to import them
   a. Warn if they can't be imported. But that's weird behavior: why get a warning from a module you might not even need?
   b. Do it silently. This may be confusing: why does it work sometimes and not other times?
2. Try some lazy import evaluation. But that's difficult, since it's running in __init__.py. I haven't found a satisfying solution.
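
One possible approach to the lazy-import option is a module-level `__getattr__` (PEP 562, Python 3.7+), which defers the import until a name is first accessed. The sketch below is not what this PR implements; `make_lazy_getattr`, the `demo_pkg` module and the name table are hypothetical, and a throwaway module object stands in for a package's `__init__.py` so the example runs without Mesa installed.

```python
import importlib
import types


def make_lazy_getattr(package_name, lazy_names):
    """Build a PEP 562 module-level __getattr__ that imports names on first access.

    lazy_names maps a public name to (module path, attribute, install hint).
    """

    def __getattr__(name):
        if name not in lazy_names:
            raise AttributeError(f"module {package_name!r} has no attribute {name!r}")
        module_path, attribute, hint = lazy_names[name]
        try:
            module = importlib.import_module(module_path)
        except ImportError as e:
            # Only raised when the name is actually used, so users who never
            # touch the optional feature see neither a warning nor an error.
            raise ImportError(
                f"{name} needs an optional dependency; try: pip install {hint}"
            ) from e
        return getattr(module, attribute)

    return __getattr__


# Demo: a throwaway module object standing in for a package __init__.py.
demo = types.ModuleType("demo_pkg")
demo.__getattr__ = make_lazy_getattr(
    "demo_pkg",
    {
        "dumps": ("json", "dumps", "demo_pkg"),  # stdlib, always importable
        "missing": ("no_such_module_xyz", "thing", "demo_pkg[extra]"),  # hypothetical
    },
)

print(demo.dumps({"a": 1}))  # json is imported only at this point
```

In a real package the function would simply be defined as `__getattr__` at the top level of `__init__.py`. The trade-off is that names resolved this way are invisible to static analysis unless they are also listed in `__all__` or a type stub.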

quaquel · 2024-09-20T13:37:47Z

I quickly checked the toml file. It seems sensible not to always install, e.g., networks or Solara. I am less sure about pandas. In my view this is such a standard default library, so why not always include it? TQDM is very lightweight as far as I know, so why not always install it as well? My personal view on this, in general, is that this fine slicing of dependencies is a bit odd given Python's overall batteries-included philosophy. In particular, if dependencies are stable and readily available (so not requiring conda-forge or so), I rarely see the point of not simply installing everything.

EwoutH · 2024-09-20T14:08:03Z

I think there is a middle way feasible here. Don't underestimate how heavy pandas is though. One example is that they currently don't have Python 3.13 wheels (months after the betas, and two weeks before the stable release), and building (in CI) takes a full 4 minutes. But I agree there are not that many scenarios where you don't need it.

Including tqdm and pandas in the core could be a way to go, to enable batch_run and DataCollector by default. Curious what the rest thinks about that.

rht · 2024-09-20T14:50:37Z

I'm not sure of the performance impact, but one could import pandas inside of the DataCollector class instead of globally within the datacollection.py file. This way, if an import error happens, you could provide a helpful error message saying that pandas needs to be installed. We need to see how Solara does it with their FigureMatplotlib component (Matplotlib is not a dependency of Solara). Also, Solara has recently separated the heavy server code from their core package. I read somewhere in a thread that it is possibly solara-ui and solara-server, but I couldn't find evidence of this after reading their pyproject.toml.

EwoutH · 2024-09-20T20:14:22Z

I'm not sure of the performance impact, but one could import pandas inside of the DataCollector class instead of globally within the datacollection.py file.

I think we can do this quite elegantly:

from functools import cached_property


class DataCollector:
    ...

    @cached_property
    def _pd(self):
        """Lazily import pandas on first access and cache the module."""
        try:
            import pandas as pd
        except ImportError as e:
            raise ImportError(
                "Pandas is required for this operation. Please install it using 'pip install pandas'."
            ) from e
        return pd

    def get_model_vars_dataframe(self):
        ...
        return self._pd.DataFrame(self.model_vars)

Also, in the datacollector, Pandas is generally only used in methods that are run after the model is done, like get_model_vars_dataframe. The lazy import makes it possible to skip installing Pandas on, say, a supercomputer, and do the analysis afterwards.

However, I'm also still considering making tqdm and pandas required dependencies again.

rht · 2024-09-21T18:46:37Z

I was stating pandas just as an illustration, as it applies to other optional dependencies such as matplotlib. This is how Solara does it for Altair: https://github.com/widgetti/solara/blob/2cf59d9bb40ed7d10d976c9c44c292d61e01ec89/solara/components/figure_altair.py#L24.

docs/tutorials/intro_tutorial.ipynb

tpike3

@EwoutH This is awesome!

I had one question about the import, thinking through the latest and stable readthedocs pages. I'm not sure of the best answer, except maybe calling out explicitly that there is a difference.

My thought on the discussion, and on @quaquel's "batteries included" comment: can you set it up so that everything is installed by default (e.g. if nothing is specified, install all or rec)? This is good for new users, while more advanced users can specify the dependencies they want (e.g. which subcategories).

EwoutH · 2024-09-22T12:15:05Z

Thanks for reviewing! I answered your comment on the docs.

I will try to update the introduction tutorial, but we probably have to rewrite large parts of it. I did a small update in #2315, but removing the schedulers requires completely rewriting the Agent activation parts.

tpike3 · 2024-09-22T12:19:17Z

Thanks for reviewing! I answered your comment on the docs.

I will try to update the introduction tutorial, but we probably have to rewrite large parts of it. I did a small update in #2315, but removing the schedulers requires completely rewriting the Agent activation parts.

Ha, I just saw #2315 and went right into the code, so I didn't see everyone else's comments; as such I updated my comment.

Thanks for all this work, this is such great progress!

tpike3 · 2024-09-22T12:25:32Z

Just want to make sure you know the readthedocs build is failing due to the datacollector; it also seems there are a bunch of collisions on the names

Think about what to do with that

EwoutH · 2024-09-26T11:39:06Z

Okay, I vastly simplified this PR, by keeping Pandas and tqdm core dependencies.

Now only the pyproject.toml and Readme are updated. Please review.

pyproject.toml

quaquel

Sorry one last question about cookiecutter. But yes this looks fine and clear now.

pyproject.toml

Corvince · 2024-09-27T07:20:35Z

Okay, I vastly simplified this PR, by keeping Pandas and tqdm core dependencies.

Now only the pyproject.toml and Readme are updated. Please review.

It's indeed much better now, although pandas is really a mixed bag. Previously I would have said everyone who uses mesa should probably also use pandas anyway, but nowadays with polars, ibis and others this is much less true. And, as far as I remember, we only use pandas in one place, to convert the collected data into a dataframe. I don't think that is really worth the dependency. But let's maybe do that in a separate PR, to move this one forward.

quaquel · 2024-09-27T07:26:27Z

Okay, I vastly simplified this PR, by keeping Pandas and tqdm core dependencies.
Now only the pyproject.toml and Readme are updated. Please review.

It's indeed much better now, although pandas is really a mixed bag. Previously I would have said everyone who uses mesa should probably also use pandas anyway, but nowadays with polars, ibis and others this is much less true. And, as far as I remember, we only use pandas in one place, to convert the collected data into a dataframe. I don't think that is really worth the dependency. But let's maybe do that in a separate PR, to move this one forward.

I am inclined, as part of the broader overhaul of data collection, to no longer include any to_dataframe style helper method and leave this completely to the user. For example, as long as we store agent level data in a dict of dicts (dict[agent.unique_id] = {attr1:list, attr2:list}) or something similar that is easy to convert to your dataframe library of choice, we don't need to have pandas as a dependency.
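
To illustrate the dict-of-dicts idea, here is a minimal sketch using only the standard library. The storage layout, the attribute names and the `to_records` helper are all hypothetical and not part of Mesa's data collection; the sketch also assumes every attribute list of an agent has the same length (one value per step).

```python
# Hypothetical storage layout: agent unique_id -> {attribute name -> values per step}.
agent_data = {
    1: {"wealth": [1, 0, 2], "age": [10, 11, 12]},
    2: {"wealth": [1, 2, 0], "age": [20, 21, 22]},
}


def to_records(data):
    """Flatten the nested dict into one plain-dict row per (agent, step)."""
    rows = []
    for agent_id, attributes in data.items():
        # Number of recorded steps, taken from the first attribute (0 if none).
        n_steps = len(next(iter(attributes.values()), []))
        for step in range(n_steps):
            row = {"agent_id": agent_id, "step": step}
            for name, values in attributes.items():
                row[name] = values[step]
            rows.append(row)
    return rows


rows = to_records(agent_data)
for row in rows:
    print(row)
```

A list of plain dicts like `rows` is a format that dataframe libraries commonly accept as constructor input (for example `pandas.DataFrame(rows)`), so the conversion can stay on the user's side without Mesa importing any of them.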

EwoutH · 2024-09-27T09:47:46Z

although pandas is really a mixed bag

Fully agree

But let's maybe do that in a separate PR, to move this one forward.

Also agree

I'm merging, follow-up PRs can be made as everyone sees fit.

README.md

EwoutH mentioned this pull request Sep 3, 2024

Remove all library requirements #626

Open

3 tasks

EwoutH added this to the v3.0 milestone Sep 3, 2024

EwoutH force-pushed the opt branch from 452303f to b9d3559 Compare September 20, 2024 11:32

This was referenced Sep 20, 2024

GoL_fast: Make datacollection import explicit projectmesa/mesa-examples#199

Merged

Make DataCollector and batch_run imports explicit projectmesa/mesa-examples#200

Draft

EwoutH added the breaking and maintenance labels Sep 20, 2024

EwoutH marked this pull request as ready for review September 20, 2024 12:46

EwoutH force-pushed the opt branch from b4ad094 to e4121a4 Compare September 20, 2024 12:49

EwoutH requested review from tpike3, rht, Corvince and quaquel September 20, 2024 12:50

tpike3 reviewed Sep 22, 2024

docs/tutorials/intro_tutorial.ipynb (outdated)

tpike3 requested changes Sep 22, 2024

Reduce core dependencies, split in optional dependencies

82d6ab2

EwoutH force-pushed the opt branch from cfa6234 to 82d6ab2 Compare September 26, 2024 11:31

EwoutH requested a review from tpike3 September 26, 2024 11:31

Keep cookiecutter as dev dep

f47e46a

Think about what to do with that

EwoutH removed the breaking label Sep 26, 2024

quaquel reviewed Sep 27, 2024

pyproject.toml

quaquel requested changes Sep 27, 2024

Corvince reviewed Sep 27, 2024

pyproject.toml

EwoutH merged commit df51862 into main Sep 27, 2024
10 checks passed

EwoutH added the enhancement label Sep 27, 2024

EwoutH deleted the opt branch September 27, 2024 09:51

rht reviewed Sep 27, 2024

README.md

This was referenced Oct 9, 2024

Fix __all__ imports in __init__.pys #2343

Open

experimental init: Fix Solara import by making it lazy #2357

Merged
