A new blog post about Pytrees for Scientific Python #250

pfackeldey · 2025-05-14T23:39:47Z

The main subject relates to at least one project affiliated to the Scientific Python Ecosystem.
I have the right to publish the content under BSD 3-Clause License for the code and Creative Common CC-BY-4.0 License for the text.
Images have been compressed using a tool like pngquant.

This is a new blog post about "PyTrees for Scientific Python" by @mihaimaruseac @matthewfeickert and me. It explains what PyTrees are, where they come from, and how they can be useful for scientific Python.

It's currently in draft mode because we're missing a last section about "final thoughts" (pinging @matthewfeickert for this 🙏).

pfackeldey · 2025-05-18T20:07:18Z

I've added a short "final thought" section myself - so I'm marking this blogpost as ready for review.
Let me know what you think!

matthewfeickert · 2025-05-18T22:23:36Z

Thanks @pfackeldey. 🚀 Currently in transit to Houston so I will review tonight once I land, but as we discussed together I think this is all fine at the high level and any comments I have will be nitpicks.

matthewfeickert

@pfackeldey looks really good to me. As I said, the only thing I have to suggest is nitpicky typo fixes and possible rephrasing, but I think that the examples are very strong. Thank you for writing this!

content/posts/optree/pytrees/index.md

Co-authored-by: Matthew Feickert <[email protected]>

pfackeldey · 2025-05-19T15:19:12Z

Thanks @matthewfeickert , you're comments are great, I accepted all of them :)

content/posts/optree/pytrees/index.md

Co-authored-by: Mihai Maruseac <[email protected]>

content/posts/optree/pytrees/index.md

Co-authored-by: Stefan van der Walt <[email protected]>

This reverts commit bc98572.

rossbar

Very nice writeup @pfackeldey ! The post flows beautifully and IMO does a great job both motivating and demonstrating what PyTrees has to offer. A few grammatical nits that are by no means blockers; thanks for writing this up!

content/posts/optree/pytrees/index.md

Co-authored-by: Ross Barnowski <[email protected]>

stefanv · 2025-05-20T19:03:27Z

I'll review also, and then we can hopefully get this merged ASAP.

stefanv

Thanks @pfackeldey! Feel free to do with the comments what you want; let me know when you are ready to merge and I'll push the button.

content/posts/optree/pytrees/index.md

stefanv · 2025-05-20T00:18:18Z

content/posts/optree/pytrees/index.md

+## Manipulating Tree-like Data using Functional Programming Paradigms
+
+A "PyTree" is a nested collection of Python containers (e.g. dicts, (named) tuples, lists, ...), where the leafs are of interest.
+As you can imagine (or even experienced in the past), such arbitrary nested collections can be cumbersome to manipulate _efficiently_.


This motivation could be fleshed out a bit. I've never had to manipulate trees of dicts, but if I had to do so I'd probably pop them into a NetworkX graph. So, perhaps helpful to tell me what the types of issues are I might have run into, rather than expect me to know it already?

Ah, I've not thought about using NetworkX for such cases. This could probably be a solution(?), but the difference here is that here we're not interested in the graph properties. We just want to access the leaves - no matter the structure of the tree - and manipulate / work with them. I would not know how to do this with NetworkX (but I'm also not very familiar with the library).
PyTrees are mainly about separating the leaves from a generic structure. The generic structure itself is not of interest.

So, perhaps helpful to tell me what the types of issues are I might have run into, rather than expect me to know it already?

That's a good point! I'll try to rephrase this.

From what I've seen in code snippets written by scientists, it's common to collect/track arrays/lists of numerical data in python containers (dicts, tuples, lists), e.g. if they measure multiple quantities in multiple experiment runs (this may become nested and complex fast).

Ah, I've not thought about using NetworkX for such cases. This could probably be a solution(?), but the difference here is that here we're not interested in the graph properties. We just want to access the leaves - no matter the structure of the tree - and manipulate / work with them. I would not know how to do this with NetworkX (but I'm also not very familiar with the library).

@dschult Can NetworkX efficiently grab leaf nodes and do operations on them? And if not, why not ;)

To grab the nodes of a directed network in NetworkX use:
[node for node, deg in G.out_degree if deg == 0]

I think the tricky part for NetworkX would be handling unhashable objects. It might be possible to identify each container with an integer index to a list... It essentially flattens the PyTree while holding the relationships between containers in the networkx graph. From the list of tree containers along with a NetworkX graph, you could form a list of the leaf node indices as above, and then pass through the leaf level containers to manipulate the data inside each. If there are both values to be manipulated and containers in the same container it might be good to use two networkx graphs -- one with the container-container relationships and the other with container-value relationships. But I haven't thought much about how to identify containers vs values, etc.

Thanks for the explanation! I think that helps me understand even better why PyTrees can be useful.

Hi @stefanv,
I've improved the motivation in 9bde52e. I hope this motivates the use-cases for pytrees more :)

content/posts/optree/pytrees/index.md

stefanv · 2025-05-20T20:59:44Z

content/posts/optree/pytrees/index.md

+# )
+```
+
+The new `minimize` still works, because a `tuple` of `Params` is just _another_ PyTree!


I'm looking at this example, trying to imagine where a person would use such a thing. Not because I doubt it's useful, but because my experience is limited. A few more examples (not inline code examples, just a sentence or two listing them) may be helpful here.

Like, the generalization above required significant changes to the rosenbrock function itself, so it is unclear what the benefit is to not having to modify the minimizer at the same time. But it hints that there may be families of problems that can more easily be solved.

And that ties into the question:

Wouldn't it be nice to make workflows in the scientific Python ecosystem just work with any PyTree?

What would we need to do to make that happen, and under which circumstances would that be useful?

I think hierarchical models are something where this comes in very handy. It's not uncommon to fit a superposition of multiple functions (or pdfs, e.g. multiple gaussians) to data. Then it's nice to group those parameters semantically, instead of passing a flat list of floats / array of floats, where you have to remember what each value stands for.

What would we need to do to make that happen, and under which circumstances would that be useful?

I'm not suggesting we should change APIs in scipy and co, it's rather that PyTrees (with tools like optree) enable to make APIs like scipy.optimize.minimize work generically with any data structure (not just a flat list/array of floats) without too much code changes. It's something that users can opt-in to make their code more expressive as they're the ones that define the "model", here the rosenbrock_modified function.

Does that make sense? (Do you have a proposal to write this more clearly here?)

I wasn't sure how I could be more precise about hierarchical models, so I've added an example in the text that I think is a common in scientific research: e07a2ac

Co-authored-by: Stefan van der Walt <[email protected]>

content/posts/optree/pytrees/index.md

XuehaiPan · 2025-05-29T07:51:58Z

content/posts/optree/pytrees/index.md

+This allows for the application of generic transformations. For example, on a PyTree with NumPy arrays as leaves, taking the square root of each leaf with `tree_map(np.sqrt, pytree)`:
+
+```python
+import optree as pt


FYI, optree 0.16.0 is released. I wonder if it is worth mentioning optree.pytree.reexeport(...).

>>> import optree >>> pytree = optree.pytree.reexport(namespace='my-pkg', module='my_pkg.pytree') >>> pytree.flatten({'a': 1, 'b': 2}) ([1, 2], PyTreeSpec({'a': *, 'b': *}))

# foo/__init__.py import optree pytree = optree.pytree.reexport(namespace='foo') # foo/bar.py from foo import pytree @pytree.dataclasses.dataclass class Bar: a: int b: float print(pytree.flatten({'a': 1, 'b': 2, 'c': Bar(3, 4.0)})) # Output: # ([1, 2, 3, 4.0], PyTreeSpec({'a': *, 'b': *, 'c': CustomTreeNode(Bar[()], [*, *])}, namespace='foo'))

That's an awesome feature (which I'll certainly use for my libraries!).
I think this blog post is more targeted towards users that may want to try optree directly, and less towards library developers that may want to reexport with additional functionality. That's why I'm leaning towards not adding it to the blog posts. Does that make sense to you @XuehaiPan ?

It makes sense.

Co-authored-by: Xuehai Pan <[email protected]>

pfackeldey · 2025-06-17T14:31:51Z

Dear @stefanv, @rossbar, @XuehaiPan, @matthewfeickert, @mihaimaruseac,

thank you very much again for reviewing this blog post!

I've addressed the remaining comments now, could you give this blog post another look?
From my side this PR is now ready :)

Thanks, Peter

content/posts/optree/pytrees/index.md

stefanv

Thanks for the edits, Peter. I'm happy to merge this whenever you are ready. I highlighted two points of confusion for me, but happy to let you decide whether you want to handle those or not.

content/posts/optree/pytrees/index.md

stefanv · 2025-07-07T18:36:49Z

content/posts/optree/pytrees/index.md

+prediction = neural_network(layers=layers, x=jnp.array(...))
+```
+
+Here, `layers` is a PyTree &mdash; a `list` of multiple `Layer` &mdash; and the JIT compiled `neural_network` function _just works_ with this data structure as input.


This does not seem to be an example of a nested structure, just the final layer of a tree. Wouldn't it be pretty straightforward to handle a list using regular methods?

It's a ('3-level') nested datastructure in the sense: layers is a list (1st level) of Layers (2nd level) of W (3rd level [leaf]) and b (3rd level [leaf]). Here, one could also create a list for weights and one for biases and make a zip-loop over both, but then weights and biases are not logically grouped anymore in a Layer object.

Co-authored-by: Stefan van der Walt <[email protected]>

pfackeldey added 4 commits May 14, 2025 13:10

wip

4d71fbe

first complete version

dd6c7ca

first round of review improvements

e4804e1

add final thought section

d71c8e8

pfackeldey marked this pull request as ready for review May 18, 2025 20:07

pfackeldey added 2 commits May 18, 2025 16:13

improve code snippet for modified rosenbrock

653296b

remove obsolete comment

628d645

matthewfeickert suggested changes May 19, 2025

View reviewed changes

pfackeldey and others added 7 commits May 19, 2025 11:16

Update content/posts/optree/pytrees/index.md

7d6b287

Co-authored-by: Matthew Feickert <[email protected]>

Update content/posts/optree/pytrees/index.md

99fe18c

Co-authored-by: Matthew Feickert <[email protected]>

Update content/posts/optree/pytrees/index.md

cd443a3

Co-authored-by: Matthew Feickert <[email protected]>

Update content/posts/optree/pytrees/index.md

13d1700

Co-authored-by: Matthew Feickert <[email protected]>

Update content/posts/optree/pytrees/index.md

17f2f95

Co-authored-by: Matthew Feickert <[email protected]>

Update content/posts/optree/pytrees/index.md

b08dcb4

Co-authored-by: Matthew Feickert <[email protected]>

Update content/posts/optree/pytrees/index.md

76bbff4

Co-authored-by: Matthew Feickert <[email protected]>

mihaimaruseac reviewed May 19, 2025

View reviewed changes

content/posts/optree/pytrees/index.md Outdated Show resolved Hide resolved

mihaimaruseac approved these changes May 19, 2025

View reviewed changes

Update content/posts/optree/pytrees/index.md

7a124b1

Co-authored-by: Mihai Maruseac <[email protected]>

stefanv reviewed May 20, 2025

View reviewed changes

content/posts/optree/pytrees/index.md Show resolved Hide resolved

pfackeldey and others added 4 commits May 20, 2025 10:15

Update content/posts/optree/pytrees/index.md

19262dd

Co-authored-by: Stefan van der Walt <[email protected]>

add summary

bc98572

Revert "add summary"

11fde89

This reverts commit bc98572.

add summary

0555d93

rossbar approved these changes May 20, 2025

View reviewed changes

content/posts/optree/pytrees/index.md Outdated Show resolved Hide resolved

content/posts/optree/pytrees/index.md Outdated Show resolved Hide resolved

pfackeldey and others added 2 commits May 20, 2025 13:13

Update content/posts/optree/pytrees/index.md

4db8fba

Co-authored-by: Ross Barnowski <[email protected]>

Update content/posts/optree/pytrees/index.md

50a3b2e

Co-authored-by: Ross Barnowski <[email protected]>

matthewfeickert approved these changes May 20, 2025

View reviewed changes

stefanv reviewed May 20, 2025

View reviewed changes

Update content/posts/optree/pytrees/index.md

4dbf7fd

Co-authored-by: Stefan van der Walt <[email protected]>

XuehaiPan reviewed May 29, 2025

View reviewed changes

content/posts/optree/pytrees/index.md Outdated Show resolved Hide resolved

XuehaiPan reviewed May 29, 2025

View reviewed changes

content/posts/optree/pytrees/index.md Outdated Show resolved Hide resolved

XuehaiPan reviewed May 29, 2025

View reviewed changes

pfackeldey and others added 5 commits June 17, 2025 09:38

Update content/posts/optree/pytrees/index.md

ae7764e

Co-authored-by: Xuehai Pan <[email protected]>

pytree -> tree

8c33d24

be more specific about the motivation for scientific data

9bde52e

mention optree earlier

5e59e37

be a bit more specific about the use case with hierarchical models

e07a2ac

Merge branch 'main' into pytree

89d464f

XuehaiPan reviewed Jun 18, 2025

View reviewed changes

content/posts/optree/pytrees/index.md Outdated Show resolved Hide resolved

XuehaiPan approved these changes Jun 18, 2025

View reviewed changes

leafs -> leaves

4c548bc

mihaimaruseac approved these changes Jun 18, 2025

View reviewed changes

stefanv reviewed Jul 7, 2025

View reviewed changes

pfackeldey and others added 3 commits July 7, 2025 15:47

Update content/posts/optree/pytrees/index.md

254ad90

Co-authored-by: Stefan van der Walt <[email protected]>

be more explicit about reducing arrays with more than 1 element

b5acc95

add a sentence of how pytrees and jax.jit work together

4e14a11

A new blog post about Pytrees for Scientific Python #250

Are you sure you want to change the base?

A new blog post about Pytrees for Scientific Python #250

Conversation

pfackeldey commented May 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pfackeldey commented May 18, 2025

Uh oh!

matthewfeickert commented May 18, 2025

Uh oh!

matthewfeickert left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pfackeldey commented May 19, 2025

Uh oh!

Uh oh!

Uh oh!

rossbar left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

stefanv commented May 20, 2025

Uh oh!

stefanv left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stefanv May 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pfackeldey commented Jun 17, 2025

Uh oh!

Uh oh!

stefanv left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

pfackeldey commented May 14, 2025 •

edited

Loading

stefanv May 21, 2025 •

edited

Loading