Design of subgraph_sum and subtree_sum leads to very suboptimal performance #145

Rule number 1 of any dataframe library is "don't do operations by iterating over rows." However, this is exactly what we do in `subgraph_sum` and `subtree_sum`. We need to refactor this to use a better mechanism (e.g., `DataFrame.apply`).
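To make the contrast concrete, here is a minimal sketch of the two styles on a toy tree. Everything in it is hypothetical: the `parent` mapping, the `time` column, and both function names are for illustration only and are not the library's actual data structures or API. It also swaps in `Series.map` plus `groupby(...).sum()` rather than `DataFrame.apply`, since a row-wise `apply` still calls a Python function once per row. The idea is to compute the (ancestor, node) pairs from the graph once and let pandas do the metric aggregation in bulk.

```python
import pandas as pd

# Hypothetical toy tree: each node maps to its parent (None marks the root).
parent = {"main": None, "foo": "main", "bar": "main", "baz": "foo"}

# Hypothetical metric table, one row per node.
df = pd.DataFrame(
    {"time": [1.0, 2.0, 3.0, 4.0]},
    index=pd.Index(["main", "foo", "bar", "baz"], name="node"),
)


def ancestors_or_self(node):
    """Yield the node itself, then each ancestor up to the root."""
    while node is not None:
        yield node
        node = parent[node]


def subtree_sum_rowwise(df, column):
    """Slow pattern: iterate over rows and update one cell at a time."""
    out = pd.Series(0.0, index=df.index)
    for node, row in df.iterrows():
        for anc in ancestors_or_self(node):
            out.loc[anc] += row[column]
    return out


def subtree_sum_grouped(df, column):
    """Build (ancestor, node) pairs once, then aggregate with groupby."""
    pairs = pd.DataFrame(
        [(anc, node) for node in df.index for anc in ancestors_or_self(node)],
        columns=["ancestor", "node"],
    )
    # Look up each descendant's metric in one vectorized step.
    pairs[column] = pairs["node"].map(df[column])
    # Sum per ancestor and restore the original row order.
    return pairs.groupby("ancestor")[column].sum().reindex(df.index)


print(subtree_sum_rowwise(df, "time").tolist())
print(subtree_sum_grouped(df, "time").tolist())
```

The pair table depends only on the graph structure, so under this design it could be built once and reused for every metric column. Whether that fits the real `subgraph_sum` and `subtree_sum`, which also have to handle general graphs rather than trees, would need to be checked against the actual implementation.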

To get a sense of the performance impact, I can say anecdotally that `subgraph_sum` is 3-4x slower than the query language, even though the query language is solving a version of subgraph isomorphism, an NP-hard problem.
