Open
Labels
- area-thicket: Issues and PRs involving Thicket's core Thicket datastructure and associated classes
- priority-normal: Normal priority issues and PRs
- type-feature: Requests for new features or PRs which implement new features
Description
In docstrings and docs, we refer users to pandas for documentation on aggregation functions. Despite this, we do not currently support an important way of specifying aggregation functions: string function names.
For example, currently, to use a "mean" operation in aggregation, we require users to do the following:
    gb = thicket_obj.groupby(...)
    gb.agg(numpy.mean)
In comparison, it is much more common to do the following for a pandas Groupby-Aggregate:
    df.groupby(...).agg("mean")
We should also support string inputs to our GroupBy.agg method to be consistent with pandas.
Beyond consistency, there are two other reasons to do this:
- The logic behind a pandas mean (or similar operation) and a NumPy mean (or equivalent operation) is not the same. Current versions of pandas work around this by internally detecting when you pass NumPy functions in and replacing them with pandas' equivalents.
- Future versions of pandas (i.e., 3.0) will no longer replace NumPy functions with pandas' equivalents. That means there will be implications (e.g., performance) for using "mean" over numpy.mean. The behavior of the two will be different, and the NumPy functions may not produce correct output.
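To make the difference concrete, here is a minimal sketch in plain pandas (not Thicket) that shows one way the two can diverge: missing values. The DataFrame, its column names, and the `raw_numpy_mean` helper are made up for illustration. The helper deliberately calls NumPy on the underlying array, so it is not subject to pandas' replacement of `numpy.mean`. It approximates what an unreplaced NumPy reduction could do, and is not a claim about exactly what pandas 3.0 will do.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "b" contains a missing value.
df = pd.DataFrame(
    {
        "node": ["a", "a", "b", "b"],
        "time": [1.0, 3.0, 2.0, np.nan],
    }
)

# String name: dispatches to pandas' own mean, which skips NaN by default.
by_name = df.groupby("node")["time"].agg("mean")


def raw_numpy_mean(values):
    """Illustrative helper: apply NumPy's mean to the raw array.

    Working on the ndarray (rather than the Series) means NaN is not
    skipped, so any group containing NaN yields NaN.
    """
    return np.mean(values.to_numpy())


by_numpy = df.groupby("node")["time"].agg(raw_numpy_mean)

print(by_name)
print(by_numpy)
```

Both results agree for group "a", but for group "b" the string form gives the mean of the non-missing values while the raw NumPy form gives NaN. Accepting string names in GroupBy.agg would let users opt into the pandas semantics explicitly.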