v0.6.0 - better modules #51
base: master
…tions for building crypto data, add unit tests for new functions, refactor existing functions into more appropriately named modules and avoid circular imports
Pull Request Overview
Refactors existing functions into more appropriately named modules to avoid circular imports and better organize the codebase, while adding new crypto-specific data transformation functions.
- Moves mathematical functions from `scoring.py` to a new `math.py` module
- Creates a new `indexing.py` module for index manipulation functions
- Adds a new `data.py` module with `balanced_rank_transform` and `quantile_bin` functions for crypto data processing
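The new `data.py` functions themselves aren't shown in this thread. As a rough, hypothetical sketch (not the PR's actual implementation), an equal-frequency quantile binning transform could look like:

```python
import pandas as pd

def quantile_bin_sketch(s: pd.Series, bins: int = 5) -> pd.Series:
    """Illustrative only: bucket a series into equal-frequency bins 0..bins-1."""
    # rank with method="first" to break ties deterministically,
    # then cut the ranks into equal-count buckets
    ranks = s.rank(method="first")
    return pd.qcut(ranks, q=bins, labels=False).astype(int)

s = pd.Series([0.1, 0.9, 0.4, 0.7, 0.2, 0.5, 0.8, 0.3, 0.6, 1.0])
print(quantile_bin_sketch(s).tolist())  # → [0, 4, 1, 3, 0, 2, 3, 1, 2, 4]
```

Each bin receives the same number of rows (here two per bin), which is the usual goal of quantile binning on heavy-tailed crypto data.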
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `tests/test_scoring.py` | Removes tests for moved functions while keeping scoring-specific tests |
| `tests/test_math.py` | New test file for mathematical functions moved from the scoring module |
| `tests/test_indexing.py` | New test file for index manipulation functions |
| `tests/test_data.py` | New test file for data transformation functions |
| `pyproject.toml` | Updates version to 0.6.0.dev0 |
| `numerai_tools/typing.py` | New module defining type variables for DataFrame/Series unions |
| `numerai_tools/signals.py` | Updates imports to use the new module structure |
| `numerai_tools/scoring.py` | Refactored to import from the new modules and focus on scoring functions |
| `numerai_tools/math.py` | New module containing mathematical transformation functions |
| `numerai_tools/indexing.py` | New module for index filtering and sorting functions |
| `numerai_tools/data.py` | New module with crypto data transformation functions |
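The table mentions type variables for DataFrame/Series unions. As a hypothetical sketch of that pattern (names assumed, not the PR's actual code), a constrained `TypeVar` lets a function return the same pandas type it receives:

```python
from typing import TypeVar
import pandas as pd

# Constrained TypeVar: callers get back the same type they pass in.
# "PandasT" is an assumed name for illustration only.
PandasT = TypeVar("PandasT", pd.DataFrame, pd.Series)

def demean(x: PandasT) -> PandasT:
    """Subtract the mean; a Series in gives a Series out, DataFrame in gives DataFrame out."""
    return x - x.mean()

print(type(demean(pd.Series([1.0, 2.0, 3.0]))).__name__)  # → Series
```

This avoids the looser `Union[pd.DataFrame, pd.Series]` return annotation, which would force callers to narrow the type themselves.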
Co-authored-by: Copilot <[email protected]>
Pull Request Overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
andresnumer left a comment:
Because correlation and dot products (subject to an L2 constraint) are so similar, the L2-based MPC is just like MMC but with weights rather than raw predictions (and a slightly different scaling factor).
Here's an implementation of the correct version:
```python
import numpy as np
import pandas as pd
from typing import cast

# center, filter_sort_index, weight_normalize, generate_neutralized_weights,
# and orthogonalize are helpers from the numerai_tools modules in this PR.

def meta_portfolio_contribution(
    predictions: pd.DataFrame,
    stakes: pd.Series,
    neutralizers: pd.DataFrame,
    sample_weights: pd.Series,
    targets: pd.Series,
) -> pd.Series:
    """Calculates the "meta portfolio" score:
    - rank, normalize, and power each signal
    - convert each signal into neutralized weights
    - generate the stake-weighted portfolio
    - calculate the gradient of the portfolio w.r.t. the stakes
    - multiply the weights by the targets

    Arguments:
        predictions: pd.DataFrame - the predictions to evaluate
        stakes: pd.Series - the stakes to use as weights
        neutralizers: pd.DataFrame - the neutralization columns
        sample_weights: pd.Series - the universe sampling weights
        targets: pd.Series - the live targets to evaluate against
    """
    targets = center(targets)
    predictions, targets = filter_sort_index(predictions, targets)
    stake_weights = weight_normalize(stakes.fillna(0))
    assert np.isclose(stake_weights.sum(), 1), "Stakes must sum to 1"
    weights = generate_neutralized_weights(predictions, neutralizers, sample_weights)
    w = cast(np.ndarray, weights[stakes.index].values)
    s = cast(np.ndarray, stake_weights.values)
    t = cast(np.ndarray, targets.values)
    swp = w @ s
    swp = swp - swp.mean()
    l2_norm = np.sqrt(np.sum(swp**2))
    residualized_weights = orthogonalize(w, swp)
    mpc = (residualized_weights.T @ t).squeeze() / l2_norm
    return pd.Series(mpc, index=stakes.index)
```
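The residualization step above, orthogonalizing each user's weight vector against the stake-weighted portfolio, can be checked in isolation. Here is a minimal NumPy sketch (the helper name is hypothetical, and the real `orthogonalize` may differ in details):

```python
import numpy as np

def orthogonalize_sketch(v: np.ndarray, u: np.ndarray) -> np.ndarray:
    """Remove from each column of v its least-squares projection onto u."""
    u = u.reshape(-1, 1)
    # projection coefficient per column: (u.T v) / (u.T u)
    coefs = (u.T @ v) / (u.T @ u)
    return v - u @ coefs

rng = np.random.default_rng(0)
w = rng.standard_normal((100, 3))   # 100 assets, 3 users' weight vectors
s = np.array([0.5, 0.3, 0.2])       # normalized stakes
swp = w @ s                         # stake-weighted portfolio
swp = swp - swp.mean()
resid = orthogonalize_sketch(w, swp)
print(np.allclose(swp @ resid, 0))  # → True: residuals are orthogonal to swp
```

After this step, each user's residualized weights carry only the component of their portfolio that the stake-weighted meta portfolio does not already hold.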
Force-pushed from dd79bdc to a305e00.
The previous code had a small issue: it did not make each user's weights zero-mean, which matters for the correctness of the MPC calculation. This version also centers the targets after filtering (center earlier if that's an issue).
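The zero-mean point matters because correlation equals a scaled dot product only after both vectors are centered, which is what makes the L2-based MPC line up with MMC. A minimal check:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(50) + 3.0   # "weights" with a nonzero mean
y = rng.standard_normal(50)

xc, yc = x - x.mean(), y - y.mean()
corr = np.corrcoef(x, y)[0, 1]
dot_scaled = (xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))
print(np.isclose(corr, dot_scaled))  # → True: centered, norm-scaled dot product equals Pearson correlation
```

Skipping the centering leaves a constant offset in the weights that contaminates the dot product, which is exactly the bug described above.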