first draft of splitting NWSS signals #1946
Minh confirmed that this runs on her machine and that the output looks reasonable. A couple of things to do before we merge this:
df5aa19 to e86e5fa
495f183 to 07c6c90
Currently not passing because of the same update that made nssp tests fail. Once that's merged and this is rebased it will pass.
890e8b8 to 402f2ab
Some comments about style and comments. The actual functionality here looks fine.
This hasn't been released at all yet, right? You previously did statistical review on this; is there anything else we want to do for these new signals?
    agg_df["geo_id"] = "us"
    return agg_df


def add_needed_columns(df, col_names=None):
suggestion (optional): to make this more robust, add an assert to make sure that our set of missing column names doesn't include important ones (like geo_id and value). Since this is out of scope, it's worth making an issue for it.
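A rough sketch of what that assert could look like. This is not the function in this PR: the name `add_needed_columns_checked`, the `PROTECTED_COLUMNS` set, and the default `col_names` are all assumptions for illustration, and the real function body may differ.

```python
import numpy as np
import pandas as pd

# Assumption: columns that must never be overwritten with NaN.
# The names follow the review comment (geo_id and value).
PROTECTED_COLUMNS = frozenset({"geo_id", "value"})


def add_needed_columns_checked(df, col_names=None):
    """Hypothetical variant of add_needed_columns with a guard.

    Adds each name in col_names as an all-NaN column, but first asserts
    that none of them is a column we rely on downstream.
    """
    if col_names is None:
        # Assumed default; the real default may differ.
        col_names = ["se", "sample_size"]

    overlap = PROTECTED_COLUMNS.intersection(col_names)
    assert not overlap, f"refusing to blank out columns: {sorted(overlap)}"

    out = df.copy()  # leave the caller's frame untouched
    for col in col_names:
        out[col] = np.nan
    return out
```

With this guard, passing `col_names=["value"]` raises an `AssertionError` instead of silently replacing the data with NaN.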
It has not. There's the corresponding docs, which I think Will read through at one point.
closing as this is migrating to a new endpoint
Description
This splits the dataset based on the provider and normalization (not every pair is actually present), and adds the metric signals. The resulting signals are called:
Some of these can have negative values; e.g. for ptc_15d, the values are small enough that I expect these may actually be exponents. Still looking into why the concentration data has negative values, which are too large to make sense as exponents.

Fixes generate_weights in the case where some weights are negative, and adds a test for it.