-
-
Notifications
You must be signed in to change notification settings - Fork 281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Weather data has duplicate index values, rows and object column types #669
Comments
The relevant code just picks the first weather data value for each lap and concatenates all the values. If there are multiple drivers, laps can of course take place simultaneously and the same weather data info can be returned multiple times. I probably didn't consider the case of calling this function on a Your "hacky" fix seems pretty OK, to be honest. Except that I'd prefer to not reindex. This way, the data stays a slice of Disregarding my opinion here, what did you expect the function to return? Maybe we need to reconsider this and change the behaviour to better match expectations. I'd like to hear your opinion on that. In general, I think this would be a relatively simple fix. So once we agree on what should be done here, I'm open to a pull request for this :) |
My possible use case was a quick weather chart under a chart showing when cars were out on track during a practice session. The reindex was just to clean things up a bit. One of the major issues with the returned df as is was the duped index values. |
With this approach, there will be gaps in the weather chart whenever no cars are on track. Is this what you intended to achieve? I'm not sure yet, if this discussion is relevant to the bug itself, but I'm trying to understand the usage of this function a bit more. The bug is easy enough to understand 😅 |
Re: gaps.. probably not.. (I've only just started exploring how I can use the package, but that's often the time when I'm likely to use things in a "not meant to be used like that" way...;-) ( The way I use data is to look for stories. So the gappy data is useful for talking about weather when a driver is on track, eg on a long run, but maybe not so useful to explain eg why there are no cars running (tho' if it starts raining and everyone pits that's a clue!) |
I have thought about this some more and about potential use cases (and looked at my own documentation😅). I now think that it's best to leave this as it is. There are advantages to having the duplicates, and there are advantages not having them. For example, the documentation shows an example where weather data is concatenated with timing data such that there is one weather data value for each lap. This requires the duplicates to exist. Changing this behaviour would be an API change that breaks backwards compatibility. I no longer think this is a bug. And I don't see a compelling enough reason to change the API behaviour here. Specifically because there are reasons for both ways to be useful. Instead, I'll clarify the documentation to make this behaviour more clear so that users aren't surprised or confused by it. |
Describe the issue:
The dataframe that contains weather data:
np.float
,boolean
orTimedelta
values)Reproduce the code example:
Error message:
The text was updated successfully, but these errors were encountered: