-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changed ::abs to ::fabs in bool fba::Hardware::position_xy_bad which #470
base: main
Are you sure you want to change the base?
Conversation
caused issue #258. r_min was being calculated as 0 in the case where it was ::abs(-0.1718) in gcc 11.2 in desi 23.10 and this was leading to a different result than gcc 13 with desi 24.4 where it was correctly resolved to the abs of a double. Note that it seems from this like the newer version actually has the correct behavior, opposite of what was expected, unless the desired behavior is to treat r_min as an int.
Wow, great find, thank you! I agree that that just looks like a bug. It will clearly change fiber assignment, etc., but the "old fiber assignment" was just broken. We should probably discuss on the Monday survey ops call what we want to do. Our usual policy is to keep fiber assign reproducible, which would mean doing something ridiculous like "keep this buggy behavior for early dates". It feels weird to follow that policy in this case, though. |
If 0 -> use old 'buggy' compiler specific ::abs call If 1 or undefined, use ::fabs While implementing this I found an interesting behavior that should be documented and took me a while to figure out. If I include <stdlib.h> then ::abs always performs correctly as a floating point abs. I had thought I needed to include stdlib.h to handle the env var, though it turned out I did not. This may be useful in tracking down exactly where the discrepancy occurs since we see a difference between desi 23.10 / gcc 11.2 and desi 24.4 / gcc 11.2 - it is probably in libstdc++ or glibc or some other library module that is loaded.
Hi Craig---two questions here:
I think in your implementation this would look like
which is fine. But I expect that that getenv only gets called at the initial import time and so the updated environment variable wouldn't get seen without some weird tricks to force that module to get reloaded. But maybe I'm wrong and the alt-MTL always comes in through the CLI or something and triggers that 'getenv' call to be executed anew. @araichoor or @ashleyjross , can you advise on what the right entry point into fiberassign is to set/check this environment variable / trigger this new behavior? Thanks! |
For the altmtl, currently, a separate .sh script gets created for every tile and the needed environment is loaded for it. So, it would be easy to add this new environment variable in there. (This dates back to SV3, where many different versions of fiberassign were required to reproduce everything properly. I had been thinking recently that it might be nice to not do things this way, but it would have been some work to change it...) |
Thanks for the great detective work! Two more questions:
|
@schlafly 1. Sure - I could force it to be an int. @ALL I think there are a lot of ways so however its decided is the best/easiest to set the environment variable (and how we want to handle the undefined case which I defaulted to using the new/correct logic) - just let me know if there's anything else on my end (I could move the getenv within the method for instance). And I think Anand's python script can serve as the basis for a easy unit test I could make since we have a case known to be an issue. I'll work on that today. |
Yeah, moving the getenv within the method would be an option if that isn't a hot path or getenv is really fast. I have no idea. If it doesn't appreciably impact performance I agree that that would be a good approach. I was trying to think of a place less likely to be a hot path, though. |
@schlafly I also changed the behavior to cast as an int in the case where FIBERASSIGN_BEHAVIOR=0 so that we get reproducible 'buggy' behavior regardless of compiler and environment. Last thing I was working on was a unit test but then I realized while working on it I need clarification of what exactly you want the unit test to do? Since this is not a return value - do you want to test by asserting that the boolean return value is what we'd expect for that particular tile/fiber if the behavior is as expected based on the env var? E.g.
This conditional returns true when the buggy behavior is enabled but a false is returned later when it is not. So we could test setting env var to 0, assert return is true, set env var to 1, assert return is false. Or should we set up a separate method just to test the int vs double value of abs? But then that's a separate method so the unit test wouldn't even notice if someone changed things in |
Updated logic to force "buggy" behavior of casting abs as int when FIBERASSIGN_BEHAVIOR=0.
Great, thanks Craig. I'm not personally very worried about the unit test. It will be adding very good coverage of a very specific case that's unlikely to reappear. But yes, I think that your proposed unit test makes sense---checking that we get the expected true / false depending on the buggy-behavior-flag for this specific combination. If that is too ugly to set up, though, I wouldn't bother---I don't know how good our test coverage here is in general. |
I went ahead and added a unit test to |
not have the DESI environment set up correctly so this would require a more compilicated (or better abstracted) unit test. Since it sounds like its not a big deal I will just remove it.
@schlafly turns out the unit test I wrote fails on github's automatic testing on the line
I think clearly the github environment isn't set up to be the same as the DESI environment, nor does it have access likely to that particular tile/fiber. So the unit test would either need to be much more complex or much more abstracted and since you said don't bother if its too ugly to set up, I just removed it. |
src/hardware.cpp
Outdated
logger.debug("FIBERASSIGN_BEHAVIOR = 1 -- using ::fabs correctly on r_min"); | ||
r_min = ::fabs(r_min); | ||
} | ||
std::cout << "R_MIN = " << r_min << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to ourselves: we ll want to disable this printing of R_MIN
, as it "pollutes" the fiberassign-TILEID.log files.
here s an example of a file run with that: https://data.desi.lbl.gov/desi/users/sybenzvi/fa_holding_pen_bugfix_issue_258_main_20250324/002/fiberassign-002212.log
I committed a few updates to the coordinate transformations in Note that unit tests aren't updated to the new interface; that will wait for our decision. |
caused issue #258. r_min was being calculated as 0 in the case where it was ::abs(-0.1718) in gcc 11.2 in desi 23.10 and this was leading to a different result than gcc 13 with desi 24.4 where it was correctly resolved to the abs of a double.
Note that it seems from this like the newer version actually has the correct behavior, opposite of what was expected, unless the desired behavior is to treat r_min as an int.