Job creator bugfix #1010
base: develop
… possible by docstrin), no flag files were found
To check with @k1o0, but I think the original intent was to be able to operate either within a session path or at a level containing multiple session paths using the same glob pattern, while providing some validation.
Do you have more info about your failure case? It would be nice to fix it while maintaining compatibility with the two requirements above!
Indeed Georg's change here would break new session registration on all local servers. The primary use of the function is to register all sessions in the provided root path (i.e. /mnt/s0/Data/Subjects). The docstring does look outdated, which is my bad: it was originally a simple rglob and I added the specific wildcards primarily to improve performance and to skip any 'junk' folders. I agree with Georg that it would be convenient to use this function to register individual sessions from time to time. I can add an if-else statement here to support session path inputs again.
I won't push anything until I've fixed the CI: the certificates have expired on the hooks instance, and we should be sure that the integration tests would have caught this potentially disastrous change.
the docstring says:
def job_creator(root_path, one=None, dry=False, rerun=False):
    ...
    Parameters
    ----------
    root_path : str, pathlib.Path
        Main path containing sessions or a session path.
However, when this function is called with a session path (and not a root path), it fails to find the raw_session.flag file.
Here is a minimal working example, to be executed on parede:
from one.api import ONE
from ibllib.pipes import local_server
from pathlib import Path
base_path = Path('/mnt/s0/Data/Subjects')
one = ONE(cache_rest=None)
eid = "b2f0eb1e-88d9-4d4c-9938-5ff4df2cb7fc" # a session with passive
session_path = base_path / one.eid2path(eid).session_path_short()
print((session_path / 'raw_session.flag').exists()) # True
print(session_path) # /mnt/s0/Data/Subjects/ZFM-08652/2025-07-02/002
local_server.job_creator(session_path, one=one, dry=False) # doesn't find anything
Globbing won't work as it starts matching from the root directory. In fact, it won't work for anything other than the /mnt/s0/Data/Subjects folder.
So either:
1) the docstring needs to be changed to
Parameters
----------
root_path : str, pathlib.Path
    Path to the folder that contains the subject folders.
or 2) the glob needs to be relaxed to
flag_files = Path(root_path).glob('**/raw_session.flag')
Relaxing the validation and replacing it with a .glob('**/raw_session.flag') will find all folders with a raw_session.flag file regardless of the entry point, which is useful if, for example, a single session or a single animal is to be processed.
Why would this break new session registration, @k1o0? There are validation steps in the following lines that might take care of trash folders.
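To make the entry-point dependence concrete, here is a small self-contained sketch that builds a throwaway folder tree and compares two patterns. The fixed-depth pattern below (subject/date/number) is an assumption about the kind of wildcard in use, not a quote from local_server.py, and `find_flags` is a hypothetical helper written only for this illustration.

```python
import tempfile
from pathlib import Path

# ASSUMPTION: a fixed-depth pattern of the form subject/date/number.
# It stands in for "the specific wildcards" and is not copied from ibllib.
FIXED_DEPTH = '*/????-??-??/*/raw_session.flag'
# The relaxed pattern proposed in this thread.
RECURSIVE = '**/raw_session.flag'


def find_flags(entry_point, pattern):
    """Hypothetical helper: flag files matching `pattern` below `entry_point`."""
    return sorted(Path(entry_point).glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Build Subjects/ZFM-08652/2025-07-02/002/raw_session.flag in a temp folder.
    root = Path(tmp) / 'Subjects'
    session = root / 'ZFM-08652' / '2025-07-02' / '002'
    session.mkdir(parents=True)
    (session / 'raw_session.flag').touch()

    # Count matches for each pattern from each entry point.
    n_fixed_root = len(find_flags(root, FIXED_DEPTH))
    n_fixed_session = len(find_flags(session, FIXED_DEPTH))
    n_recursive_root = len(find_flags(root, RECURSIVE))
    n_recursive_session = len(find_flags(session, RECURSIVE))

print(n_fixed_root, n_fixed_session, n_recursive_root, n_recursive_session)
```

Under these assumptions the fixed-depth pattern only matches when the entry point is the folder holding the subject folders, while the recursive pattern matches from either entry point, at the cost of walking every subfolder.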
flag_files = Path(root_path).glob('**/raw_session.flag')
flag_files = [file for file in flag_files if re.search(r"/\d{4}-\d{2}-\d{2}/\d{3}/raw_session\.flag$", str(file))]
could be a compromise with some degree of validation
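As a quick check of what this filter would accept, the sketch below applies the same regular expression to a few made-up paths. The helper name `keep_session_flags` and the example paths other than the session quoted earlier are hypothetical. Note that `\d{3}` only accepts session numbers written with exactly three digits.

```python
import re
from pathlib import PurePosixPath

# Same expression as in the suggestion above, compiled once.
SESSION_FLAG = re.compile(r"/\d{4}-\d{2}-\d{2}/\d{3}/raw_session\.flag$")


def keep_session_flags(paths):
    """Hypothetical helper: keep paths ending in <date>/<3 digits>/raw_session.flag."""
    return [p for p in paths if SESSION_FLAG.search(PurePosixPath(p).as_posix())]


candidates = [
    # Regular session folder: kept.
    '/mnt/s0/Data/Subjects/ZFM-08652/2025-07-02/002/raw_session.flag',
    # Flag file outside a date/number folder: dropped.
    '/mnt/s0/Data/Subjects/junk/raw_session.flag',
    # Session number that is not zero-padded to three digits: dropped.
    '/mnt/s0/Data/Subjects/ZFM-08652/2025-07-02/2/raw_session.flag',
]
kept = keep_session_flags(candidates)
print(kept)
```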
Please let me fix this one up. We now have ONE methods for these globs that are more robust (e.g. the session number can be a single digit on whiterussian). It's not so efficient to apply a regex after a glob (effectively regex twice) when you can simply check if the input path is already a session path. Let me fix the integration server first because this task is extremely important so I don't want to merge without ensuring that it's correctly tested and that the tests pass.
Please let me fix this one up.
ofc :) I am not doing anything
It's not so efficient to apply a regex after a glob (effectively regex twice) when you can simply check if the input path is already a session path
ibllib/ibllib/pipes/local_server.py
Line 110 in 78e82df
flag_files = filter(lambda x: is_session_path(x.parent), flag_files)
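One possible shape for the if-else discussed in this thread is sketched below, purely as an illustration. `looks_like_session_path` is a hypothetical local stand-in for a session-path check (it is not the ONE implementation), `find_flag_files` is a hypothetical name, and the fixed-depth glob is again an assumption rather than the pattern in local_server.py.

```python
import re
import tempfile
from pathlib import Path

_DATE = re.compile(r'^\d{4}-\d{2}-\d{2}$')


def looks_like_session_path(path):
    """Hypothetical stand-in check: path ends in <subject>/<date>/<number>."""
    parts = Path(path).parts
    return len(parts) >= 3 and bool(_DATE.match(parts[-2])) and parts[-1].isdigit()


def find_flag_files(root_path):
    """Return raw_session.flag files for a session path or a root of subject folders."""
    root_path = Path(root_path)
    if looks_like_session_path(root_path):
        # Input is already a session: look for the flag directly, no globbing.
        flag = root_path / 'raw_session.flag'
        return [flag] if flag.exists() else []
    # ASSUMPTION: fixed-depth pattern, followed by the same per-file validation.
    candidates = root_path.glob('*/????-??-??/*/raw_session.flag')
    return sorted(f for f in candidates if looks_like_session_path(f.parent))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / 'Subjects'
    session = root / 'ZFM-08652' / '2025-07-02' / '002'
    junk = root / 'ZFM-08652' / '2025-07-02' / 'junk'
    for folder in (session, junk):
        folder.mkdir(parents=True)
        (folder / 'raw_session.flag').touch()

    # Both entry points should resolve to the same single session folder.
    from_root = [f.parent.name for f in find_flag_files(root)]
    from_session = [f.parent.name for f in find_flag_files(session)]

print(from_root, from_session)
```

The branch keeps the cheap fixed-depth glob for the root case and avoids any directory walk when a single session is passed in.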
… possible by docstring), no flag files were found