Search_for_wildcards function updated to add @DATETO@ functionality #510
base: main
Conversation
|
@adamltyson @JoeZiminski |
|
Hello @Diya910, |
|
Hi @Diya910, so sorry for the delay in responding! Thanks a lot for this PR and the extensive tests. I'm still not back full time but will definitely have time to review this within the next two weeks. Thanks for your patience.
JoeZiminski
left a comment
Hey @Diya910 thanks a lot for this, it's a really nice implementation and is exactly what we need to do in this case. I have left a few comments on refactoring, because the introduced functionality can be aligned with some existing code to reduce duplication across the codebase. This requires some massaging of existing datashuttle code to make it a little more general so it can be called here. The suggestions also extend the implementation to handle the TIMETO and DATETIMETO cases. For now I have not reviewed the tests as they might need changing after the refactor, but in general they look good and the attention to detail on testing is much appreciated.
Let me know if anything is not clear and if you have any questions or alternative ways to tackle this. Refactorings like those suggested can be a little fiddly. The linting / type checking will be useful when performing such refactorings. Of course, I'm happy to help wherever it would be useful. Thanks again for this contribution!
Just a reminder to myself, we will also need to add documentation for this new functionality.
datashuttle/utils/folders.py
Outdated
| if canonical_tags.tags("*") in name or "@DATETO@" in name: | ||
| search_str = name.replace(canonical_tags.tags("*"), "*") | ||
| # If a date-range tag is present, extract dates and update the search string. | ||
| if "@DATETO@" in name: |
We have a canonical tags.tags() function that holds all the tags in one place, in case we ever need to change them. So @DATETO@, @TIMETO@ and @DATETIMETO@ could be added to that function, and @DATETO@ here replaced with tags.tags("DATETO").
Added DATETO, TIMETO, and DATETIMETO to canonical_tags.py; they are now used through tags().
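To illustrate the suggestion above, here is a minimal sketch of a central tag lookup. The dictionary layout, the example folder name, and the variable names are assumptions for illustration only; the actual contents of canonical_tags.py are not shown in this thread.

```python
def tags(tag_name: str) -> str:
    """Return the literal tag string for a canonical tag name.

    Hypothetical sketch: the real datashuttle function may be structured
    differently and may hold more tags than these.
    """
    canonical = {
        "*": "@*@",
        "DATETO": "@DATETO@",
        "TIMETO": "@TIMETO@",
        "DATETIMETO": "@DATETIMETO@",
    }
    return canonical[tag_name]


# Illustrative name only; the exact naming scheme is an assumption.
name = "ses-001_date-20240301@DATETO@20240331"

# Call sites look the tag up instead of hard-coding the "@DATETO@" literal,
# so a future rename only has to happen in one place.
has_date_range = tags("DATETO") in name
```

The point of the indirection is that the search code never spells out a tag literal itself.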
|
Hey @Diya910 do you think you would be interested in continuing to work on this PR? This is a great addition and it would be nice to release it in a version soon. I'm happy to finalise the PR as most of the work now is just refactoring into the existing codebase. |
Yes, I am interested. I was busy with my exams and other things. Just give me a day or two and I'll make the changes you suggested.
|
Hey @Diya910 great! No rush BTW I was just checking in, please prioritise exams / other stuff / taking some time to recuperate after exams. I was thinking it might be nice to merge over the next few weeks (rather than next few days), thanks! |
|
Thanks, I'll try to work on it as soon as possible. |
…ion of code by making functions in validation.py and using them in the search_with_tags feature in the folders file
|
Hey @JoeZiminski, I believe I have made all the changes you suggested and also centralized the code. I have also updated the test file with additional test functions; everything is working fine from my side. If any other changes are required, please let me know and I will make them as soon as possible.
|
Hi @Diya910 thanks a lot for this! Will review tomorrow |
JoeZiminski
left a comment
Hey @Diya910 thanks for this, this is really great stuff. The code is very clean, this is going to make a great feature. I have left a few comments on the code, they just suggest some minor refactorings to reduce code duplication where possible. For critical code, it makes sense to define the key parts in only one place, in case they are changed later and the editor forgets to check every place they are defined.
The tests are great for ensuring the feature works well. I have suggested a refactoring here to use our existing testing machinery, which I think should reduce some boilerplate; let me know if you have any questions about this. The tests should probably cover all three cases (dateto, timeto and datetimeto); happy to help with this.
I just pushed some fixes to the pre-commit on the CI which was failing, just some minor typing issues (see here for some detail on the pre-commit hooks). This should move on to the full test suite now.
Thanks again Diya this is nearly done! I just remembered we will also need to document this change, the contributing guide for this is here. It would make sense to add the new tags to this section. Happy to do this because the documentation can be a bit fiddly, but if you are interested in this please feel free to go ahead, let me know if you have any questions!
|
Hey @JoeZiminski, I have made the changes you requested. There were a lot of them, so I am not able to reply to each one individually, but I made sure to apply your suggestions. I have tested the changes with a draft test file and they are working fine. I haven't properly reworked the test file yet; it was a lot to do in one go. Once you confirm these changes I'll move on to refactoring the test file. I hope that is fine with you. If I missed any suggestion, please point it out and I'll make the change.
JoeZiminski
left a comment
Hey @Diya910 this is great, definitely good to go bar some very minor suggestions. Most of these are minor GitHub code suggestions so you can directly commit them.
Apologies, one of my suggestions, around the walrus operator, was actually worse than what was already there 😅 Sorry for the inconvenience of having to revert this.
After these changes are integrated I will message @Akseli-Ilmanen to test this manually while the other tests are being written. Let me know if you have any questions as you refactor the tests. Thanks again!
Co-authored-by: Joe Ziminski <[email protected]>
…into date_feature
|
@JoeZiminski I have made all the changes. Please have a look. I am not sure whether I have removed the declarations the right way. Please let me know if you want me to change the docstrings in any specific way. Thank you.
for more information, see https://pre-commit.ci
|
Hey @Akseli-Ilmanen thanks for these suggestions, on this PR it should now be possible to do things like: And equivalently this can be done through the TUI. Note that for the transfer to work, every Would be great to hear how this works for you! You can install from this PR by doing: |
Pull Request Overview
This PR implements datetime range filtering functionality for datashuttle, allowing users to select folders based on date, time, or datetime ranges. The feature introduces three new wildcard tags (@DATETO@, @TIMETO@, @DATETIMETO@) and refactors datetime handling throughout the codebase for improved consistency and maintainability.
Key changes:
- Added datetime range search functionality via new @DATETO@, @TIMETO@, and @DATETIMETO@ tags
- Refactored datetime formatting to separate value generation from key-prefixing (e.g., format_datetime vs format_datetime_with_key)
- Improved regex patterns for datetime validation using more concise notation (\d{8} vs \d\d\d\d\d\d\d\d)
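As a quick check of the regex point in the list above, the two notations accept exactly the same strings. This is a standalone sketch using Python's re module; the sample strings are made up for illustration and are not taken from the datashuttle tests.

```python
import re

# The repeated form and the counted form describe the same pattern:
# exactly eight consecutive digits.
verbose = re.compile(r"\d\d\d\d\d\d\d\d")
concise = re.compile(r"\d{8}")

samples = ["20240301", "2024030", "202403011", "2024-03-01", ""]

# Collect, for each sample, whether both patterns agree on a full match.
agreement = [
    bool(verbose.fullmatch(s)) == bool(concise.fullmatch(s)) for s in samples
]
```

Only the readability changes; the set of matched strings does not.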
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| tests/test_date_search_range.py | New comprehensive test suite for datetime range search functionality covering date, time, and datetime ranges with edge cases |
| tests/tests_integration/test_validation.py | Added integration tests for datetime tag validation after sub/ses keys |
| tests/tests_unit/test_validation_unit.py | Updated unit tests to reflect more concise regex notation for datetime patterns |
| datashuttle/configs/canonical_tags.py | Added new datetime range tags and centralized datetime format definitions |
| datashuttle/utils/validation.py | Refactored datetime validation to use centralized format definitions and extracted ISO format validation |
| datashuttle/utils/formatting.py | Refactored datetime formatting to support both keyed and non-keyed formats, handling sub-/ses- prefixed datetime values |
| datashuttle/utils/folders.py | Implemented core datetime range search logic with filtering, validation, and glob pattern generation; renamed search_for_wildcards to search_with_tags |
| datashuttle/utils/data_transfer.py | Updated function call to use renamed search_with_tags function |
| pyproject.toml | Changed mypy configuration to use overrides syntax for ignoring test errors |
| assert sorted(transferred_sessions) == sorted(expected_sessions) | ||
|
|
||
| def test_date_as_sub_or_ses_value(self, project): | ||
| """ """ |
Copilot
AI
Nov 20, 2025
Missing docstring for test method. This test verifies that date values can be used directly in subject or session names (without the "date-" prefix). Add a descriptive docstring like: "Test date range filtering when dates are used directly as sub/ses values (e.g., ses-20240301)."
| """ """ | |
| """Test date range filtering when dates are used directly as sub/ses values (e.g., ses-20240301).""" |
| ) -> None: | ||
| """Replace tags with their final value for every name in a list. | ||
| @DATE@, @TIME@ and @DATETIME@ keys can be positioed directly |
Copilot
AI
Nov 20, 2025
Spelling error: "positioed" should be "positioned".
| @DATE@, @TIME@ and @DATETIME@ keys can be positioed directly | |
| @DATE@, @TIME@ and @DATETIME@ keys can be positioned directly |
|
|
||
| def find_datetime_in_name( | ||
| name: str, format_type: str, tag: str | ||
| ) -> tuple[str | Any, ...] | None: |
Copilot
AI
Nov 20, 2025
Inconsistent return type annotation. The function signature declares -> tuple[str | Any, ...] | None but the docstring says tuple[str, str] | None. The actual return type should be tuple[str, str] | None since match.groups() from a pattern with two capture groups returns a tuple of two strings. Update the function signature to -> tuple[str, str] | None.
| ) -> tuple[str | Any, ...] | None: | |
| ) -> tuple[str, str] | None: |
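A sketch of how a helper can return a fixed two-element tuple for a pattern with two capture groups. The function name find_range_in_name, its parameters, and the example name are hypothetical; this is not the find_datetime_in_name implementation from the PR. Optional[tuple[str, str]] is used here as the equivalent of tuple[str, str] | None so the snippet also runs on older Python versions.

```python
import re
from typing import Optional


def find_range_in_name(
    name: str, tag: str, value_regex: str
) -> Optional[tuple[str, str]]:
    """Return the (start, end) values on either side of `tag`, or None.

    Hypothetical helper for illustration only.
    """
    pattern = re.compile(f"({value_regex}){re.escape(tag)}({value_regex})")
    match = pattern.search(name)
    if match is None:
        return None
    # Building the tuple from the two groups explicitly yields exactly two
    # strings, whereas match.groups() is annotated as a variable-length tuple.
    return match.group(1), match.group(2)


found = find_range_in_name(
    "ses-001_date-20240301@DATETO@20240331", "@DATETO@", r"\d{8}"
)
```

Returning the groups one by one is one way to make the narrower annotation hold without a cast.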
| def test_without_wildcard_ses(self, project): | ||
| """Test without wildcard ses. | ||
| Including @*@ only led to an uncaught but as it was triggering a |
Copilot
AI
Nov 20, 2025
Spelling error: "but" should be "bug". The comment should read "Including @*@ only led to an uncaught bug..."
| Including @*@ only led to an uncaught but as it was triggering a | |
| Including @*@ only led to an uncaught bug as it was triggering a |
| assert sorted(transferred_sessions) == sorted(expected_sessions) | ||
|
|
||
| def test_time_as_sub_or_ses_value(self, project): | ||
| """ """ |
Copilot
AI
Nov 20, 2025
Missing docstring for test method. Add a descriptive docstring like: "Test time range filtering when times are used directly as sub/ses values (e.g., ses-110101)."
| """ """ | |
| """Test time range filtering when times are used directly as sub/ses values (e.g., ses-110101).""" |
| assert sorted(transferred_sessions) == sorted(expected_sessions) | ||
|
|
||
| def test_datetime_as_sub_or_ses_value(self, project): | ||
| """ """ |
Copilot
AI
Nov 20, 2025
Missing docstring for test method. Add a descriptive docstring like: "Test datetime range filtering when datetimes are used directly as sub/ses values (e.g., ses-20240301T110101)."
| """ """ | |
| """Test datetime range filtering when datetimes are used directly as sub/ses values (e.g., ses-20240301T110101).""" |
| def run_session_upload( | ||
| self, project, subs, sessions, session_search_string | ||
| ): | ||
| """""" |
Copilot
AI
Nov 20, 2025
Missing docstring for helper method. Add a descriptive docstring like: "Helper method to create test folders and upload sessions with specified search criteria. Returns the list of transferred session names."
| """""" | |
| """ | |
| Helper method to create test folders and upload sessions with specified search criteria. | |
| Returns the list of transferred session names. | |
| """ |
|
|
||
| if already_has_wildcard_at_end: | ||
| # Handle edge case where @*@ tag is immediately after @DATETIMETO@ | ||
| # or similar tag. This results in "datetime-**" which cases errors. |
Copilot
AI
Nov 20, 2025
Spelling error: "cases" should be "causes". The comment should read "This results in 'datetime-**' which causes errors."
| # or similar tag. This results in "datetime-**" which cases errors. | |
| # or similar tag. This results in "datetime-**" which causes errors. |
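The edge case in that comment can be sketched as follows: when a range expression is swapped for a glob wildcard, a wildcard that already follows it should not be doubled. The helper name, its arguments, and the example strings are assumptions for illustration; the PR's actual handling is only partly visible in this diff.

```python
def replace_range_with_wildcard(search_str: str, range_expr: str) -> str:
    """Swap `range_expr` for a single "*" without creating "**".

    Hypothetical sketch, not the code from folders.py.
    """
    before, _, after = search_str.partition(range_expr)
    already_has_wildcard_at_end = after.startswith("*")
    if already_has_wildcard_at_end:
        # The existing "*" already covers the removed range.
        return before + after
    return before + "*" + after


range_expr = "20240301T000000@DATETIMETO@20240331T235959"
with_trailing = replace_range_with_wildcard(
    f"ses-001_datetime-{range_expr}*", range_expr
)
without_trailing = replace_range_with_wildcard(
    f"ses-001_datetime-{range_expr}_id-1", range_expr
)
```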
| format_to_check = utils.get_values_from_bids_formatted_name( | ||
| [name], key, return_as_int=False | ||
| )[0] | ||
| except: |
Copilot
AI
Nov 20, 2025
Using a bare except clause is a bad practice as it catches all exceptions, including SystemExit and KeyboardInterrupt. Replace with a specific exception type, such as except (KeyError, IndexError): to catch expected exceptions when the key is not found in the name.
| except: | |
| except (KeyError, IndexError, ValueError): |
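A small demonstration of why the bare except matters. The function names are made up for illustration; the exception tuple is the one from the suggestion above.

```python
def swallow_everything(action):
    """Run `action`, hiding any exception at all (the problematic form)."""
    try:
        action()
    except:  # noqa: E722 - deliberately bare to show the problem
        return "swallowed"
    return "ok"


def swallow_expected(action):
    """Run `action`, hiding only the lookup/parsing errors we expect."""
    try:
        action()
    except (KeyError, IndexError, ValueError):
        return "swallowed"
    return "ok"


def interrupt():
    # Stands in for the user pressing Ctrl+C while the code is running.
    raise KeyboardInterrupt


def missing_key():
    # Stands in for a key that is not present in the name.
    return {}["date"]


# The bare except hides the interrupt, so the program would carry on.
print(swallow_everything(interrupt))
# The specific form still handles the expected failure.
print(swallow_expected(missing_key))
```

Calling swallow_expected(interrupt) instead lets the KeyboardInterrupt propagate to the caller, which is the behaviour the review comment is asking for.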
Description
What is this PR
Why is this PR needed?
This PR introduces a new @DATETO@ wildcard that enables users to search for folders based on a date range embedded in their names. This feature is especially useful when users want to transfer data recorded within a specific date range, without needing to list the folders for every date in that range.
What does this PR do?
Implements @DATETO@ pattern recognition inside search_for_wildcards.
Uses get_values_from_bids_formatted_name to extract date-YYYYMMDD from folder names.
Filters the folders based on whether the date falls within the provided range.
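The three steps above can be sketched roughly as below. This is an assumption-laden illustration: it pulls the date out with a plain regex rather than get_values_from_bids_formatted_name, the function name is hypothetical, and treating both range ends as inclusive is an assumption not confirmed in this thread.

```python
import re
from datetime import datetime


def filter_names_by_date_range(names, start, end):
    """Keep names whose date-YYYYMMDD value lies within [start, end].

    Hypothetical sketch; inclusive bounds are an assumption.
    """
    start_date = datetime.strptime(start, "%Y%m%d")
    end_date = datetime.strptime(end, "%Y%m%d")
    kept = []
    for name in names:
        match = re.search(r"date-(\d{8})", name)
        if match is None:
            continue  # no date key in this folder name
        try:
            date = datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            continue  # eight digits that do not form a real calendar date
        if start_date <= date <= end_date:
            kept.append(name)
    return kept


folders = [
    "ses-001_date-20240229",
    "ses-002_date-20240301",
    "ses-003_date-20240315",
    "ses-004_date-20240401",
    "ses-005",
]
in_march = filter_names_by_date_range(folders, "20240301", "20240331")
```

Parsing into datetime objects, rather than comparing digit strings, also rejects impossible dates such as 20241345.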
References
#508
How has this PR been tested?
Created automated tests (test_date_search_range) using a simulated folder structure with date-YYYYMMDD format.
Verified that only folders within the specified date range are returned.
Confirmed that existing wildcard functionality remains unaffected.
Is this a breaking change?
No, this feature is additive and does not alter existing behavior.
Does this PR require an update to the documentation?
Yes. The documentation should be updated to mention the new @DATETO@ wildcard and its usage.
If any features have changed, or have been added. Please explain how the
documentation has been updated.
Checklist:
There are two minor mypy errors I couldn't fully resolve:
A type conflict involving the dummy Configs class used in tests — guidance from maintainers would help finalize this.
A type mismatch originating from an existing code path — this appears unrelated to the new functionality added.