Conversation

@Diya910 Diya910 commented Apr 13, 2025

Before submitting a pull request (PR), please read the contributing guide.

Please fill out as much of this template as you can, but if you have any problems or questions, just leave a comment and we will help out :)

Description

What is this PR

  • [ ] Bug fix
  • [x] Addition of a new feature
  • [ ] Other

Why is this PR needed?
This PR introduces a new @DATETO@ wildcard that enables users to search for folders based on a date range embedded in their names. This feature is especially useful when users want to transfer data recorded within a specific date range, without needing to create folders for every date in that range.

What does this PR do?
Implements @DATETO@ pattern recognition inside search_for_wildcards.

Uses get_values_from_bids_formatted_name to extract date-YYYYMMDD from folder names.

Filters the folders based on whether the date falls within the provided range.
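
To illustrate the behaviour described above, here is a minimal, self-contained sketch (the `date-YYYYMMDD` convention and the `@DATETO@` tag are from the PR; the helper name and internals are hypothetical, not the actual datashuttle code):

```python
import re
from datetime import datetime

def filter_folders_by_date_range(folder_names, search_name):
    """Keep folders whose date-YYYYMMDD value falls inside the range
    written as 'YYYYMMDD@DATETO@YYYYMMDD' in search_name."""
    match = re.search(r"(\d{8})@DATETO@(\d{8})", search_name)
    if not match:
        return folder_names  # no range tag: nothing to filter
    start = datetime.strptime(match.group(1), "%Y%m%d")
    end = datetime.strptime(match.group(2), "%Y%m%d")
    kept = []
    for name in folder_names:
        date_match = re.search(r"date-(\d{8})", name)
        if date_match:
            value = datetime.strptime(date_match.group(1), "%Y%m%d")
            if start <= value <= end:  # boundaries inclusive
                kept.append(name)
    return kept

folders = [
    "ses-001_date-20240301",
    "ses-002_date-20240315",
    "ses-003_date-20240501",
]
print(filter_folders_by_date_range(folders, "ses-001_20240301@DATETO@20240401"))
# ['ses-001_date-20240301', 'ses-002_date-20240315']
```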

References

#508

How has this PR been tested?

Created automated tests (test_date_search_range) using a simulated folder structure with date-YYYYMMDD format.

Verified that only folders within the specified date range are returned.

Confirmed that existing wildcard functionality remains unaffected.

Is this a breaking change?

No, this feature is additive and does not alter existing behavior.

Does this PR require an update to the documentation?

Yes. The documentation should be updated to mention the new @Dateto@ wildcard and its usage.

If any features have changed or have been added, please explain how the
documentation has been updated.

Checklist:

  • [x] The code has been tested locally
  • [x] Tests have been added to cover all new functionality
  • [ ] The documentation has been updated to reflect any changes
  • [ ] The code has been formatted with pre-commit

There are two minor mypy errors I couldn't fully resolve:
A type conflict involving the dummy Configs class used in tests — guidance from maintainers would help finalize this.
A type mismatch originating from an existing code path — this appears unrelated to the new functionality added.


Diya910 commented Apr 17, 2025

@adamltyson @JoeZiminski
Is there any update on the pull request? Your feedback would be really helpful.

@sumana-2705
Contributor

Hello @Diya910,
The changes are looking great. The review process might be a little delayed since the team is currently a bit busy. In the meantime, it might be a good idea to take a look at the documentation as well, in the transfer_data.md file :)

@JoeZiminski
Member

Hi @Diya910 so sorry for the delay in response! thanks a lot for this PR and the extensive tests. I'm still not back full time but will definitely have time to review this within the next two weeks. Thanks for your patience

Member

@JoeZiminski JoeZiminski left a comment

Hey @Diya910 thanks a lot for this, it's a really nice implementation and is exactly what we need to do in this case. I have left a few comments on refactoring; this is because the introduced functionality can be aligned with some existing code to reduce duplication across the codebase. This requires some massaging of existing datashuttle code to make it a little more general so it can be called here. The suggestions also extend the implementation to handle the TIMETO and DATETIMETO case. For now I have not reviewed the tests as they might need changing after the refactor, but in general they look good and the attention to detail on testing is much appreciated.

Let me know if anything is not clear and if you have any questions or alternative ways to tackle this. Refactorings like those suggested can be a little fiddly. The linting / type checking will be useful when performing such refactorings. Of course, I'm happy to help wherever it would be useful. Thanks again for this contribution!

Just a reminder to myself, we will also need to add documentation for this new functionality.

name = name.replace(canonical_tags.tags("*"), "*")

matching_names: List[str]
if canonical_tags.tags("*") in name or "@DATETO@" in name:
Member

I think we can split this case. At present, if the name contains both a wildcard and a date range, the search_str generated with search_str = name.replace(canonical_tags.tags("*"), "*") will be overwritten by search_str = re.sub(r"\d{8}@DATETO@\d{8}", "date-*", name):

search_str = name
if canonical_tags.tags("*") in name:
    search_str = search_str.replace(canonical_tags.tags("*"), "*")

if "@DATETO@" in search_str:
    ... date replacement code

This is not very nice as it is constantly mutating search_str, which can be difficult to debug. However, I think the problem at hand calls for this and it is the neatest way to indicate the intention. We can leave a comment to explain.
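
A runnable sketch of this two-step approach, with the wildcard tag written literally as "@*@" in place of canonical_tags.tags("*") (the helper name is illustrative):

```python
import re

def build_search_str(name):
    # Step 1: replace the wildcard tag (written literally as "@*@" here,
    # standing in for canonical_tags.tags("*")) with a glob "*".
    search_str = name
    if "@*@" in search_str:
        search_str = search_str.replace("@*@", "*")
    # Step 2: only now collapse the date range into "date-*", so the
    # wildcard replacement from step 1 is not overwritten.
    if "@DATETO@" in search_str:
        search_str = re.sub(r"\d{8}@DATETO@\d{8}", "date-*", search_str)
    return search_str

print(build_search_str("ses-@*@_20240301@DATETO@20240401"))
# ses-*_date-*
```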

Author

Reorganized the code to handle wildcards and datetime tags separately

if canonical_tags.tags("*") in name or "@DATETO@" in name:
search_str = name.replace(canonical_tags.tags("*"), "*")
# If a date-range tag is present, extract dates and update the search string.
if "@DATETO@" in name:
Member

We have a canonical_tags.tags() function that contains all the tags (just in case we change them or some other problem that requires their editing arises). So @DATETO@, @TIMETO@ and @DATETIMETO@ could be added to that function, and @DATETO@ here replaced with canonical_tags.tags("DATETO")

Author

Added DATETO, TIMETO, and DATETIMETO to canonical_tags.py and using them through tags()

search_str = name.replace(canonical_tags.tags("*"), "*")
# If a date-range tag is present, extract dates and update the search string.
if "@DATETO@" in name:
m = re.search(r"(\d{8})@DATETO@(\d{8})", name)
Member

This is nice. We have some validation code for ISO formats here; however, that function works on a list of names and returns in a slightly strange format (which makes sense in the context of the validation functions).

I think to centralise this we can do a few refactorings:

Move this code block:

    formats = {
        "datetime": "%Y%m%dT%H%M%S",
        "time": "%H%M%S",
        "date": "%Y%m%d",
    }

to configs/canonical_tags and wrap it in a function like get_datetime_format based on the format.

Then this code in datetimes_in_iso_format can be factored into a separate function validate_datetime

        strfmt = formats[key]

        try:
            datetime.strptime(format_to_check, strfmt)
            error_message = []
        except ValueError:
            error_message = [get_datetime_error(key, name, strfmt, path_)]

except it can just return True / False and we can leave the error message stuff to datetimes_in_iso_format.

Now, we can call this new function from here. I think it is also worth having a quick function to get the expected number of values (8 for date) for use above, instead of hard-coding. This could be like:

def get_expected_num_datetime_values(format):
    format_str = get_datetime_format(format)
    today = datetime.now()
    return len(today.strftime(format_str))

We can then pass this "date", "time" or "datetime" to get the values.

This section would then look something like:

# somewhere we need to check that @DATETO@, @TIMETO@ and @DATETIMETO@ are used exclusively
format = tag = None
if tags.tags("DATETO") in search_str:
    format = "date"
    tag = tags.tags("DATETO")
elif tags.tags("TIMETO") in search_str:
    format = "time"
    tag = tags.tags("TIMETO")
elif tags.tags("DATETIMETO") in search_str:
    format = "datetime"
    tag = tags.tags("DATETIMETO")

num_values = get_expected_num_datetime_values(format)
full_tag_regex = fr"(\d{{{num_values}}}){re.escape(tag)}(\d{{{num_values}}})"
match = re.search(full_tag_regex, search_str)

if not match:
    ... raise (use utils.raise_error and raise a NeuroBlueprint error)

start_str, end_str = match.groups()

start_timepoint = datetime.strptime(start_str, get_datetime_format(format))
end_timepoint = datetime.strptime(end_str, get_datetime_format(format))

if not validate_datetime(start_str, format):
    ... raise error

<same for end_str>

search_str = re.sub(full_tag_regex, f"{format}-*", search_str)

I think all of this could be isolated to a new function such that in this function we just have something like:

format = tag = None
if tags.tags("DATETO") in search_str:
    format = "date"
    tag = tags.tags("DATETO")
elif tags.tags("TIMETO") in search_str:
    format = "time"
    tag = tags.tags("TIMETO")
elif tags.tags("DATETIMETO") in search_str:
    format = "datetime"
    tag = tags.tags("DATETIMETO")

search_str = format_and_validate_datetime_search_str(search_str, format, tag)
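
To make the sketch above concrete, here is a runnable, self-contained version for the date case (function names follow the review suggestions; the utils.raise_error / NeuroBlueprint error handling is replaced with a plain ValueError purely for illustration):

```python
import re
from datetime import datetime

def get_datetime_format(format_type):
    # Canonical strptime formats, as proposed for configs/canonical_tags.
    return {
        "datetime": "%Y%m%dT%H%M%S",
        "time": "%H%M%S",
        "date": "%Y%m%d",
    }[format_type]

def get_expected_num_datetime_values(format_type):
    # Derive the expected width (8 for "date") from today's date
    # instead of hard-coding it.
    return len(datetime.now().strftime(get_datetime_format(format_type)))

def format_and_validate_datetime_search_str(search_str, format_type, tag):
    # Collapse "<start><tag><end>" into a "<format>-*" glob, validating
    # both endpoints with strptime on the way.
    num_values = get_expected_num_datetime_values(format_type)
    full_tag_regex = rf"(\d{{{num_values}}}){re.escape(tag)}(\d{{{num_values}}})"
    match = re.search(full_tag_regex, search_str)
    if not match:
        raise ValueError(f"No valid {format_type} range found in: {search_str}")
    for value in match.groups():
        datetime.strptime(value, get_datetime_format(format_type))  # raises on bad value
    return re.sub(full_tag_regex, f"{format_type}-*", search_str)

print(format_and_validate_datetime_search_str(
    "ses-*_20240301@DATETO@20240401", "date", "@DATETO@"
))
# ses-*_date-*
```

Note the digit-only pattern covers the date and time cases; as discussed further down the thread, the datetime case needs a pattern that also matches the literal T.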

Author

Moved datetime formats to canonical_tags.py and created get_datetime_format function for centralized access

)[0]

# If a date-range tag was provided, further filter the results.
if "@DATETO@" in name:
Member

This is nice, and at this point we know we have validated dates. Ideally the validation should happen immediately before the point of use but in this case, there is no point wasting time searching if the dates are not valid, so it makes sense to do it before. But it is worth leaving a comment to indicate we know the dates are valid at this stage.

# If a date-range tag was provided, further filter the results.
if "@DATETO@" in name:
filtered_names: List[str] = []
for candidate in matching_names:
Member

This is nice, to generalise it a bit more we can have:

if format is not None:
    assert tag is not None, "format and tag should be set together"
    
get_values_from_bids_formatted_name can use `format`, and in the strptime call use the new `get_datetime_format` function

values_list = get_values_from_bids_formatted_name(
[candidate_basename], "date"
)
if not values_list:
Member

I think we can assume this list is not empty because date-* was used to search the names already (?)

Author

Removed unnecessary empty list check since the search pattern ensures valid datetime values

except ValueError:
continue
if start_date <= candidate_date <= end_date:
filtered_names.append(candidate)
Member

I think this entire block could be isolated in a new function that filters a list of names by datetime, just for readability (and it might also be useful in future)
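
A hypothetical sketch of such a function (the name matches what was later added in the PR; the format table and regex are illustrative, not the final datashuttle implementation):

```python
import re
from datetime import datetime

def filter_names_by_datetime_range(names, format_type, start, end):
    """Keep names whose '<format_type>-<value>' key parses to a
    datetime within [start, end] (boundaries inclusive)."""
    formats = {"datetime": "%Y%m%dT%H%M%S", "time": "%H%M%S", "date": "%Y%m%d"}
    strfmt = formats[format_type]
    kept = []
    for name in names:
        match = re.search(rf"{format_type}-([\dT]+)", name)
        if not match:
            continue
        try:
            value = datetime.strptime(match.group(1), strfmt)
        except ValueError:
            continue  # malformed value: skip rather than fail
        if start <= value <= end:
            kept.append(name)
    return kept

names = [
    "ses-001_date-20240301",
    "ses-002_date-20240315",
    "ses-003_date-20240501",
]
print(filter_names_by_datetime_range(
    names, "date", datetime(2024, 3, 1), datetime(2024, 4, 1)
))
# ['ses-001_date-20240301', 'ses-002_date-20240315']
```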

Author

Created filter_names_by_datetime_range() function


JoeZiminski commented Jun 6, 2025

Hey @Diya910 do you think you would be interested in continuing to work on this PR? This is a great addition and it would be nice to release it in a version soon. I'm happy to finalise the PR as most of the work now is just refactoring into the existing codebase.


Diya910 commented Jun 6, 2025

Yes, I am interested. I was busy with my exams and other things. Just allow me a day or two and I'll make the changes you suggested.

@JoeZiminski
Member

Hey @Diya910 great! No rush BTW I was just checking in, please prioritise exams / other stuff / taking some time to recuperate after exams. I was thinking it might be nice to merge over the next few weeks (rather than next few days), thanks!


Diya910 commented Jun 6, 2025

Thanks, I'll try to work on it as soon as possible.

…ion of code by making functions in validation.py and using them in the search_with_tags feature in the folders file

Diya910 commented Jun 15, 2025

Hey @JoeZiminski, I have made all the changes you suggested and also centralised the code. I have also updated the test file with additional test functions; everything is working fine from my side. If any other changes are required, please let me know and I will do them at the earliest.

@JoeZiminski JoeZiminski added this to the v2.8.0 milestone Jun 17, 2025
@JoeZiminski
Member

Hi @Diya910 thanks a lot for this! Will review tomorrow

Member

@JoeZiminski JoeZiminski left a comment

Hey @Diya910 thanks for this, this is really great stuff. The code is very clean; this is going to make a great feature. I have left a few comments on the code; they just suggest some minor refactorings to reduce code duplication where possible. For critical code, it makes sense to define the key parts in only one place, just in case they are changed later and the editor forgets to check all the places they are defined.

The tests are great for ensuring the feature works well. I have suggested a refactoring here to use our existing testing machinery, which I think should reduce some boilerplate; let me know if you have any questions about this. The tests should probably cover all three cases (dateto, timeto and datetimeto), happy to help with this.

I just pushed some fixes to the pre-commit on the CI which was failing, just some minor typing issues (see here for some detail on the pre-commit hooks). This should move on to the full test suite now.

Thanks again Diya this is nearly done! I just remembered we will also need to document this change, the contributing guide for this is here. It would make sense to add the new tags to this section. Happy to do this because the documentation can be a bit fiddly, but if you are interested in this please feel free to go ahead, let me know if you have any questions!

}
return tags[tag_name]


_DATETIME_FORMATS = {
Member

Great, thanks for this. Can this be refactored to be returned from a function rather than a dict with global scope, e.g.:

def get_datetime_formats():
    return {...}

The reason is that _DATETIME_FORMATS becomes a dictionary with global scope across the application, meaning if it is accidentally changed in one part of the code this will propagate everywhere. Wrapping it in a function means the scope is no longer global.

Member

Apologies I see what you did here, that's great. In this case, you can move _DATETIME_FORMATS directly into get_datetime_format

Author

Great suggestion, I will move it into the function


key = next((key for key in formats if key in name), None)
key = next(
(key for key in ["datetime", "time", "date"] if key in name), None
Member

here you can do datetime_keys = list(get_datetime_format().keys()) then

key for key in datetime_keys (just to avoid the re-definition of these keys)

Author

@Diya910 Diya910 Jun 26, 2025

datetime_keys = list(canonical_tags.get_datetime_format())
key = next((key for key in datetime_keys if key in name), None)

Like this??

Member

Oh I see apologies for the confusion, the return value of get_datetime_format is slightly different to that I thought.

What do you think of changing the function to take no arguments and return the entire dictionary, so that it can be indexed (instead of called)? For example format = get_datetime_format(format_type) becomes format = get_datetime_formats()[format_type].

Then above, the signature can be:

datetime_keys = list(canonical_tags.get_datetime_formats().keys())
key = next((key for key in datetime_keys if key in name), None)

This signature is now different to the tags() function, but it is probably a better design because it is more flexible. Now the datetime names are canonically defined in a central place, and we can grab them all or index them as we like. Later on tags() can be changed to follow the same design. Sorry this would be a bit of a pain to change all the calls in your code though. What do you think?
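
A minimal sketch of the proposed signature change (illustrative only):

```python
def get_datetime_formats():
    # Returning a fresh dict keeps the canonical formats out of
    # module-global scope; callers index the result directly.
    return {
        "datetime": "%Y%m%dT%H%M%S",
        "time": "%H%M%S",
        "date": "%Y%m%d",
    }

# Old style:  get_datetime_format("date")
# New style: index the returned dictionary instead.
fmt = get_datetime_formats()["date"]
datetime_keys = list(get_datetime_formats().keys())
print(fmt, datetime_keys)
# %Y%m%d ['datetime', 'time', 'date']
```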

Author

Thanks, that makes a lot of sense! I agree that returning the full dictionary from get_datetime_formats() would make the design cleaner and more flexible — especially for cases like this where we need all the keys. Happy to refactor the calls where needed. Also agree that aligning the design of tags() later on would help keep things consistent across the codebase. Will go ahead with this change!

Member

great!

try:
datetime.strptime(format_to_check, strfmt)
error_message = []
if not validate_datetime(format_to_check, key):
Member

I may be incorrect, but I don't think this needs to be in a try/except block anymore? Would the below work? (The order of the conditional is also reversed to make the positive case first, which is usually slightly more readable):

if validate_datetime(format_to_check, key):
    error_message = []
else:
    error_message = [
        get_datetime_error(
            key,
            name,
            canonical_tags.get_datetime_format(key),
            path_,
        )
    ]    

return error_message


return error_message


def validate_datetime(datetime_str: str, format_type: str) -> bool:
Member

This is a good name but because there are a few similar functions it could be slightly more explicit e.g. datetime_value_str_is_iso_format()

return False


def get_expected_num_datetime_values(format_type: str) -> int:
Member

Because this function get_expected_num_datetime_values and format_and_validate_datetime_search_str are only used in folders.py I think it makes sense to move them there (they could be under a Datetime Tag section or something)

match
): # We know this is true because format_and_validate_datetime_search_str succeeded
start_str, end_str = match.groups()
start_timepoint = datetime.strptime(
Member

These lines can also use the new datetime_object_from_string function

@@ -0,0 +1,217 @@
import glob
Member

This is a well thought out test script that puts a lot of emphasis on realistic tests which is excellent. I think on balance, here it would be easier to use some of the test functionality to actually make a project and some folders, then check these are found as expected. For example here.

The test code might look something like the below. The project fixture is inherited from the BaseTest class and automatically comes with set up and tear down. Thinking about it, you might as well test directly with project.transfer_custom and see that the correct folders are transferred. This will then check every cog in the machine:

class TestDateSearchRange(BaseTest):

    def test_date_search_range(self, project):

        sub_names = ["a list of example subs to test"]
        ses_names = ["a list of example ses to test"]  # it might actually be easier to test the ses and sub case separately

        test_utils.make_and_check_local_project_folders(
            project, "rawdata", sub_names, ses_names, ["behav", "ephys"]
        )

        project.upload_custom(...)  # some search strings

        transferred_subjects = (project.get_central_path() / "rawdata").glob("*")

        # now check that the correct files have been transferred

Member

also see here

Author

When I open the first example, it causes some disruption in the UI and I can't see where you are pointing. Can you please check it and share that again?

Member

Hi @Diya910, I think I accidentally copied from a commit, does this work?

it is the function test_wildcard_transfer in /tests/tests_integration/test_filesystem_transfer.py

assert found_dates == expected_dates


def test_simple_wildcard(temp_project_dir: Path):
Member

I think we can remove this function given this, but if it checks an extra case we can keep it, of course

search_with_tags(cfg, base_folder, local_or_central, [pattern])
assert "Invalid" in str(exc_info.value)


Member

Brilliant idea, this can be adjusted as suggested above but the core test is great

@JoeZiminski JoeZiminski linked an issue Jun 21, 2025 that may be closed by this pull request

Diya910 commented Jul 2, 2025

Hey @JoeZiminski, I have made the changes you required. There were a lot of changes, so I wasn't able to reply to each of them individually, but I made sure to address all of your suggestions. I have tested the changes against a draft test file and they are working fine. I haven't fully reworked the test file yet; it was a lot to do in one go. Once you confirm these changes I'll move ahead with refactoring the test file. I hope that's fine with you. If I missed any suggestion above, please point it out and I'll make those changes.

Member

@JoeZiminski JoeZiminski left a comment

Hey @Diya910 this is great, definitely good to go bar some very minor suggestions. Most of these are minor github code suggestions so you can directly commit them.

Apologies, one of my suggestions was actually worse than what was already there 😅 around the walrus operator. Sorry for the inconvenience of having to revert this.

After these changes are integrated I will message @Akseli-Ilmanen to test this manually while the other tests are being written. Let me know if you have any questions as you refactor the tests. Thanks again!


Diya910 commented Jul 4, 2025

@JoeZiminski I have made all the changes. Please have a look. I am not sure if I have removed the declarations the right way. Please let me know if you want me to change the docstrings in any specific way. Thank you

@JoeZiminski
Member

Hey @Diya910 sure I have merged main into this branch, there were a lot of fiddly conflicts that would be basically impossible to solve if you had not worked on the main branch changes. That's fixed now, I also made some changes related to typing to ensure the linter passes. Let me know if you have any questions!


Diya910 commented Aug 3, 2025

Okay thanks. I'll make the test cases now.


Diya910 commented Aug 14, 2025

@JoeZiminski I have made the test file covering all the test cases, and also used BaseTest this time. Can you please review it so that I can make further changes accordingly. As of now all the test cases are working properly. Thank you

@JoeZiminski
Member

Brilliant! @Diya910 I am teaching this week but will be able to review this next week, thanks!


Diya910 commented Aug 14, 2025

Yeah sure, no worries. I will be happy to work.

Member

@JoeZiminski JoeZiminski left a comment

Hey @Diya910 thanks again for these tests, they are excellent and cover not only the core cases but many useful additional cases. Please see the comments for a couple of suggestions: there is one edge case in which the bounds can be adjusted, and the regexp for the datetime needs to be slightly adjusted too, along with the corresponding tests. Once these changes are made I think all changes to the code will be done! Then just the documentation, I think we only really need a few lines there, and this will be good to go. Thanks again for this great contribution, this is a really cool feature.

strip_start_end_date_from_datetime_tag(search_str, format_type, tag)

# Replace datetime range with wildcard pattern
expected_len = get_expected_datetime_len(format_type)
Member

Only just noticed that for datetime the regex must be [A-Za-z0-9]{15}, whereas \d works for date and time only. I think we can change get_expected_datetime_len to get_datetime_regexp_format and return either "\d{6}" or "[A-Za-z0-9]{15}", then use these directly in the regexp. At the moment, the datetime case fails as the T is not recognised.

Author

There will actually be 3 types, because date will have 8 digits, so "\d{8}". Am I right? Should I change it accordingly?

Author

And one more bug I feel is here: the regex [A-Za-z0-9]{15} only loosely checks the length. If there is a number in place of the T it will still pass, because it is only checking the length.
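
One way to address both points, sketched with the hypothetical get_datetime_regexp_format name from the review, is to return an exact per-format pattern rather than a length:

```python
import re

def get_datetime_regexp_format(format_type):
    # Exact per-format patterns. For datetime this is stricter than
    # [A-Za-z0-9]{15}: a digit in place of the literal "T" is rejected.
    return {
        "date": r"\d{8}",
        "time": r"\d{6}",
        "datetime": r"\d{8}T\d{6}",
    }[format_type]

pattern = get_datetime_regexp_format("datetime")
print(bool(re.fullmatch(pattern, "20240315T120000")))  # True
print(bool(re.fullmatch(pattern, "202403151120000")))  # False: digit where "T" belongs
```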

class TestDateSearchRange(BaseTest):
"""Test date/time range search functionality with real datashuttle projects."""

def test_simple_wildcard_first(self, project):
Member

This is a neat and well-written test, but it can be removed from this module as this functionality is tested elsewhere. Here we can keep only tests related to the date search range.

"rawdata",
sub_names=subs,
ses_names=[
f"ses-{canonical_tags.tags('*')}_datetime-20240315{canonical_tags.tags('*')}",
Member

In this case, the @DATETO@ tag can be used e.g.:

    def test_datetime_range_transfer(self, project):
        """Test that wildcard matching works with datetime-tagged sessions."""
        subs = ["sub-001"]
        sessions = [
            "ses-001_datetime-20240301T080000",
            "ses-002_datetime-20240315T120000",
            "ses-003_datetime-20240401T160000",
            "ses-004_datetime-20240401T160001",
            "ses-005_datetime-20240415T200000",
        ]

        datatypes_used = test_utils.get_all_broad_folders_used(value=False)
        datatypes_used.update({"behav": True})
        test_utils.make_and_check_local_project_folders(
            project, "rawdata", subs, sessions, ["behav"], datatypes_used
        )

        project.upload_custom(
            "rawdata",
            sub_names=subs,
            ses_names=[
                f"ses-{canonical_tags.tags('*')}_20240315T120000{canonical_tags.tags('DATETIMETO')}20240401T160002",
                f"ses-{canonical_tags.tags('*')}_20240415T200000{canonical_tags.tags('DATETIMETO')}20240415T200000",
            ],
            datatype=["all"],
        )

        central_path = project.get_central_path() / "rawdata" / "sub-001"
        transferred_sessions = [ses.name for ses in central_path.glob("ses-*")]

        expected_sessions = [
            "ses-002_datetime-20240315T120000",
            "ses-003_datetime-20240401T160000",
            "ses-004_datetime-20240401T160001",
            "ses-005_datetime-20240415T200000",
        ]

        assert sorted(transferred_sessions) == sorted(expected_sessions)

assert sorted(transferred_subs) == sorted(expected_subs)

@pytest.mark.parametrize("project", ["full"], indirect=True)
def test_download_with_date_range(self, project):
Member

We can remove this test following the suggested changes to the main three tests above

],
datatype=["behav"],
)
assert "before start" in str(exc_info.value)
Member

Maybe the string example could include a few more words, just so as a reader it is clear what the general semantics behind the error message are. Same for Invalid .... below

project, "rawdata", subs, sessions, ["behav"], datatypes_used
)

with pytest.raises(Exception) as exc_info:
Member

Just for consistency with other tests in the codebase, the exc_info can just be called e


def test_subject_level_date_range(self, project):
"""Test date ranges work at the subject level too."""
subs = [
Member

Here it is enough, but for fun and to cover all cases we could make 6 folders:

  1. 2x with date, 2x with time, 2x with datetime

Then in the upload function we can have a list of three names (one date, one datetime, one time) that picks one of the sub names of each type. Then we can assert these three names are found. Also, this function could use the upload_or_download parameterization

]
assert sorted(downloaded_sessions) == sorted(expected_sessions)

def test_edge_case_exact_boundary_dates(self, project):
Member

This is a cool test. When testing, I ran into the problem that if there is just one folder e.g. ses-001_date-2024030 then ses-001_2024030@DATETO@2024030 does not transfer anything because of the <= used above (I left a suggested change there). In this case, maybe we can just test with one folder, and the search string can include only that folder. This then tests that tricky case and also serves as a test for exact boundary dates. We could use 3x folder names (one with date, one with time, one with datetime) as suggested above to cover all cases (then we would just test that the 3x folders are transferred)

Member

For good measure, we might as well also use the upload_or_download parameterization here

"rawdata",
sub_names=subs,
ses_names=[
f"ses-{canonical_tags.tags('*')}_20240315{canonical_tags.tags('DATETO')}20240401"
Member

For good measure, we could use the second tag as "ses-004_date-20240415" and check all three are found (then we have a test case at both boundaries + the middle)

@JoeZiminski
Member

Hey @Diya910, thanks again for this contribution. If it is easier, I am happy to take care of those suggestions, as there is nothing too substantial there. I think after this there are no more changes to the code required, only docs!

@Diya910
Author

Diya910 commented Aug 26, 2025

Hey @JoeZiminski, yes you may do the further changes. I was planning to do them this weekend, but if you want to take charge you may proceed. Thank you

@Diya910
Author

Diya910 commented Sep 2, 2025

Hey @JoeZiminski, I was trying to work on the PR yesterday but there are merge conflicts with the main branch. Can you please resolve them so that I can work on it further?

@JoeZiminski
Member

Hey @Diya910, apologies for the delay in replying. Please do feel free to work on the changes, it's much appreciated! I will sort out the merge now

@JoeZiminski
Member

JoeZiminski commented Sep 4, 2025

Hey @Akseli-Ilmanen, thanks again for this suggestion. This PR has only tests and docs to finalise, so the implementation should be working. It would be great if you could test it out and let us know if it's working well. Cheers!

EDIT:

To use the new feature, you can add a @DATETO@, @TIMETO@ or @DATETIMETO@ surrounded by the time range you want to restrict to e.g. sub-@*@_01012025@DATETO02022025

@Akseli-Ilmanen

Heya.

@Diya910 thanks so much for setting this up.

@JoeZiminski I am a bit new to programming, so I am not sure what's the best way to clone/install this branch and then test the functionality?

Also, I am a big fan of the datashuttle GUI. I was wondering whether the date selection functionality could be integrated there. See below for a little sketch. (I also added something extra about how one could select the subject via a selection tool; this could be really helpful if researchers have many/long subject names that they don't remember by heart.)

[image: sketch of the proposed GUI date-selection and subject-selection tools]

Hope I am not creating too much work for you guys. :D Really appreciate it!

Best
Akseli

@JoeZiminski
Member

Hey @Akseli-Ilmanen thanks for the suggestion! Any new ideas on how to improve the software are much appreciated, I'm glad you are finding the GUI useful. Would you be able to make a new issue with the suggestion above? This is a very cool idea and could be explored in a new PR, and there is a non-official textual date-picker widget that could be used for this.

To test, as you say the best way would be to clone the forked repository, and then check out the feature branch. You can make a new environment so you don't overwrite your existing datashuttle installation. The commands to download the code would be:

git clone https://github.com/Diya910/datashuttle.git
cd datashuttle
pip install -e .
git checkout date_feature

The -e flag tells pip not to install the package to the packages directory (e.g. where packages are located when you do pip show ...) but to use the package as-is in the location you install it from. That way, when you do things like change branch, the installed version of the package is updated. In general this is the best way to install packages for development (even if you are not planning on changing the repository code yourself).

@Akseli-Ilmanen

@JoeZiminski, thanks for the explanation on how to manage installation, very helpful!

Would your example work? I thought that the date has to be specified in YYYYMMDD format?

To use the new feature, you can add a @DATETO@, @TIMETO@ or @DATETIMETO@ surrounded by the time range you want to restrict to e.g. sub-@*@_01012025@DATETO02022025

@Akseli-Ilmanen

@JoeZiminski
Also, I got a bit confused about how to use wildcards. I tried a number of different things now, but none are working. Could you point out what's incorrect about:

from datashuttle import DataShuttle


project = DataShuttle("AI_data")
project.upload_custom(
    top_level_folder="derivatives",
    sub_names="all_sub",
    ses_names="ses-@*@_20250309@DATETO20250310",
    datatype="behav")

Also, when reading the documentation, I found it quite confusing in which ways @*@ can or cannot be used. Maybe it would be helpful to have a tutorial page with lots of examples of how one can and cannot use it. I think more examples would definitely help to understand the functionality.

@JoeZiminski
Member

Hi @Akseli-Ilmanen, I think it should work, except it's missing the final @ at the end of @DATETO@, e.g.:

from datashuttle import DataShuttle


project = DataShuttle("AI_data")
project.upload_custom(
    top_level_folder="derivatives",
    sub_names="all_sub",
    ses_names="ses-@*@_20250309@DATETO@20250310",
    datatype="behav"
)

Let me know how this goes! I also just pushed a fix (unrelated to this, but it would have revealed itself if you changed ses-@*@_... to ses-001_...). You can git pull in the datashuttle repo, and if you installed with pip install -e . the changes will be incorporated next time you run datashuttle. You can check with git log; you should see the most recent commit is Cover DATETO etc. cases in check_and_format_names...

Thanks for the suggestion on more examples that's a nice idea 🚀 , I'll raise a PR over the next couple of days, will keep you updated.

@Diya910 I pushed a fix in e2940b2 that adds the new tags to a list of exceptions to ensure they are not validated (they would otherwise fail validation because they include a non-alphanumeric value), and added some new tests to cover these cases.
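The shape of that fix can be sketched like this (tag list and function name are hypothetical; see commit e2940b2 for the actual change):

```python
# Transfer-only tags contain "@", which the NeuroBlueprint name
# validation would otherwise reject as a special character, so names
# containing them are exempted from that check.
TRANSFER_ONLY_TAGS = ("@*@", "@DATETO@", "@TIMETO@", "@DATETIMETO@")

def is_exempt_from_validation(name):
    # True if the name contains any transfer-only tag.
    return any(tag in name for tag in TRANSFER_ONLY_TAGS)
```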

@Akseli-Ilmanen

Ah oopsie. I fixed the @, but still get these errors:

When I run:

from datashuttle import DataShuttle

project = DataShuttle("AI_data")
project.upload_custom(
    top_level_folder="derivatives",
    sub_names="all_sub",
    ses_names="ses-@*@_20250309@DATETO@20250310",
    datatype="behav"
)

I get:

The sub names to transfer are: ['sub-01_id-Ivy', 'sub-02_id-Poppy', 'sub-03_id-Freddy']
The ses names to transfer are: []
The ses names to transfer are: []
The ses names to transfer are: []
No files included. None transferred.

And if I run with the ses-000 modification:

from datashuttle import DataShuttle


project = DataShuttle("AI_data")
project.upload_custom(
    top_level_folder="derivatives",
    sub_names="all_sub",
    ses_names="ses-000_20250309@DATETO@20250310",
    datatype="behav"
)

I get this error message:

The sub names to transfer are: ['sub-01_id-Ivy', 'sub-02_id-Poppy', 'sub-03_id-Freddy']
---------------------------------------------------------------------------
NeuroBlueprintError                       Traceback (most recent call last)
Cell In[11], line 5
      1 from datashuttle import DataShuttle
      4 project = DataShuttle("AI_data")
----> 5 project.upload_custom(
      6     top_level_folder="derivatives",
      7     sub_names="all_sub",
      8     ses_names="ses-000_20250309@DATETO@20250310",
      9     datatype="behav"
     10 )

File ~\Documents\Akseli\Code\datashuttle\datashuttle\utils\decorators.py:66, in check_configs_set.<locals>.wrapper(*args, **kwargs)
     60     log_and_raise_error(
     61         "Must set configs with make_config_file() "
     62         "before using this function.",
     63         ConfigError,
     64     )
     65 else:
---> 66     return func(*args, **kwargs)

File ~\Documents\Akseli\Code\datashuttle\datashuttle\utils\decorators.py:88, in check_is_not_local_project.<locals>.wrapper(*args, **kwargs)
     81     log_and_raise_error(
     82         "This function cannot be used for a local-project. "
     83         "Set connection configurations using `update_config_file` "
...
     80 """
     81 ds_logger.close_log_filehandler()
---> 82 raise exception(message)

NeuroBlueprintError: SPECIAL_CHAR: The name: ses-000_20250309@DATETO@20250310, contains characters which are not alphanumeric, dash or underscore.

@JoeZiminski
Member

Hey @Akseli-Ilmanen, thanks a lot for looking into this. I wonder if the code is definitely picking up the most recent version (in theory, those commands should work).

The code is run from ~\Documents\Akseli\Code\datashuttle. Could you please do git log in this directory and confirm that the most recent commit appears as:

[image: screenshot of the expected git log output]

Similarly could you also do the below before from datashuttle import DataShuttle:

import datashuttle
print(datashuttle.__file__)

to triple check that the location of the used package is correct (I was struggling with this problem myself earlier today).

Also, just to take a look at the session folders to transfer, could you please do in the rawdata folder:

find . -type d -name "ses-*" -exec ls -l {} \;

and copy the contents here (assuming you are happy to share the folders and file names in this path).

Thanks for taking the time to test this!

@Akseli-Ilmanen

Hi @JoeZiminski,

I think I initially pulled incorrectly; now I do get the Tue 9 Sept update in my git log.
[images: screenshots of git log and the printed package location]

find . -type d -name "ses-*" -exec ls -l {} \; — this I was not able to make work; is it a command I would write in my terminal in the subject folder? Here are some file explorer screenshots, maybe they suffice too.

[images: file explorer screenshots of the session folders]

Note that we decided to name all our session ids ses-000. We did this because, with our existing recording set-up and backup system, we ran into some inconsistencies when different recording machines (ephys vs video) had different numbers of session folders, and we wanted to use our own backup method to move them to a centralized storage location. E.g. from the ephys machine the session would be ses-003_dateXYZ and from the video machine it would be ses-004_dateXYZ. Although they belong to the same recording session _dateXYZ, it's hard to coordinate them getting the same session number across machines that are not linked. If you have any thoughts on that, I would be curious to hear.

@JoeZiminski
Member

Hey @Akseli-Ilmanen, thanks for sharing this! No worries about that command (yes, that's correct, it was to run in the system terminal from within the rawdata folder; apologies I was not clear), but the images work well. I guess since updating (maybe do pip install -e .[dev] from within datashuttle again, just to be sure) the command is still not working?

I think it might be because of the trailing _01 and _02 on the folder names, technically this is not NeuroBlueprint because it is not part of a key-value pair and so the date detection might be getting confused. Would it be possible to add a key to this value e.g. condition-01? (out of interest what does it represent?)

That's a good point about syncing across different machines, it's a difficult issue to work around (#373). One approach is to manually set the session number in a way that makes sense to you (e.g. if an animal has a behaviour session, then a behaviour + ephys session, the first session is ses-001 (in the behav folder) and the second session is ses-002 (in the behav and ephys folders)). In this case the session id represents the animal's overall session, rather than the ephys or behav session specifically.

Alternatively, we recently introduced a PR #575 that is more flexible on the sub- and ses- keys, allowing any alphanumeric character (e.g. ses-20250101T121212 would be allowed). Currently, date, time and datetime are only checked if they are under the date/time/datetime key. However, other users have expressed interest in having the datetime as the ses- value e.g. ses-20250101. We could in theory extend the date functionality to the sub- or ses- key. In this case, the date/time/datetime key would be preferred, but we can also check the sub- and ses- key, and if it is a valid date/time/datetime, use it to filter sessions in this PR. You could do something like ses-20250309@DATETO@20250310.

Do you think either of these approaches would help with this issue?
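If that extension were implemented, the detection step might look something like this sketch (purely illustrative; not code from this PR, and the function name is made up):

```python
from datetime import datetime

# Formats a bare ses-/sub- value could be checked against, in order of
# specificity: datetime first, then date, then time.
CANDIDATE_FORMATS = ("%Y%m%dT%H%M%S", "%Y%m%d", "%H%M%S")

def parse_bare_ses_value(value):
    """Return (format, parsed datetime) if the bare value is a valid
    datetime, date or time, else None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return fmt, datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None
```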

@Akseli-Ilmanen

Akseli-Ilmanen commented Sep 12, 2025

Hi, @JoeZiminski Yes, I think the trailing _01 are the problem. When I find some time, I will have to rename some folders. We added them ad-hoc, when we realized that sometimes we have more than 1 session per day.

Yeah, cross-machine synchronization seems tricky. Something like a "sync metadata" (#373) solution across machines would be great.

The problem with this approach is that users have to keep track of which session number have been created on all PCs. It's easy to keep track of whether one is doing session 1 or 2 during one day, but more tricky for a user to keep track that we are on session 32 on the behav PC vs session 16 on the ephys machine or so.

the first session is ses-001 (in behav folder) and the second session is ses-002 (in behav and ephys folder). In this case the session id represents the animals overall session, rather than the ephys or behav session specifically.

Yes, ses-20250101T121212 with @DATETO@ functionality would be great. I think moving forward for our lab, I would like to do the following: ses-000_date-20250407_01 -> ses-20250407_n-01. Would this adhere to NeuroBlueprint? When a user starts a session, they just have to specify n-01 or n-02, and the date is assumed to be today's date. So, I am looking forward to the ses-20250309@DATETO@20250310 functionality

I think doing ses-000_date-20250407_01 -> ses-20250407_n-01 is something I can very easily do programmatically, and I don't need to manually go renaming folders. But in case you recommend a different solution, I am very open to suggestions!
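The rename described above could be done programmatically with a sketch like this (the regex pattern and helper name are assumptions based on the example folder names):

```python
import re

def planned_name(name):
    # Map e.g. "ses-000_date-20250407_01" -> "ses-20250407_n-01";
    # names that don't match the pattern are returned unchanged.
    match = re.fullmatch(r"ses-\d+_date-(\d{8})_(\d+)", name)
    if match is None:
        return name
    return f"ses-{match.group(1)}_n-{match.group(2)}"
```

This could then be applied over the session folders with pathlib's Path.rename, ideally after printing the planned names first as a dry run.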

Development

Successfully merging this pull request may close these issues:

Search within date range

4 participants