Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Estimators failing test check_fit_deterministic #2457

Open
5 tasks
TonyBagnall opened this issue Dec 16, 2024 · 5 comments · May be fixed by #2575
Open
5 tasks

[ENH] Estimators failing test check_fit_deterministic #2457

TonyBagnall opened this issue Dec 16, 2024 · 5 comments · May be fixed by #2575
Labels
enhancement New feature, improvement request or other non-bug code enhancement

Comments

@TonyBagnall
Copy link
Contributor

Describe the feature or idea you want to propose

fails of check_fit_deterministic are usually related to seeding, and looking at the list may be numba related

  • "SASTClassifier"
  • "RSASTClassifier"
  • "SAST"
  • "RSAST"
  • "SFA"

Describe your proposed solution

fix or create issue

Describe alternatives you've considered, if relevant

No response

Additional context

No response

@TonyBagnall TonyBagnall added the enhancement New feature, improvement request or other non-bug code enhancement label Dec 16, 2024
@aryanpola
Copy link
Contributor

@TonyBagnall how do I reproduce these errors?

@lucifer4073 lucifer4073 linked a pull request Feb 26, 2025 that will close this issue
5 tasks
@lucifer4073
Copy link
Contributor

lucifer4073 commented Feb 26, 2025

Hi @TonyBagnall,

I went through the check_fit_deterministic() function and clearly the issue was uneven seeding.

There are two approaches to resolve this:

  1. Check for a seed attribute in the clone_estimator function; if it exists, initialize it with random_state, as done in PR [ENH] Resolve failing check_fit_deterministic #2575.
  2. The second approach, which I believe might not be ideal, involves checking if seed is not None and assigning it a value during classifier initialization.

Bottleneck in the code:

self._random_state = (
    np.random.RandomState(self.seed)
    if not isinstance(self.seed, np.random.RandomState)
    else self.seed
)

@baraline
Copy link
Member

baraline commented Feb 26, 2025

I think the correct way to solve this issue would rather be to change these estimators non "aeon-standard" seed parameter by random_state parameter. The issue is that check_fit_deterministic expects a random_state parameters and not seed :

def check_fit_deterministic(estimator, datatype):
    """Check that calling fit twice is equivalent to calling it once."""
    estimator = _clone_estimator(estimator, random_state=0)

If the estimator don't use a random state parameter that might be the cause of the issue I think ? We would need to put up a deprecation notice if we do that.

@lucifer4073
Copy link
Contributor

I think the correct way to solve this issue would rather be to change these estimators non "aeon-standard" seed parameter by random_state parameter. The issue is that check_fit_deterministic expects a random_state parameters and not seed :

def check_fit_deterministic(estimator, datatype):
"""Check that calling fit twice is equivalent to calling it once."""
estimator = _clone_estimator(estimator, random_state=0)
If the estimator don't use a random state parameter that might be the cause of the issue I think ? We would need to put up a deprecation notice if we do that.

Thanks for the reply.

As far my understanding, I shall change the "seed" parameter to random_state. Regarding the deprecation notice, I’m not entirely sure about the best approach.

@lucifer4073
Copy link
Contributor

Hi @TonyBagnall @baraline ,

I have made the requested changes. Please have a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature, improvement request or other non-bug code enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants