Fixes an issue with some jobs being parentless #2241

Draft
wants to merge 24 commits into
base: master

Conversation

dbeltrankyl
Contributor

This PR does two things:

Removes the "changed"-related code, since the dependencies are rebuilt each time Autosubmit needs the job_list.

If a job is still parentless after all the dependencies have been added, Autosubmit adds every possible parent as an edge and then prunes the extra edges the old-fashioned way (by checking for redundant dependencies).

The calculation of the "distance" between sections has some flaws, but they are hard to fix. This workaround makes adding the graph dependencies a bit slow in these cases, but it at least generates the correct graph.

I'll add some tests that check the generated graph.
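
A rough sketch of the parentless-job workaround, for illustration only. This is not the code in job_list.py: the graph is a plain dict mapping each parent to its set of children, and the names `connect_parentless`, `remove_redundant_edges`, and `candidate_parents` are hypothetical. It assumes the graph is a DAG.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]  # parent -> set of direct children


def has_parent(graph: Graph, node: str) -> bool:
    """True if any node lists `node` as a direct child."""
    return any(node in children for children in graph.values())


def reachable(graph: Graph, start: str, skip_edge: Tuple[str, str]) -> Set[str]:
    """Nodes reachable from `start` without traversing `skip_edge`."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for child in graph.get(current, ()):
            if (current, child) == skip_edge or child in seen:
                continue
            seen.add(child)
            stack.append(child)
    return seen


def remove_redundant_edges(graph: Graph) -> None:
    """Drop each edge parent->child that is implied by another path (DAG only)."""
    for parent in list(graph):
        for child in sorted(graph[parent]):
            if child in reachable(graph, parent, (parent, child)):
                graph[parent].discard(child)


def connect_parentless(graph: Graph, job: str, candidate_parents: Iterable[str]) -> None:
    """If `job` has no parent, link every candidate to it, then prune."""
    if has_parent(graph, job):
        return
    for parent in candidate_parents:
        graph.setdefault(parent, set()).add(job)
    remove_redundant_edges(graph)


# Hypothetical workflow: CLEAN ended up without a parent.
graph: Graph = {"INI": {"SIM"}, "SIM": {"POST"}, "POST": set(), "CLEAN": set()}
connect_parentless(graph, "CLEAN", ["INI", "SIM", "POST"])
```

The pruning step re-runs a reachability search for every edge, which is why this approach gets slow on large graphs, as noted above.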

@codecov-commenter

codecov-commenter commented Mar 21, 2025

Codecov Report

Attention: Patch coverage is 62.22222% with 17 lines in your changes missing coverage. Please review.

Project coverage is 54.16%. Comparing base (53b2a14) to head (e67a0cf).

Files with missing lines Patch % Lines
autosubmit/job/job_list.py 61.36% 6 Missing and 11 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2241      +/-   ##
==========================================
+ Coverage   53.90%   54.16%   +0.25%     
==========================================
  Files          72       72              
  Lines       17242    17232      -10     
  Branches     3352     3350       -2     
==========================================
+ Hits         9294     9333      +39     
+ Misses       7103     7050      -53     
- Partials      845      849       +4     
Flag Coverage Δ
fast-tests 54.16% <62.22%> (+0.25%) ⬆️


@dbeltrankyl
Contributor Author

dbeltrankyl commented Mar 21, 2025

Fyi @agarci3

Hello @albertvilabsc , @stamenminkov

I wanted to deploy this branch as a module in the Hubs, and I am running into this issue (I think this has happened before):

[eadmin@bsceshub02 easybuild-hub]$ eb autosubmit-4.1.13-2ed6cab-foss-2021b-Python-3.9.6.eb --rebuild
ERROR: Failed to parse configuration options: "Failed to create temporary directory (tmpdir: /dev/shm/tmp): [Errno 20] Not a directory: '/dev/shm/tmp/eb-m9c45zy_'"

Any clue or something I can do? Thanks!

FYI @kinow ,

After manual testing, I wanted to upload this version before I do the automatic tests, as that can take a while. I have some similar tests that run locally (on my laptop, not my actual PC, which is a shame); the idea is to prepare them to run under CI/CD, starting with the new ones and gradually adding the rest.

@albertvilabsc

Hi @dbeltrankyl ,

Working on it.

Thanks,

Albert

@albertvilabsc

Hi @dbeltrankyl ,

Solved! Can you try again, please?

Thanks,

Albert

@dbeltrankyl
Contributor Author

> Hi @dbeltrankyl ,
>
> Solved! Can you try again, please?
>
> Thanks,
>
> Albert

I have other issues, but they are related to my configuration. I'll solve them myself.

So now it works! Thanks.

@albertvilabsc

Perfect!

Albert

@dbeltrankyl
Contributor Author

dbeltrankyl commented Mar 24, 2025

@agarci3 I think the eb recipe should work now; just waiting for the hubs to be up again.

@dbeltrankyl
Contributor Author

There is an issue with the keyword "NONE".

This keyword disables a dependency for a specific member, chunk, split, or date.

The fix applied here also affects that keyword, as I hadn't considered that possibility.

However, @rocsalvadorbsc reported to me that it doesn't work properly in v4.1.12, which I confirmed; I also tested the same workflow in v4.1.11, where it works.

I'm using this branch to address the issue in the best way possible. FYI @rocsalvadorbsc
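
To make the intended "NONE" behaviour concrete, here is a minimal sketch for chunks only. It is an illustration, not Autosubmit's filter code: the function name `select_parent_chunks` and the comma-separated value format are assumptions made for this example.

```python
from typing import Iterable, List


def select_parent_chunks(chunks_to: str, all_chunks: Iterable[int]) -> List[int]:
    """Return the parent chunks a child depends on (hypothetical semantics).

    "NONE" disables the dependency entirely; any other value is read as a
    comma-separated list of chunk numbers (an assumption for this sketch).
    """
    value = chunks_to.strip().upper()
    if value == "NONE":
        return []  # dependency disabled: no parent chunk is selected
    wanted = {int(item) for item in value.split(",") if item.strip()}
    return [chunk for chunk in all_chunks if chunk in wanted]


# A child with "NONE" gets no parents from this section, so a later
# "add every possible parent" step must not re-link it by accident.
disabled = select_parent_chunks("NONE", [1, 2, 3])
selected = select_parent_chunks("1,3", [1, 2, 3])
```

The point of the sketch is the interaction with the parentless workaround: a job that has no parents because of "NONE" is parentless on purpose and should be left alone.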

@albertvilabsc

@dbeltrankyl this is for me?

Albert

@dbeltrankyl
Contributor Author

> @dbeltrankyl this is for me?
>
> Albert

No, no. You can unsubscribe from this if you wish (so you don't get spam).

Thanks Albert!

@albertvilabsc

Ah, perfect, haha.

Thanks,

Albert

@dbeltrankyl dbeltrankyl force-pushed the GH-2184-Dependencies-bug branch from 2ed6cab to 383c0fd Compare March 26, 2025 13:27
@dbeltrankyl
Contributor Author

It's still not working correctly (the fix broke other cases), but I have been working on adding an automatic regression test for this.

It's possible that I need to modify something else in the test, but it's fairly automatic. I've added a README with instructions on how to add new cases, @kinow.

I'll add the auto-monarch and Destine ones to the test.

@dbeltrankyl
Contributor Author

dbeltrankyl commented Mar 28, 2025

  • Improved the test. 26 workflows passing.
  • Added an option to plot the experiments (disabled by default).
  • Added a script to add the conf automatically (I'll make an Autosubmit command based on this in another PR).
  • The ref file is now added automatically if it is not present.
  • Added Destine experiments.
  • Added Roc experiment.
  • To do: monarch experiments (maybe in another PR, but I will check if I can add them now).
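
The "ref file is added automatically" behaviour can be sketched roughly like this. It is a generic golden-file pattern and not the test code in this PR: the names `serialize_edges` and `check_against_reference` and the one-edge-per-line text format are assumptions for illustration.

```python
from pathlib import Path
from typing import Iterable, Tuple

Edge = Tuple[str, str]  # (parent job name, child job name)


def serialize_edges(edges: Iterable[Edge]) -> str:
    """Render edges in a stable, sorted, one-per-line text form."""
    return "".join(f"{parent} -> {child}\n" for parent, child in sorted(edges))


def check_against_reference(edges: Iterable[Edge], ref_file: Path) -> bool:
    """Compare the generated graph with the stored reference.

    If the reference does not exist yet, it is created from the current
    graph and the check passes; it should then be reviewed by hand.
    """
    current = serialize_edges(edges)
    if not ref_file.exists():
        ref_file.parent.mkdir(parents=True, exist_ok=True)
        ref_file.write_text(current, encoding="utf-8")
        return True
    return ref_file.read_text(encoding="utf-8") == current
```

One caveat with this pattern: a freshly created reference only records what the code produces today, so a new case still needs a manual check (or a plot) before its reference can be trusted.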

Dependencies:

  • Rolled back manage_dependencies to v4.1.10. This keeps the fixes from 4.1.11 intact and also produces the correct workflow (I believe).

  • Added the fixes for auto-monarch after the rollback, and the 26 workflows are passing

  • Added lint for the whole job_list file

FYI @kinow

@dbeltrankyl
Contributor Author

Added @rocsalvadorbsc automatic-profiler

image image

Added monarch ( @agarci3 case)

image

Added e-monarch

image

All three are correct. If not, tell me.

Destine workflows are already confirmed to be correct

For other PRs:

I'll open an issue asking for the CES workflows so that all of them can be added.

@dbeltrankyl
Contributor Author

This fix is being tested by @franra9.

When he finishes and everything is all right, we can merge it.

@dbeltrankyl
Contributor Author

Missing coverage is related to lint.

Also, this regression test is not yet running automatically on CI/CD (I will create an issue for it).

4 participants