Conversation
This reverts commit a4181b0.
* refactor workgraph.py
Apply comprehensive refactoring patterns to workgraph.py to reduce
cognitive complexity, improve type safety, and enhance maintainability.
STRUCTURAL CHANGES
- Introduce frozen dataclasses to replace fragile tuple structures:
* DependencyInfo(dep_label, filename, data_label): replaces 3-element
tuples used throughout dependency tracking chain
* InputDataInfo: typed metadata for input data items (7 fields)
* OutputDataInfo: typed metadata for output data items (5 fields)
- Add type aliases for complex nested structures:
* PortToDependencies: dict[str, list[DependencyInfo]]
* ParentFolders, JobIds, TaskDepInfo, LauncherDependencies
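The dataclass-over-tuple change can be sketched as follows; the field names come from the commit message, while the example values and the `restart_in` port are illustrative only:

```python
from dataclasses import dataclass


# Sketch of the frozen dataclass replacing the 3-element tuples; the real
# class in workgraph.py may carry additional metadata.
@dataclass(frozen=True)
class DependencyInfo:
    dep_label: str
    filename: str
    data_label: str


# Type alias clarifying the nested mapping used in dependency tracking.
PortToDependencies = dict[str, list[DependencyInfo]]

dep = DependencyInfo(dep_label="icon_run", filename="restart.nc", data_label="restart")
ports: PortToDependencies = {"restart_in": [dep]}
```

Unlike a bare tuple, the frozen dataclass gives named, immutable fields, so call sites read `dep.filename` instead of `dep[1]`.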
COMPLEXITY REDUCTION
- Extract dependency processing helpers to reduce nesting:
* _resolve_icon_dependency(): encapsulates 3-case logic (filename
known, namelist resolution, workdir fallback)
* _create_shell_remote_data(): creates RemoteData with consistent
patterns
* Reduces maximum indentation from 5+ levels to 2-3 levels
- Apply early-continue pattern in dependency processing loops:
* load_icon_dependencies(): flattened control flow
* load_and_process_shell_dependencies(): reduced nested conditionals
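The early-continue pattern can be illustrated with a minimal sketch; the guard conditions and the `collect_ready` helper are hypothetical, not the real `load_icon_dependencies()` logic:

```python
def collect_ready(deps: list[dict]) -> list[str]:
    """Illustrative early-continue loop: each guard exits the iteration
    immediately instead of nesting the main logic inside if-blocks."""
    ready: list[str] = []
    for dep in deps:
        if not dep.get("resolved"):  # guard: skip unresolved entries
            continue
        if dep.get("filename") is None:  # guard: skip entries without a file
            continue
        ready.append(dep["filename"])  # main logic stays at one indent level
    return ready
```

With nested conditionals the same loop would sit three levels deep; the guards keep the happy path flat.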
DRY PRINCIPLE APPLICATION
- Consolidate SLURM dependency directive building:
* _build_slurm_dependency_directive(): single source of truth for
"#SBATCH --dependency=afterok:..." construction
* _add_custom_scheduler_command(): unified command appending logic
* Eliminates duplication between
build_icon_metadata_with_slurm_dependencies() and
build_shell_metadata_with_slurm_dependencies()
UTILITY FUNCTIONS
- Add mapping helpers to reduce boilerplate:
* _map_list_append(): replaces repeated setdefault().append() pattern
* _map_unique_set(): conditional insertion with existence check
* Applied in build_dependency_mapping() for cleaner intent
- Add logging wrappers to separate concerns:
* _log_dependency_processing(): consistent dependency logging
* _log_remote_data_details(): structured RemoteData creation logging
TYPE SAFETY IMPROVEMENTS
- Update build_shell_task_spec() to use dataclasses:
* Change input_data_info from list[dict] to list[InputDataInfo]
* Change output_data_info from list[dict] to list[OutputDataInfo]
* Replace dict access patterns (info["field"]) with attribute access
(info.field)
* Enables IDE autocomplete and static type checking
AFFECTED FUNCTIONS
Modified:
- build_dependency_mapping(): use DependencyInfo, mapping helpers
- load_icon_dependencies(): use helper, type aliases, early-continue
- load_and_process_shell_dependencies(): use helper, type aliases,
early-continue
- build_icon_metadata_with_slurm_dependencies(): use SLURM helpers
- build_shell_metadata_with_slurm_dependencies(): use SLURM helpers
- build_shell_task_spec(): use InputDataInfo/OutputDataInfo dataclasses
Added:
- _resolve_icon_dependency()
- _create_shell_remote_data()
- _build_slurm_dependency_directive()
- _add_custom_scheduler_command()
- _map_list_append()
- _map_unique_set()
- _log_dependency_processing()
- _log_remote_data_details()
REFACTORING PATTERNS APPLIED
1. Replace tuples with typed dataclasses for self-documentation
2. Extract per-dependency logic into focused helper functions
3. Precompute lookup tables to eliminate nested scanning (already
optimized)
4. Apply early-continue pattern to reduce indentation depth
5. Consolidate repeated logic following DRY principle
6. Separate logging concerns from business logic
7. Use type aliases to clarify complex nested structures
8. Add utility functions for common mapping operations
* Fix JSON-serialization and upgrade aiida-core
* add branch-independence test case
* nicer test-setup with env var substitution and `run_via_cli.sh`
* delete `large-local` directory
* wip; pre-submission seems to work nicely for the simpler branch-independence workflow
```python
from sirocco.parsing._utils import TimeUtils, convert_to_date, convert_to_duration


if TYPE_CHECKING:
    from collections.abc import Iterator
```
Hm, this seems to be fine for me now in the TYPE_CHECKING block.
Move pytest marker filter from extra-args to default-args in hatch config so CI workflows can override it. Include slow tests in integration CI. Merge redundant test_branch_independence_execution into the parametrized test_branch_independence_with_front_depths.
Also:
- Add pytest-xdist for parallel test execution (-n auto)
- Consolidate duplicate Shell/Icon tests with parametrization
- More pytest.param
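The Shell/Icon consolidation mentioned above can be sketched with `pytest.param`; the test body, ids, and the `slow` marker placement here are placeholders, not the actual test suite:

```python
import pytest


# Hypothetical consolidation of two near-duplicate tests into one
# parametrized test with per-case ids and marks.
@pytest.mark.parametrize(
    "task_kind",
    [
        pytest.param("shell", id="shell"),
        pytest.param("icon", id="icon", marks=pytest.mark.slow),
    ],
)
def test_task_spec(task_kind: str) -> None:
    assert task_kind in {"shell", "icon"}
```

Per-param marks are what let the marker filter in the hatch config (and its CI override) include or exclude the slow ICON case without touching the shell one.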
```python
    repr=False,
    metadata={"description": "Path to wrapper script file relative to the config directory or absolute."},
)
# NOTE: This is hard-coded still to CSCS...
```
Yes, this is on purpose because these are the only implemented targets, both for standalone and aiida-icon, even though target currently only makes sense for standalone. It will change, since we want to unify both specifications with:

```yaml
tasks:
  - icon:
      plugin: icon
      computer: santis
      target: gpu
```

Then we can decide where the check happens for supported targets. Maybe that can be split between standalone and aiida-icon to allow some divergence in the support status. Still, if we want to keep it centralized at the parsing level, we can make use of the new engine field in a new `@model_validator(mode="after")` decorator on the ConfigWorkflow class.
Just a general note: I think we should be completely agnostic about this in the development of the tool. Everything center-specific should be configurable by users. Otherwise we just create a single center-specific tool, which is how we ended up with 300 different workflow managers in the first place ^^
```yaml
tasks:
  - icon:
      plugin: icon
      computer: santis
      target: gpu
```

I think this is good. We could even change `computer` to `hostname`, as `computer` is more AiiDA terminology, and `hostname`, as it is in the SSH config, is more known to people (and AiiDA now also supports OpenSSH directly).
```python
    @staticmethod
    def _add_sirocco_time_prepend_text(options: AiidaMetadataOptions, task: core.Task) -> AiidaMetadataOptions:
        """Append chunk start/stop exports to prepend_text when date cycling is available."""
        start_date = task.cycle_point.chunk_start_date.isoformat()  # type: ignore[attr-defined]
        stop_date = task.cycle_point.chunk_stop_date.isoformat()  # type: ignore[attr-defined]

        exports = f"export SIROCCO_START_DATE={start_date}\nexport SIROCCO_STOP_DATE={stop_date}\n"

        current_prepend = options.prepend_text or ""
        new_prepend = f"{current_prepend}\n{exports}" if current_prepend else exports

        return options.model_copy(update={"prepend_text": new_prepend})
```
Yes, this is the way it's used at the moment. Yet Christian asked me in the meantime if we could also export the overall dates in the runscript. Either we keep it for later, or we already export SIROCCO_CHUNK_START_DATE, SIROCCO_CHUNK_STOP_DATE, SIROCCO_START_DATE, and SIROCCO_STOP_DATE (and adapt the corresponding variable names in the tests) in this PR. I don't have a strong feeling, I just need to remember to change it in both engines if I do it later.
Co-authored-by: Matthieu Leclair <matthieu.leclair@c2sm.ethz.ch>
Implements a complete AiiDA-based execution engine that translates Sirocco workflows into WorkGraph structures for submission with AiiDA. The engine handles dependency resolution, SLURM job chaining, and rolling window execution.
A full design document can be found in ADR/AiiDA_Engine_Architecture.md, rendered here.

Compiled stubs from the `src/sirocco/engines/aiida` module for API and design discussion:

- `__init__`
- `adapter`
- `builder`
- `calcjob_builders`
- `code_factory`
- `dependency_resolvers`
- `execute`
- `models`
- `monitoring`
- `patches.__init__`
- `patches.firecrest_symlink`
- `patches.slurm_dependencies`
- `patches.workgraph_window`
- `spec_builders`
- `spec_resolvers`
- `task_pairs`
- `topology`
- `types`
- `utils`

Remaining TODOs (within this PR)
- `config.yml`: `front_depth` meaning (has to be strictly positive integer -> 1 == what I used to call sequential, no pre-submission)

Remaining TODOs (after this PR)
- `vars.yml`, but expose `template_file.yml` (or similar) to the user
- `workflow_file`, instead point to `config_directory` (config dir is one entity, e.g., zip it and share with others)
- `standalone` (@leclairm)
- `uenv` via `uenv run`, not via SLURM header, to allow for multiple `uenv`s being used in one run
- `bash` everything, always prepend `bash`
- `path:` also to verify that. If `path` not given -> we don't have to copy anything.

Notes from the meeting
- `config.yml` only source of truth. Don't expose `engine`, `scheduler`, `front_depth` via CLI
- `run.sh` scripts at this point

Top-level list of features
- `workgraph.py` to enable SLURM pre-submission with dynamic windowing (using a functional approach)

scope creeps
- `config.yml` files (template values are given via `vars.yml`; running test cases as examples via `run.sh` scripts)

WIP
previously noted TODOs
- `get_job_data`: no QB, but pass WG pk directly -> this does not work, as the WG does not exist at this stage yet, only its future label, hence the QB is necessary
- `verdi process dump` for Sirocco WG (has to be done in aiida-core ... TODO: add as patch)
- `large` workflow (real ICON, can be sleep for pre- and post-proc)
- `next` and `previous` symlinks for symlink tree on HPC