# Issue 57: 7.3.3 Flows #60

**Status**: Merged

## Commits

48 commits, all by davramov:

- `5b455c5` Init commit for 7.3.3 workflows. Created a basic move.py script that …
- `2d7ff3a` Fixed comment formatting in pytest
- `de3e974` Updating documentation
- `8eedaca` Updating documentation and added a dispatcher for bl733
- `747c412` Continuing to develop the 733 flows. Updated docs, added comments ind…
- `ec968aa` Updating documentation
- `3fb6361` Adding globus prune_controller logic from PR#59 to move.py for now. O…
- `8c2e38b` Removing 'self' as input parameter to prune() since it is not part of…
- `c195c60` Adding a mocker.patch for schedule_prefect_flow in test_globus_flow.…
- `ddbecbf` Updating prefect deployments... although I should make sure this work…
- `c86522f` Created a new file 'create_deployments_733_3.4.2.sh' for deploying Pr…
- `fbecede` Making sure the correct function is registered for pruning. Making su…
- `707409c` Adding file name to the dispatcher flow_run_name
- `d85c163` Adding UUID for ALS Transfer 733 endpoint 💾
- `b60869f` fixing work_pool yaml entries in orchestration/flows/bl733/prefect.yaml
- `1899983` Fixing flow run name formatting
- `d5f1a13` Adding BEAMLINE env variable to login script
- `fcaae9d` adjusting main method file_path to test
- `1897e33` Commenting out the delete block to ensure no data is accidentally del…
- `27a06cf` Adjusting the instructions in the docsctring for the correct usage of…
- `124354f` Adding the init_work_pools.sh script from the prefect 3 PR to help wi…
- `9a8f641` Commenting out Prefect JSON block stuff, since it is becoming depreca…
- `8e9ee98` Adjusting docstring to indicate days_from_now expects a float, not a …
- `70c4fcb` Rewriting the init_work_pools script to be in python rather than shel…
- `7fc4f13` Moving old flow registering scripts to scripts/legacy/
- `683a315` Moving old flow registering scripts to scripts/legacy/
- `454f34e` Updating init_work_pools.py to set the GLOBUS_CLIENT_ID and GLOBUS_CL…
- `398bfe9` Adding new_file_733_flight_check flow (test Globus transfer from data…
- `09b2341` Adding a todo item for updating the config type checking once PR#62 i…
- `3749ad4` updating documentation
- `b6fb8d5` pytest patches for prefect 3
- `ad84b40` Updating diagram to fit vertically instead of horizontally
- `a98e139` Updating diagram to fit vertically instead of horizontally
- `4b1c05c` fixing cut off text in diagram
- `a3b8300` Updating globus endpoint names in config.yaml to use beamline name (b…
- `4a14e3c` removing bl733 from the test_globus_flow.py pytest
- `51ed315` moving pytests to a specific test folder for bl733
- `f361d4d` making the move flow call the move task, making dispatcher call the m…
- `abeba68` Updating dockerfile to pull prefect 3.4.2
- `66cad5d` Making sure logger = get_run_logger() within the functions for better…
- `67c033e` Adding flow_run_name to schedule_prefect_flow call in the prune method
- `44fbe4c` Adding Lamarr (ALS Computing Beamlines/Global) as a transfer endpoint…
- `9002ba0` Removing legacy create_deployments scripts
- `75b0e14` Fixing config.yml after rebase
- `f1b8fdb` removing old move_733.py script
- `479e518` using Variable Blocks for settings
- `84fa884` removing delete_spot733 option for bl733-settings prefect variable (n…
- `5b27406` Loading max_wait_settings from Prefect Variable block

# Beamline 7.3.3 Flows

This page documents the workflows supported by Splash Flows Globus at [ALS Beamline 7.3.3 (SAXS/WAXS/GISAXS)](https://saxswaxs.lbl.gov/user-information). Beamline 7.3.3 supports hard x-ray scattering techniques, including small- and wide-angle x-ray scattering (SAXS/WAXS) and grazing-incidence SAXS/WAXS (GISAXS/GIWAXS).

## Diagrams

### Sequence Diagram
```mermaid
sequenceDiagram
    participant T as Trigger<br/>Components
    participant F as Prefect<br/>Flows
    participant S as Storage &<br/>Processing

    %% Initial Trigger
    T->>T: Detector → File Watcher
    T->>F: File Watcher triggers Dispatcher
    F->>F: Dispatcher coordinates downstream Flows

    %% Flow 1: new_file_733
    rect rgb(220, 230, 255)
    note over F,S: FLOW 1: new_file_733
    F->>S: Access data733
    S->>S: Globus Transfer to NERSC CFS
    S->>S: Ingest metadata to SciCat
    end

    %% Flow 2: HPSS Transfer
    rect rgb(220, 255, 230)
    note over F,S: FLOW 2: Scheduled HPSS Transfer
    F->>S: Access NERSC CFS
    S->>S: SFAPI Transfer to HPSS Tape
    S->>S: Ingest metadata to SciCat
    end

    %% Flow 3: HPC Analysis
    rect rgb(255, 230, 230)
    note over F,S: FLOW 3: HPC Downstream Analysis
    F->>S: Access data733
    S->>S: Globus Transfer to HPC
    S->>S: Run HPC Compute Processing
    S->>S: Return scratch data to data733
    end

    %% Flow 4: Scheduled Pruning
    rect rgb(255, 255, 220)
    note over F,S: FLOW 4: Scheduled Pruning
    F->>S: Scheduled pruning jobs
    S->>S: Prune old files from CFS
    S->>S: Prune old files from data733
    end
```

### Data Infrastructure Workflows
```mermaid
---
config:
  theme: neo
  layout: elk
  look: neo
---
flowchart LR
    subgraph s1["new_file_733 Flow"]
        n20["data733"]
        n21["NERSC CFS"]
        n22["SciCat<br>[Metadata Database]"]
    end
    subgraph s2["HPSS Transfer Flow"]
        n38["NERSC CFS"]
        n39["HPSS Tape Archive"]
        n40["SciCat<br>[Metadata Database]"]
    end
    subgraph s3["HPC Analysis Flow"]
        n41["data733"]
        n42["HPC<br>Filesystem"]
        n43["HPC<br>Compute"]
    end
    n23["data733"] -- File Watcher --> n24["Dispatcher<br>[Prefect Worker]"]
    n25["Detector"] -- Raw Data --> n23
    n24 --> s1 & s2 & s3
    n20 -- Raw Data [Globus Transfer] --> n21
    n21 -- "Metadata [SciCat Ingestion]" --> n22
    n32["Scheduled Pruning<br>[Prefect Workers]"] --> n35["NERSC CFS"] & n34["data733"]
    n38 -- Raw Data [SFAPI Slurm htar Transfer] --> n39
    n39 -- "Metadata [SciCat Ingestion]" --> n40
    s2 --> n32
    s3 --> n32
    s1 --> n32
    n41 -- Raw Data [Globus Transfer] --> n42
    n42 -- Raw Data --> n43
    n43 -- Scratch Data --> n42
    n42 -- Scratch Data [Globus Transfer] --> n41
    n20@{ shape: internal-storage}
    n21@{ shape: disk}
    n22@{ shape: db}
    n38@{ shape: disk}
    n39@{ shape: paper-tape}
    n40@{ shape: db}
    n41@{ shape: internal-storage}
    n42@{ shape: disk}
    n23@{ shape: internal-storage}
    n24@{ shape: rect}
    n25@{ shape: rounded}
    n35@{ shape: disk}
    n34@{ shape: internal-storage}
    n20:::storage
    n20:::Peach
    n21:::Sky
    n22:::Sky
    n38:::Sky
    n39:::storage
    n40:::Sky
    n41:::Peach
    n42:::Sky
    n43:::compute
    n23:::collection
    n23:::storage
    n23:::Peach
    n24:::collection
    n24:::Rose
    n25:::Ash
    n32:::Rose
    n35:::Sky
    n34:::Peach
    classDef collection fill:#D3A6A1, stroke:#D3A6A1, stroke-width:2px, color:#000000
    classDef Rose stroke-width:1px, stroke-dasharray:none, stroke:#FF5978, fill:#FFDFE5, color:#8E2236
    classDef storage fill:#A3C1DA, stroke:#A3C1DA, stroke-width:2px, color:#000000
    classDef Ash stroke-width:1px, stroke-dasharray:none, stroke:#999999, fill:#EEEEEE, color:#000000
    classDef visualization fill:#E8D5A6, stroke:#E8D5A6, stroke-width:2px, color:#000000
    classDef Peach stroke-width:1px, stroke-dasharray:none, stroke:#FBB35A, fill:#FFEFDB, color:#8F632D
    classDef Sky stroke-width:1px, stroke-dasharray:none, stroke:#374D7C, fill:#E2EBFF, color:#374D7C
    classDef compute fill:#A9C0C9, stroke:#A9C0C9, stroke-width:2px, color:#000000
    style s1 stroke:#757575
    style s2 stroke:#757575
    style s3 stroke:#757575
```

## Data at 7.3.3

The data collected from 7.3.3 are typically 2D scattering images, where each pixel records scattering intensity as a function of scattering angle.
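
As a concrete illustration of the pixel-to-angle relationship, the sketch below uses the standard small-angle scattering conversion q = (4π/λ)·sin(θ), with the full scattering angle 2θ = arctan(r/D). The geometry numbers are purely illustrative, not beamline 7.3.3's actual configuration.

```python
import math

def q_from_pixel(r_mm: float, det_dist_mm: float, wavelength_A: float) -> float:
    """Momentum transfer q (1/Angstrom) for a pixel at radial distance r_mm
    from the beam center, given sample-detector distance det_dist_mm."""
    two_theta = math.atan2(r_mm, det_dist_mm)  # full scattering angle 2θ
    return 4.0 * math.pi * math.sin(two_theta / 2.0) / wavelength_A

# Illustrative numbers: a pixel 50 mm off-center, 4 m camera length, λ ≈ 1.24 Å
q = q_from_pixel(50.0, 4000.0, 1.24)
```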

## File Watcher

A file watcher on the system `data733` listens for new scans that have finished writing to disk. From there, a Prefect Flow we call `dispatcher` kicks off the downstream steps:
- Copy scans in real time to `NERSC CFS` using Globus Transfer.
- Copy project data to `NERSC HPSS` for long-term storage.
- Run analysis on HPC systems (TBD).
- Schedule data pruning from `data733` and `NERSC CFS`.

## Prefect Configuration

### Registered Flows

#### `dispatcher.py`

The Dispatcher Prefect Flow manages the logic for handling the order and execution of data tasks. As soon as the File Watcher detects that a new file is written, it calls the `dispatcher()` Flow. In this case, the dispatcher handles the synchronous call to `move.py`, with the potential to add additional steps (e.g. scheduling remote HPC analysis code).
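
A rough sketch of that dispatch logic follows. The function names are illustrative, not the repository's actual API, and in the repo these functions would carry Prefect's `@flow` decorator (omitted here so the sketch stays self-contained).

```python
def process_new_file_733(file_path: str) -> dict:
    """Stand-in for the move flow: transfer, SciCat ingestion, prune scheduling."""
    return {"file_path": file_path, "status": "moved"}

def dispatcher(file_path: str) -> list:
    """Coordinate downstream steps for a newly written file."""
    results = []
    # Step 1: synchronous call to the move flow
    results.append(process_new_file_733(file_path))
    # Step 2 (future): schedule remote HPC analysis here
    return results
```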

#### `move.py`

Flow to process a new file at BL 7.3.3:
1. Copy the file from `data733` to NERSC CFS. Ingest the file path in SciCat.
2. Schedule pruning from `data733` (ensuring that data is on NERSC before deletion).
3. Copy the file from NERSC CFS to NERSC HPSS. Ingest the file path in SciCat.
4. Schedule pruning from NERSC CFS (ensuring data is on HPSS before deletion).
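
The four steps above can be sketched as follows. Function names and step strings are hypothetical; `days_from_now` being a float (so fractional days are allowed) is the one detail taken from the commit history.

```python
from datetime import datetime, timedelta

def schedule_prune(location: str, days_from_now: float) -> datetime:
    """Return the future time at which pruning should run; days_from_now is a
    float, so fractional days (e.g. 0.5 = 12 hours) are allowed."""
    return datetime.now() + timedelta(days=days_from_now)

def new_file_733(file_path: str, days_from_now: float = 7.0) -> list:
    """Mirror the documented ordering of the move flow's steps."""
    steps = []
    steps.append(f"Globus transfer {file_path}: data733 -> NERSC CFS")
    steps.append("SciCat ingest: CFS file path")
    schedule_prune("data733", days_from_now)    # only after the CFS copy is verified
    steps.append("schedule prune: data733")
    steps.append("SFAPI htar transfer: NERSC CFS -> HPSS")
    steps.append("SciCat ingest: HPSS file path")
    schedule_prune("NERSC CFS", days_from_now)  # only after the HPSS copy is verified
    steps.append("schedule prune: NERSC CFS")
    return steps
```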

### Prefect Server + Deployments

This beamline is starting fresh with `Prefect==3.4.2` (an upgrade from `2.19.5`). With the latest Prefect versions, we can define deployments in a `yaml` file rather than with build/apply steps in a shell script. `create_deployments_733.sh` is the legacy way we supported registering flows; now, flows are defined in `orchestration/flows/bl733/prefect.yaml`. Keeping the Prefect config for the beamline within the flows folder makes it easier to keep track of different Prefect deployments for different beamlines.

Note that we must still create work pools manually before we can register flows to them.
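
For orientation, a deployment entry in `orchestration/flows/bl733/prefect.yaml` might look roughly like the sketch below. The entrypoints and the second deployment name are illustrative guesses; only `run_733_dispatcher` and the pool names appear elsewhere on this page, so check the repository for the real values.

```yaml
# Hypothetical sketch of prefect.yaml deployment entries (Prefect 3 style)
deployments:
  - name: run_733_dispatcher
    entrypoint: orchestration/flows/bl733/dispatcher.py:dispatcher
    work_pool:
      name: dispatcher_733_pool
  - name: new_file_733
    entrypoint: orchestration/flows/bl733/move.py:new_file_733
    work_pool:
      name: new_file_733_pool
```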

For example, here is how we can now create our deployments:

```bash
# cd to the directory
cd orchestration/flows/bl733/

# add the Prefect API URL (and key, if the server requires auth) to the environment
export PREFECT_API_URL=http://<your-prefect-server-for-bl733>:4200/api
# export PREFECT_API_KEY=<your-api-key>  # uncomment if authentication is enabled

# create the work pools
prefect work-pool create new_file_733_pool
prefect work-pool create dispatcher_733_pool
prefect work-pool create prune_733_pool

# register the deployments defined in prefect.yaml
prefect deploy
```
|
|
||
| We can also preview a deployment: `prefect deploy --output yaml`, or deploy only one flow `prefect deploy --name run_733_dispatcher`. | ||
|
|
||
| The following script follows the above logic for deploying the flows in a streamlined fashion for the latest version of Prefect: | ||
| `splash_flows/init_work_pools.py` | ||
|
|
||
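
A minimal sketch of what such an initialization script might do, assuming it shells out to the Prefect CLI; the actual `init_work_pools.py` may differ (per the commit history it also sets `GLOBUS_CLIENT_ID`/`GLOBUS_CLIENT_SECRET`, which is omitted here).

```python
import subprocess

# Pool names taken from the deployment instructions above
WORK_POOLS = ["new_file_733_pool", "dispatcher_733_pool", "prune_733_pool"]

def build_commands(pools=WORK_POOLS):
    """Return the ordered CLI commands: create each pool, then deploy flows."""
    cmds = [["prefect", "work-pool", "create", pool] for pool in pools]
    cmds.append(["prefect", "deploy"])
    return cmds

def main():
    for cmd in build_commands():
        # pool creation fails harmlessly if the pool already exists
        subprocess.run(cmd, check=False)

if __name__ == "__main__":
    main()
```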

## VM Details

The computing backend runs on a VM in the B15 server room that is managed by ALS IT staff.

- **Name**: `flow-733`
- **OS**: `Ubuntu 24.04 LTS`

We are using Ansible to streamline the development and support of this virtual machine. See https://github.com/als-computing/als_ansible/pull/4 for details.

## Data Access for Users

Users can download their data from SciCat, our metadata database, where we keep track of file location history and additional experiment metadata.