Merged
Commits
48 commits
5b455c5
Init commit for 7.3.3 workflows. Created a basic move.py script that …
davramov Feb 7, 2025
2d7ff3a
Fixed comment formatting in pytest
davramov Feb 7, 2025
de3e974
Updating documentation
davramov Feb 18, 2025
8eedaca
Updating documentation and added a dispatcher for bl733
davramov Feb 18, 2025
747c412
Continuing to develop the 733 flows. Updated docs, added comments ind…
davramov Feb 18, 2025
ec968aa
Updating documentation
davramov Jul 7, 2025
3fb6361
Adding globus prune_controller logic from PR#59 to move.py for now. O…
davramov Jul 7, 2025
8c2e38b
Removing 'self' as input parameter to prune() since it is not part of…
davramov Jul 7, 2025
c195c60
Adding a mocker.patch for schedule_prefect_flow in test_globus_flow.…
davramov Jul 7, 2025
ddbecbf
Updating prefect deployments... although I should make sure this work…
davramov Jul 7, 2025
c86522f
Created a new file 'create_deployments_733_3.4.2.sh' for deploying Pr…
davramov Jul 7, 2025
fbecede
Making sure the correct function is registered for pruning. Making su…
davramov Jul 8, 2025
707409c
Adding file name to the dispatcher flow_run_name
davramov Jul 8, 2025
d85c163
Adding UUID for ALS Transfer 733 endpoint 💾
davramov Aug 18, 2025
b60869f
fixing work_pool yaml entries in orchestration/flows/bl733/prefect.yaml
davramov Aug 18, 2025
1899983
Fixing flow run name formatting
davramov Aug 19, 2025
d5f1a13
Adding BEAMLINE env variable to login script
davramov Aug 20, 2025
fcaae9d
adjusting main method file_path to test
davramov Aug 20, 2025
1897e33
Commenting out the delete block to ensure no data is accidentally del…
davramov Aug 20, 2025
27a06cf
Adjusting the instructions in the docstring for the correct usage of…
davramov Aug 20, 2025
124354f
Adding the init_work_pools.sh script from the prefect 3 PR to help wi…
davramov Aug 20, 2025
9a8f641
Commenting out Prefect JSON block stuff, since it is becoming depreca…
davramov Aug 20, 2025
8e9ee98
Adjusting docstring to indicate days_from_now expects a float, not a …
davramov Aug 26, 2025
70c4fcb
Rewriting the init_work_pools script to be in python rather than shel…
davramov Aug 26, 2025
7fc4f13
Moving old flow registering scripts to scripts/legacy/
davramov Aug 26, 2025
683a315
Moving old flow registering scripts to scripts/legacy/
davramov Aug 26, 2025
454f34e
Updating init_work_pools.py to set the GLOBUS_CLIENT_ID and GLOBUS_CL…
davramov Aug 26, 2025
398bfe9
Adding new_file_733_flight_check flow (test Globus transfer from data…
davramov Aug 26, 2025
09b2341
Adding a todo item for updating the config type checking once PR#62 i…
davramov Aug 26, 2025
3749ad4
updating documentation
davramov Sep 9, 2025
b6fb8d5
pytest patches for prefect 3
davramov Sep 9, 2025
ad84b40
Updating diagram to fit vertically instead of horizontally
davramov Sep 9, 2025
a98e139
Updating diagram to fit vertically instead of horizontally
davramov Sep 9, 2025
4b1c05c
fixing cut off text in diagram
davramov Sep 9, 2025
a3b8300
Updating globus endpoint names in config.yaml to use beamline name (b…
davramov Oct 22, 2025
4a14e3c
removing bl733 from the test_globus_flow.py pytest
davramov Nov 7, 2025
51ed315
moving pytests to a specific test folder for bl733
davramov Nov 7, 2025
f361d4d
making the move flow call the move task, making dispatcher call the m…
davramov Nov 7, 2025
abeba68
Updating dockerfile to pull prefect 3.4.2
davramov Nov 7, 2025
66cad5d
Making sure logger = get_run_logger() within the functions for better…
davramov Nov 11, 2025
67c033e
Adding flow_run_name to schedule_prefect_flow call in the prune method
davramov Nov 13, 2025
44fbe4c
Adding Lamarr (ALS Computing Beamlines/Global) as a transfer endpoint…
davramov Nov 24, 2025
9002ba0
Removing legacy create_deployments scripts
davramov Dec 17, 2025
75b0e14
Fixing config.yml after rebase
davramov Dec 17, 2025
f1b8fdb
removing old move_733.py script
davramov Dec 17, 2025
479e518
using Variable Blocks for settings
davramov Dec 17, 2025
84fa884
removing delete_spot733 option for bl733-settings prefect variable (n…
davramov Dec 17, 2025
5b27406
Loading max_wait_settings from Prefect Variable block
davramov Dec 17, 2025
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
FROM prefecthq/prefect:3.4.2-python3.11
FROM prefecthq/prefect:3.4.2-python3.13

WORKDIR /app
COPY ./requirements.txt /tmp/
67 changes: 55 additions & 12 deletions config.yml
@@ -1,5 +1,60 @@
globus:
globus_endpoints:

# 7.0.1.2 ENDPOINTS

nersc7012:
root_path: /global/cfs/cdirs/als/gsharing/data_mover/7012
uri: nersc.gov
uuid: d40248e6-d874-4f7b-badd-2c06c16f1a58
name: nersc7012

data7012:
root_path: /
uri: hpc.lbl.gov
uuid: 741b96e1-1b98-42a8-918d-daacc24c145f
name: data7012

# 7.3.3 ENDPOINTS

bl733-nersc-alsdev:
root_path: /global/cfs/cdirs/als/data_mover/7.3.3/
uri: nersc.gov
uuid: d40248e6-d874-4f7b-badd-2c06c16f1a58
name: bl733-nersc-alsdev

bl733-nersc-alsdev_raw:
root_path: /global/cfs/cdirs/als/data_mover/7.3.3/raw
uri: nersc.gov
uuid: d40248e6-d874-4f7b-badd-2c06c16f1a58
name: bl733-nersc-alsdev_raw

bl733-als-data733:
root_path: /
uri: data733.lbl.gov
uuid: 26b3d2cf-fd80-4a64-a78f-38a155aca926
name: bl733-als-data733

bl733-als-data733_raw:
root_path: /
uri: data733.lbl.gov
uuid: 26b3d2cf-fd80-4a64-a78f-38a155aca926
name: bl733-als-data733_raw

bl733-lamarr-global:
root_path: /bl733/
uri: lamarr.als.lbl.gov
uuid: dbaac176-b1f7-4134-979a-0b1668786d11
name: bl733-lamarr-global

bl733-lamarr-beamlines:
root_path: /bl733/
uri: lamarr.als.lbl.gov
uuid: aee983fc-826e-4081-bfb2-62529970540d
name: bl733-lamarr-beamlines

# 8.3.2 ENDPOINTS

spot832:
root_path: /
uri: spot832.lbl.gov
@@ -90,18 +145,6 @@ globus:
uuid: df82346e-9a15-11ea-b3c4-0ae144191ee3
name: nersc832

nersc7012:
root_path: /global/cfs/cdirs/als/gsharing/data_mover/7012
uri: nersc.gov
uuid: d40248e6-d874-4f7b-badd-2c06c16f1a58
name: nersc7012

data7012:
root_path: /
uri: hpc.lbl.gov
uuid: 741b96e1-1b98-42a8-918d-daacc24c145f
name: data7012

globus_apps:
als_transfer:
client_id: ${GLOBUS_CLIENT_ID}
207 changes: 207 additions & 0 deletions docs/mkdocs/docs/bl733.md
@@ -0,0 +1,207 @@
# Beamline 7.3.3 Flows

This page documents the workflows supported by Splash Flows Globus at [ALS Beamline 7.3.3 (SAXS/WAXS/GISAXS)](https://saxswaxs.lbl.gov/user-information). Beamline 7.3.3 supports hard x-ray scattering techniques, including small- and wide-angle x-ray scattering (SAXS/WAXS) and grazing-incidence SAXS/WAXS (GISAXS/GIWAXS).

## Diagrams

### Sequence Diagram
```mermaid
sequenceDiagram
participant T as Trigger<br/>Components
participant F as Prefect<br/>Flows
participant S as Storage &<br/>Processing

%% Initial Trigger
T->>T: Detector → File Watcher
T->>F: File Watcher triggers Dispatcher
F->>F: Dispatcher coordinates downstream Flows

%% Flow 1: new_file_733
rect rgb(220, 230, 255)
note over F,S: FLOW 1: new_file_733
F->>S: Access data733
S->>S: Globus Transfer to NERSC CFS
S->>S: Ingest metadata to SciCat
end

%% Flow 2: HPSS Transfer
rect rgb(220, 255, 230)
note over F,S: FLOW 2: Scheduled HPSS Transfer
F->>S: Access NERSC CFS
S->>S: SFAPI Transfer to HPSS Tape
S->>S: Ingest metadata to SciCat
end

%% Flow 3: HPC Analysis
rect rgb(255, 230, 230)
note over F,S: FLOW 3: HPC Downstream Analysis
F->>S: Access data733
S->>S: Globus Transfer to HPC
S->>S: Run HPC Compute Processing
S->>S: Return scratch data to data733
end

%% Flow 4: Scheduled Pruning
rect rgb(255, 255, 220)
note over F,S: FLOW 4: Scheduled Pruning
F->>S: Scheduled pruning jobs
S->>S: Prune old files from CFS
S->>S: Prune old files from data733
end
```


### Data Infrastructure Workflows
```mermaid
---
config:
theme: neo
layout: elk
look: neo
---
flowchart LR
subgraph s1["new_file_733 Flow"]
n20["data733"]
n21["NERSC CFS"]
n22@{ label: "SciCat<br>[Metadata Database]" }
end
subgraph s2["HPSS Transfer Flow"]
n38["NERSC CFS"]
n39["HPSS Tape Archive"]
n40["SciCat <br>[Metadata Database]"]
end
subgraph s3["HPC Analysis Flow"]
n41["data733"]
n42["HPC<br>Filesystem"]
n43["HPC<br>Compute"]
end
n23["data733"] -- File Watcher --> n24["Dispatcher<br>[Prefect Worker]"]
n25["Detector"] -- Raw Data --> n23
n24 --> s1 & s2 & s3
n20 -- Raw Data [Globus Transfer] --> n21
n21 -- "<span style=color:>Metadata [SciCat Ingestion]</span>" --> n22
n32["Scheduled Pruning <br>[Prefect Workers]"] --> n35["NERSC CFS"] & n34["data733"]
n38 -- Raw Data [SFAPI Slurm htar Transfer] --> n39
n39 -- "<span style=color:>Metadata [SciCat Ingestion]</span>" --> n40
s2 --> n32
s3 --> n32
s1 --> n32
n41 -- Raw Data [Globus Transfer] --> n42
n42 -- Raw Data --> n43
n43 -- Scratch Data --> n42
n42 -- Scratch Data [Globus Transfer] --> n41
n20@{ shape: internal-storage}
n21@{ shape: disk}
n22@{ shape: db}
n38@{ shape: disk}
n39@{ shape: paper-tape}
n40@{ shape: db}
n41@{ shape: internal-storage}
n42@{ shape: disk}
n23@{ shape: internal-storage}
n24@{ shape: rect}
n25@{ shape: rounded}
n35@{ shape: disk}
n34@{ shape: internal-storage}
n20:::storage
n20:::Peach
n21:::Sky
n22:::Sky
n38:::Sky
n39:::storage
n40:::Sky
n41:::Peach
n42:::Sky
n43:::compute
n23:::collection
n23:::storage
n23:::Peach
n24:::collection
n24:::Rose
n25:::Ash
n32:::Rose
n35:::Sky
n34:::Peach
classDef collection fill:#D3A6A1, stroke:#D3A6A1, stroke-width:2px, color:#000000
classDef Rose stroke-width:1px, stroke-dasharray:none, stroke:#FF5978, fill:#FFDFE5, color:#8E2236
classDef storage fill:#A3C1DA, stroke:#A3C1DA, stroke-width:2px, color:#000000
classDef Ash stroke-width:1px, stroke-dasharray:none, stroke:#999999, fill:#EEEEEE, color:#000000
classDef visualization fill:#E8D5A6, stroke:#E8D5A6, stroke-width:2px, color:#000000
classDef Peach stroke-width:1px, stroke-dasharray:none, stroke:#FBB35A, fill:#FFEFDB, color:#8F632D
classDef Sky stroke-width:1px, stroke-dasharray:none, stroke:#374D7C, fill:#E2EBFF, color:#374D7C
classDef compute fill:#A9C0C9, stroke:#A9C0C9, stroke-width:2px, color:#000000
style s1 stroke:#757575
style s2 stroke:#757575
style s3 stroke:#757575

```

## Data at 7.3.3

The data collected at 7.3.3 are typically 2D scattering images, where each pixel records scattering intensity as a function of scattering angle.

## File Watcher

There is a file watcher on the system `data733` that listens for new scans that have finished writing to disk. From there, a Prefect Flow we call `dispatcher` kicks off the downstream steps (a minimal sketch of the watch-and-dispatch pattern follows the list):
- Copy scans in real time to `NERSC CFS` using Globus Transfer.
- Copy project data to `NERSC HPSS` for long-term storage.
- Run analysis on HPC systems (TBD).
- Schedule data pruning from `data733` and `NERSC CFS`.
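
The sketch below illustrates the pattern, assuming the `watchdog` package and a hypothetical deployment named `dispatcher/run_733_dispatcher` with a `file_path` parameter; the actual watcher and deployment names live in the beamline code.

```python
import time

from prefect.deployments import run_deployment
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

WATCH_DIR = "/data/raw"  # hypothetical data733 path


class NewScanHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory:
            return
        # A production watcher should also wait until the file has
        # finished writing before dispatching.
        run_deployment(
            name="dispatcher/run_733_dispatcher",  # hypothetical deployment name
            parameters={"file_path": event.src_path},
            timeout=0,  # fire and forget; do not block the watcher
        )


if __name__ == "__main__":
    observer = Observer()
    observer.schedule(NewScanHandler(), WATCH_DIR, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```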

## Prefect Configuration

### Registered Flows

#### `dispatcher.py`

The Dispatcher Prefect Flow manages the order and execution of data tasks. As soon as the File Watcher detects that a new file has been written, it calls the `dispatcher()` Flow. In this case, the dispatcher makes a synchronous call to the flow in `move.py`, with the option to add further steps later (e.g. scheduling remote HPC analysis code).
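
A minimal sketch of that dispatch logic (the flow name `process_new_file_733` and its import path are illustrative, not the exact names in `move.py`):

```python
from prefect import flow, get_run_logger

# Hypothetical import; the actual flow name in move.py may differ.
from orchestration.flows.bl733.move import process_new_file_733


@flow(name="dispatcher")
def dispatcher(file_path: str):
    logger = get_run_logger()
    logger.info(f"Dispatching new file: {file_path}")
    # Synchronous subflow call: the dispatcher blocks until the move flow finishes.
    process_new_file_733(file_path=file_path)
    # Further steps (e.g. scheduling remote HPC analysis) would slot in here.


if __name__ == "__main__":
    dispatcher("/data/raw/sample_scan.edf")
```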

#### `move.py`

The flow that processes a new file at BL 7.3.3 (sketched below the list):
1. Copy the file from data733 to NERSC CFS. Ingest the file path into SciCat.
2. Schedule pruning from data733 (ensuring the data is on NERSC before deletion).
3. Copy the file from NERSC CFS to NERSC HPSS. Ingest the file path into SciCat.
4. Schedule pruning from NERSC CFS (ensuring the data is on HPSS before deletion).
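
A minimal sketch of this shape, with placeholder tasks standing in for the real Globus/SFAPI transfer and SciCat ingestion logic. In the repository the `schedule_prefect_flow` helper schedules the prune runs; here `run_deployment` with a `scheduled_time` stands in, and the deployment name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

from prefect import flow, get_run_logger, task
from prefect.deployments import run_deployment


@task
def transfer(file_path: str, source: str, destination: str) -> str:
    # Placeholder: a real implementation submits a Globus transfer
    # (or an SFAPI htar job for HPSS) and polls until it completes.
    return file_path


@task
def ingest_scicat(file_path: str, location: str) -> None:
    # Placeholder for SciCat metadata ingestion.
    pass


@flow(name="new_file_733")
def process_new_file_733(file_path: str, prune_after_days: float = 14.0):
    logger = get_run_logger()
    # 1. Copy from data733 to NERSC CFS, then record the new location in SciCat.
    cfs_path = transfer(file_path, source="data733", destination="nersc_cfs")
    ingest_scicat(cfs_path, location="NERSC CFS")
    # 2. Prune from data733 only after the data is safely on NERSC.
    run_deployment(
        name="prune/prune_data733",  # hypothetical deployment name
        parameters={"file_path": file_path},
        scheduled_time=datetime.now(timezone.utc) + timedelta(days=prune_after_days),
        timeout=0,
    )
    logger.info(f"Scheduled prune of {file_path} in {prune_after_days} days")
```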

### Prefect Server + Deployments

This beamline is starting fresh with `Prefect==3.4.2` (an upgrade from `2.19.5`). With the latest Prefect versions, we can define deployments in a `yaml` file rather than through build/apply steps in a shell script. `create_deployments_733.sh` was the legacy way of registering flows; flows are now defined in `orchestration/flows/bl733/prefect.yaml`. Keeping the Prefect config for a beamline within its flows folder makes it easier to keep track of the Prefect deployments for different beamlines.

Note that we still must create work pools manually before we can register flows to them.

For example, here is how we can now create our deployments:

```bash
# cd to the directory
cd orchestration/flows/bl733/

# add the Prefect API URL (and API key, if your server requires one) to the environment
export PREFECT_API_URL=http://<your-prefect-server-for-bl733>:4200/api

# create the work-pools
prefect work-pool create new_file_733_pool
prefect work-pool create dispatcher_733_pool
prefect work-pool create prune_733_pool

prefect deploy
```

We can also preview a deployment with `prefect deploy --output yaml`, or deploy a single flow with `prefect deploy --name run_733_dispatcher`.

The script `splash_flows/init_work_pools.py` follows the logic above to deploy the flows in a streamlined fashion for the latest version of Prefect.
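
A minimal sketch of what such a script does, shelling out to the Prefect CLI (the pool names come from the example above; the `--type process` flag is an assumption about the worker type):

```python
import subprocess

BL733_DIR = "orchestration/flows/bl733"
WORK_POOLS = ["new_file_733_pool", "dispatcher_733_pool", "prune_733_pool"]

for pool in WORK_POOLS:
    # `work-pool create` errors if the pool already exists, so don't abort.
    subprocess.run(
        ["prefect", "work-pool", "create", pool, "--type", "process"],
        check=False,
    )

# Register every deployment defined in the beamline's prefect.yaml.
subprocess.run(["prefect", "deploy", "--all"], check=True, cwd=BL733_DIR)
```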

## VM Details

The computing backend runs on a VM in the B15 server room that is managed by ALS IT staff.

- **Name**: `flow-733`
- **OS**: `Ubuntu 24.04 LTS`

We are using Ansible to streamline the development and support of this virtual machine. See https://github.com/als-computing/als_ansible/pull/4 for details.


## Data Access for Users

Users can download their data from SciCat, our metadata database, where we keep track of file location history and additional experiment metadata.
6 changes: 4 additions & 2 deletions docs/mkdocs/mkdocs.yml
@@ -13,8 +13,10 @@ nav:
- Home: index.md
- Installation and Requirements: install.md
- Getting Started: getting_started.md
- Compute at ALCF: alcf832.md
- Compute at NERSC: nersc832.md
- Beamline Implementations:
- 7.3.3 SAXS/WAXS/GISAXS: bl733.md
- 8.3.2 Micro Tomography - Compute at ALCF: alcf832.md
- 8.3.2 Micro Tomography - Compute at NERSC: nersc832.md
- Orchestration: orchestration.md
- Configuration: configuration.md
# - Troubleshooting: troubleshooting.md
1 change: 1 addition & 0 deletions init_work_pools.py
@@ -14,6 +14,7 @@
- Deploys all flows defined in the beamline's prefect.yaml.
- Creates/updates Prefect Secret blocks for GLOBUS_CLIENT_ID and GLOBUS_CLIENT_SECRET
if the corresponding environment variables are present. Otherwise warns and continues.

Environment Variables:
BEAMLINE The beamline identifier (e.g., 832). Required.
PREFECT_API_URL Override the Prefect server API URL.
11 changes: 8 additions & 3 deletions init_work_pools.sh
@@ -49,7 +49,7 @@ set -euo pipefail
BEAMLINE="${BEAMLINE:?Must set BEAMLINE (e.g. 832, 733)}"

# Path to the Prefect project file
PREFECT_YAML="/splash_flows/orchestration/flows/bl${BEAMLINE}/prefect.yaml"
PREFECT_YAML="orchestration/flows/bl${BEAMLINE}/prefect.yaml"

if [[ ! -f "$PREFECT_YAML" ]]; then
echo "[Init:${BEAMLINE}] ERROR: Expected $PREFECT_YAML not found!" >&2
@@ -67,10 +67,15 @@ echo "[Init:${BEAMLINE}] Waiting for Prefect server at $PREFECT_API_URL..."
python3 - <<EOF
import os, time, sys
import httpx
beamline = "$BEAMLINE"

api_url = os.environ.get("PREFECT_API_URL", "http://prefect_server:4200/api")
health_url = f"{api_url}/health"
beamline = "$BEAMLINE"
if "api.prefect.cloud" in api_url:
print(f"[Init:{beamline}] Prefect Cloud detected — skipping health check.")
sys.exit(0)
else:
health_url = f"{api_url}/health"
print(f"[Init:{beamline}] Ping self-hosted health endpoint: {health_url}")

for _ in range(60): # try for ~3 minutes
try: