diff --git a/docs/faq.rst b/docs/faq.rst index 7687785..8dce6bf 100644 --- a/docs/faq.rst +++ b/docs/faq.rst @@ -16,3 +16,15 @@ FAQ Please ensure TMPDIR has execution permissions or temporarily set TMPDIR to a directory that does:: TMPDIR=. pip install scallops + +#. Unable to download model for Stardist or other deep learning models + Some workflow environments (e.g. AWS HealthOmics) prohibit accesssing public websites. + You need to download model files to an accessible location, and set the workflow input, `model_dir` + to this location. + + Example:: + + wget https://github.com/stardist/stardist-models/releases/download/v0.1/python_2D_versatile_fluo.zip + aws s3 cp python_2D_versatile_fluo.zip s3://my-bucket/model/ + + In your OPS workflow input JSON, set `model_dir` to `s3://my-bucket/model/` diff --git a/docs/workflows.rst b/docs/workflows.rst index a106059..66b880e 100644 --- a/docs/workflows.rst +++ b/docs/workflows.rst @@ -24,9 +24,9 @@ Workflow Steps 3. **Stitching**: * Applies the calculated flatfield to the raw tiles. - * Corrects for radial distortion (can be disabled or `k` can be provided). + * Corrects for radial distortion. * Aligns tiles using stage positions and cross-correlation. - * Stitches tiles into a multi-scale OME-Zarr image. + * Stitches tiles into an OME-Zarr image. Inputs ------ @@ -45,13 +45,16 @@ These are the absolute minimum parameters required to run the stitching workflow - Description * - **urls** - Array[String] - - List of directories containing raw images (S3 or local paths). + - List of directories containing raw images (e.g S3 URLs). * - **image_pattern** - String - Regex-like pattern to parse filenames (e.g., ``"Well{well}_Point{point}.nd2"``). * - **output_directory** - String - Base path for outputs. + * - **docker** + - String + - Workflow docker image. .. code-block:: json :caption: Minimal Stitching JSON @@ -59,7 +62,8 @@ These are the absolute minimum parameters required to run the stitching workflow { "urls": ["s3://your-bucket/experiment_data/"], "image_pattern": "20231010_10x_6W_SBS_c{t}/plate{plate}/Well{well}_Point{skip}_{skip}_Channel{skip}_Seq{skip}.nd2", - "output_directory": "s3://your-bucket/experiment_data/stitch/iss/" + "output_directory": "s3://your-bucket/experiment_data/stitch/iss/", + "docker":"772311241819.dkr.ecr.us-west-2.amazonaws.com/scallops:1.0.0" } Full Parameter Reference (Advanced) @@ -118,7 +122,7 @@ Below is the complete list of exposed options, including optional settings for g - Force re-run of stitching even if output exists. * - **Resources** - Various - - ``stitch_cpu``, ``stitch_memory``, ``stitch_disks``, etc. can be set to override defaults. + - ``stitch_cpu``, ``stitch_memory``, etc. can be set to override defaults. Outputs ------- @@ -186,6 +190,9 @@ These are the absolute minimum parameters required to run the OPS workflow. * - **reads_labels** - String - Which segmentation label to assign reads to (e.g., ``"cell"`` or ``"nuclei"``). + * - **docker** + - String + - Workflow docker image. .. code-block:: json :caption: Minimal OPS JSON @@ -196,7 +203,8 @@ These are the absolute minimum parameters required to run the OPS workflow. "phenotype_url": "s3://your-bucket/experiment/stitch/pheno/stitch/stitch.zarr/", "phenotype_dapi_channel": 4, "phenotype_cyto_channel": [6], - "reads_labels": "cell" + "reads_labels": "cell", + "docker":"772311241819.dkr.ecr.us-west-2.amazonaws.com/scallops:1.0.0" } Full Parameter Reference (Advanced) @@ -225,9 +233,6 @@ Below is the complete list of exposed options covering registration, feature ext * - **subset** - Array[String] - Filter specific wells/plates. - * - **batch_size** - - Int - - Number of groups to process in one batch. **Segmentation & Registration** @@ -313,7 +318,7 @@ Below is the complete list of exposed options covering registration, feature ext - Array[Float] - Sigma for Laplacian of Gaussian spot detection. -**Control Flags (Force / Skip)** +**Additional Parameters** .. list-table:: :widths: 30 15 55 @@ -322,19 +327,21 @@ Below is the complete list of exposed options covering registration, feature ext * - Parameter - Type - Description - * - **run_spot_detect** - - Boolean - - Default ``true``. - * - **run_nuclei_segmentation** - - Boolean - - Default ``true``. - * - **run_cell_segmentation** + * - **model_dir** + - String + - Path containing deep learning model resouces (See :doc:`FAQ ` for more details.) + * - **run_``task``** - Boolean - - Default ``true``. - * - **force_merge** + - Set to ``false``, (e.g. run_nuclei_segmentation) to skip task + * - **force_``task``** - Boolean - - Force re-merge even if output exists. - + - Set to ``true``, to re-run task (e.g. force_segment_cell) even if output exists. + * - **Resources** + - Various + - ``segment_nuclei_cpu``, ``segment_nuclei_memory``, etc. can be set to override defaults. + * - **batch_size** + - Int + - Number of groups to process in one batch. Outputs ------- @@ -396,8 +403,7 @@ Create a JSON file (e.g., ``ops_input.json``) defining your inputs. Below is a m "segment_cell_threshold_correction_factor": 1.0, "cell_segmentation_extra_arguments": "--closing-radius 5", - "docker_registry": "123456789012.dkr.ecr.us-region-1.amazonaws.com", - "docker_version": "latest" + "docker": "123456789012.dkr.ecr.us-region-1.amazonaws.com/scallops:latest" } Step 2: Run with miniwdl-omics-run @@ -451,7 +457,7 @@ Suppose you only need to perform image registration without the full segmentatio String moving_image String fixed_image String output_dir - String docker_registry + String docker } # Call the existing Scallops registration task @@ -462,7 +468,7 @@ Suppose you only need to perform image registration without the full segmentatio transform_output_directory = output_dir + "/transforms", moving_output_directory = output_dir + "/registered_images", # Pass through required runtime parameters - docker = docker_registry + "/scallops:latest", + docker = docker, cpu = 4, memory = "16 GiB", # ... (other required inputs like zones, disks, etc.) diff --git a/scallops/tests/test_wdl.py b/scallops/tests/test_wdl.py index c7a3c7a..64e4279 100644 --- a/scallops/tests/test_wdl.py +++ b/scallops/tests/test_wdl.py @@ -55,6 +55,7 @@ def test_stitch_wdl_z_stack(tmp_path): "z_index": "focus", "stitch_radial_correction_k": "none", "output_directory": str(tmp_path / "out"), + "docker": "", } with open(tmp_path / "inputs.json", "wt") as out: @@ -99,6 +100,7 @@ def test_stitch_wdl(tmp_path): "stitch_workflow.urls": [str(input_path)], "stitch_workflow.image_pattern": "{well}-{skip}.zarr", "stitch_workflow.output_directory": str(tmp_path / "out"), + "stitch_workflow.docker": "", } with open(tmp_path / "inputs.json", "wt") as out: @@ -176,6 +178,7 @@ def test_ops_wdl(tmp_path): "ops_workflow.mark_stitch_boundary_cells": False, "ops_workflow.reads_labels": "cell", "ops_workflow.merge_extra_arguments": "--format parquet", + "ops_workflow.docker": "", } with open(tmp_path / "inputs.json", "wt") as out: diff --git a/wdl/ops_workflow.wdl b/wdl/ops_workflow.wdl index 294fd62..576cf33 100644 --- a/wdl/ops_workflow.wdl +++ b/wdl/ops_workflow.wdl @@ -64,7 +64,7 @@ workflow ops_workflow { String? reads_threshold_peaks_crosstalk String? reads_extra_arguments - String model_dir = "s3://bigdipir-ctg-s3/models/" + String model_dir = "" # nuclei segment String? nuclei_segmentation @@ -151,11 +151,11 @@ workflow ops_workflow { String cell_intersects_boundary_disks = "local-disk 200 HDD" - String docker = "563221710766.dkr.ecr.us-west-2.amazonaws.com/external/ctg/scallops:latest" + String docker Int preemptible = 0 String zones = "us-west1-a us-west1-b us-west1-c" - String aws_queue_arn = "arn:aws:batch:us-west-2:752311211819:job-queue/gred" + String aws_queue_arn = "" Int max_retries = 0 String segment_suffix = "segment.zarr" diff --git a/wdl/stitch_workflow.wdl b/wdl/stitch_workflow.wdl index 2a6710f..79a7e31 100644 --- a/wdl/stitch_workflow.wdl +++ b/wdl/stitch_workflow.wdl @@ -42,11 +42,11 @@ workflow stitch_workflow { Boolean? force_illumination_correction Boolean? force_stitch - String docker = "563221710766.dkr.ecr.us-west-2.amazonaws.com/external/ctg/scallops:latest" + String docker Int preemptible = 0 String zones = "us-west1-a us-west1-b us-west1-c" - String aws_queue_arn = "arn:aws:batch:us-west-2:752311211819:job-queue/gred" + String aws_queue_arn = "" Int max_retries = 0 }