Merged
82 changes: 82 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,82 @@
name: Build and Publish to PyPI

on:
  push:
    tags:
      - 'v*' # Triggers on tags like v0.0.1, v1.0.0

jobs:
  build_wheels:
    name: Build wheels on ${{ matrix.os }}
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        # Mac (Intel & Silicon), Windows, and Linux
        os: [ubuntu-latest, windows-latest, macos-13, macos-14]

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Required for setuptools_scm versioning

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install cibuildwheel
        run: python -m pip install cibuildwheel==2.22.0

      # This prints the plan to the logs before building
      - name: List intended builds
        run: python -m cibuildwheel --print-build-identifiers
        env:
          CIBW_PROJECT_REQUIRES_PYTHON: ">=3.11"
          CIBW_SKIP: "pp* *-win32 *-manylinux_i686"

      - name: Build wheels
        run: python -m cibuildwheel --output-dir wheelhouse
        env:
          # Mirrors requires-python in pyproject.toml so builds for Python < 3.11 are skipped
          CIBW_PROJECT_REQUIRES_PYTHON: ">=3.11"
          # Skip PyPy (pp*) and 32-bit architectures to speed up build
          CIBW_SKIP: "pp* *-win32 *-manylinux_i686"

      - uses: actions/upload-artifact@v4
        with:
          name: cibw-wheels-${{ matrix.os }}-${{ strategy.job-index }}
          path: ./wheelhouse/*.whl

  build_sdist:
    name: Build Source Distribution
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Build sdist
        run: pipx run build --sdist

      - uses: actions/upload-artifact@v4
        with:
          name: cibw-sdist
          path: dist/*.tar.gz

  publish_to_pypi:
    needs: [build_wheels, build_sdist]
    runs-on: ubuntu-latest
    # Only run this job if the tag starts with 'v' (redundant safety check)
    if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v')
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: cibw-*
          path: dist
          merge-multiple: true

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
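
Note on the `CIBW_SKIP` value in the workflow above: it is a space-separated list of glob patterns matched against build identifiers. As a rough illustration only (this is not cibuildwheel's own code, and the identifier list is a hand-picked hypothetical sample rather than real `--print-build-identifiers` output), the filtering can be approximated with Python's `fnmatch`:

```python
from fnmatch import fnmatchcase

# Same value as CIBW_SKIP in the workflow.
CIBW_SKIP = "pp* *-win32 *-manylinux_i686"


def is_skipped(identifier: str, skip: str = CIBW_SKIP) -> bool:
    """Return True if any space-separated glob in `skip` matches the identifier."""
    return any(fnmatchcase(identifier, pattern) for pattern in skip.split())


# Hypothetical sample of build identifiers, chosen for illustration.
sample = [
    "cp311-manylinux_x86_64",
    "cp311-manylinux_i686",
    "cp312-win_amd64",
    "cp312-win32",
    "pp310-macosx_arm64",
    "cp313-macosx_arm64",
]

# Identifiers that survive the skip patterns.
kept = [ident for ident in sample if not is_skipped(ident)]
print(kept)
```

In this sketch the PyPy and 32-bit entries are dropped and the 64-bit CPython entries remain, which is the behaviour the "List intended builds" step is meant to confirm in the logs.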
10 changes: 10 additions & 0 deletions Manifest.in
@@ -0,0 +1,10 @@
# MANIFEST.in
prune scallops/tests
prune tests
prune docs/notebooks
exclude docs/_static/uncorrected.png
exclude docs/_static/corrected.png
exclude docs/_static/flatfield.png
prune docs/_static/css
global-exclude .git*
global-exclude .DS_Store
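
One way to check that the manifest rules above take effect is to list the members of the built sdist and look for paths that should have been left out. A minimal sketch, assuming the archive is a `.tar.gz` produced under `dist/`; the helper names `unwanted_members` and `check_sdist` and the prefix tuple are illustrative and not part of this PR:

```python
import tarfile

# Directory prefixes (relative to the project root) that the manifest is meant to drop.
EXCLUDED_PREFIXES = ("scallops/tests/", "tests/", "docs/notebooks/", "docs/_static/css/")


def unwanted_members(names, prefixes=EXCLUDED_PREFIXES):
    """Return archive member names that fall under an excluded prefix.

    sdist members start with a top-level '<name>-<version>/' directory,
    which is stripped before comparing.
    """
    hits = []
    for name in names:
        _, _, relative = name.partition("/")
        if relative.startswith(prefixes):
            hits.append(name)
    return hits


def check_sdist(path):
    """Open a .tar.gz sdist and report members that should have been excluded."""
    with tarfile.open(path, "r:gz") as archive:
        return unwanted_members(archive.getnames())
```

Calling `check_sdist` on the archive written by `pipx run build --sdist` should return an empty list if the rules are being applied; a non-empty result lists the paths that still ship.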
82 changes: 51 additions & 31 deletions README.md
@@ -1,9 +1,13 @@
<h1 align="center">
<img src="docs/_static/scallopsLogo.png" width="150" alt="logo">
<img src="https://raw.githubusercontent.com/Genentech/scallops/62db13112dee13dc228bd2a458ada5a7a973d2bf/docs/_static/scallopsLogo.png" width="150" alt="logo">
</h1><br>

# SCALLOPS

[![PyPI version](https://badge.fury.io/py/scallops.svg)](https://badge.fury.io/py/scallops)
[![Python Versions](https://img.shields.io/pypi/pyversions/scallops.svg)](https://pypi.org/project/scallops/)
[![License](https://img.shields.io/pypi/l/scallops.svg)](https://raw.githubusercontent.com/Genentech/scallops/refs/heads/main/LICENSE)

## Description
SCALLOPS (Scalable Library for Optical Pooled Screens) is a comprehensive Python package designed to
streamline and scale the analysis of Optical Pooled Screens (OPS) for biological data. With a focus on
@@ -12,25 +16,42 @@ analyzing, and interpreting OPS data, leveraging modern distributed computing fr

## Installation

### Option 1: Install from PyPI (Recommended)
For most users, the easiest way to install SCALLOPS is via pip. This will install the pre-compiled binary wheels for your operating system (Linux, Windows, or macOS).

```bash
pip install scallops

```

*Note: SCALLOPS requires Python 3.11 or newer.*

### Option 2: Install from Source (For Development)

If you wish to contribute to the codebase or need the latest unreleased changes:

1. Clone the repository and change to the scallops directory:
```bash
git clone https://github.com/Genentech/scallops.git
cd scallops

```

```
git clone https://github.com/Genentech/scallops.git
cd scallops
```

1. Install SCALLOPS:
2. Install SCALLOPS in editable mode with dependencies:
```bash
pip install -r requirements.txt -e .

```pip install -r requirements.txt -e .```
```



## Main Focus Areas:

- **High-Throughput Data Processing**: SCALLOPS is built to manage massive datasets typical of OPS
experiments, allowing users to efficiently process and analyze data across multiple scales.
- **Scalability and Performance**: The package is optimized for both local and cloud-based distributed
environments, making it ideal for scaling to large datasets without compromising performance.
* **High-Throughput Data Processing**: SCALLOPS is built to manage massive datasets typical of OPS
experiments, allowing users to efficiently process and analyze data across multiple scales.
* **Scalability and Performance**: The package is optimized for both local and cloud-based distributed
environments, making it ideal for scaling to large datasets without compromising performance.

### Modular Workflows:

@@ -39,30 +60,29 @@ their specific experimental needs.

## Key Features:

- **Efficient Data Handling**: SCALLOPS utilizes advanced memory management and lazy evaluation
techniques, which minimize resource usage while handling large datasets.
- **Command-Line Interface (CLI)**: Automates batch processing and simplifies integration into larger
pipelines.
- **Customizable Outputs**: The package generates versatile outputs, including data visualizations and
summary statistics, which can be integrated into downstream analyses.
- **Notebook Examples**: SCALLOPS includes practical Jupyter notebooks that walk users through typical
workflows, making it easy to get started with real-world datasets.
- **Custom Features**: Advanced users can extend SCALLOPS with their own custom functions and workflows,
ensuring the package can grow with the complexity of the data.
- **Comprehensive API**: SCALLOPS provides a rich API that exposes all the package functionalities,
allowing users to integrate it directly into their own Python scripts and workflows. This makes
SCALLOPS highly adaptable, enabling users to build fully customized data pipelines and analyses
tailored to their unique experimental needs.
* **Efficient Data Handling**: SCALLOPS utilizes advanced memory management and lazy evaluation
techniques, which minimize resource usage while handling large datasets.
* **Command-Line Interface (CLI)**: Automates batch processing and simplifies integration into larger
pipelines.
* **Customizable Outputs**: The package generates versatile outputs, including data visualizations and
summary statistics, which can be integrated into downstream analyses.
* **Notebook Examples**: SCALLOPS includes practical Jupyter notebooks that walk users through typical
workflows, making it easy to get started with real-world datasets.
* **Custom Features**: Advanced users can extend SCALLOPS with their own custom functions and workflows,
ensuring the package can grow with the complexity of the data.
* **Comprehensive API**: SCALLOPS provides a rich API that exposes all the package functionalities,
allowing users to integrate it directly into their own Python scripts and workflows. This makes
SCALLOPS highly adaptable, enabling users to build fully customized data pipelines and analyses
tailored to their unique experimental needs.

## Typical Use Cases:

- **Large-Scale Screening Projects**: SCALLOPS is designed for handling the immense data loads of
genome-wide OPS projects, helping users efficiently identify and quantify biological perturbations.
- **Data-Driven Insights**: SCALLOPS facilitates the discovery of patterns and trends in OPS data,
helping users extract and interpret complex biological systems' data.


* **Large-Scale Screening Projects**: SCALLOPS is designed for handling the immense data loads of
genome-wide OPS projects, helping users efficiently identify and quantify biological perturbations.
* **Data-Driven Insights**: SCALLOPS facilitates the discovery of patterns and trends in OPS data,
helping users extract and interpret complex biological systems' data.

## Contributing to SCALLOPS

We welcome all forms of contributions, including bug reports, documentation improvements, and feature
enhancements.
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -19,8 +19,9 @@ authors = [
dynamic = ["version"]
readme = { file = "README.md", content-type = "text/markdown" }
requires-python = ">=3.11"
license = { file = "LICENSE" }
classifiers = [# https://pypi.python.org/pypi?%3Aaction=list_classifiers
"License :: OSI Approved :: BSD License",
"License :: OSI Approved :: Apache Software License",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"Natural Language :: English",