Merged
71 commits
3b3dd46
feat: add TODO for fk steering implementation in heun_denoiser
ludwigwinkler Jul 21, 2025
3069e93
Merge branch 'main' into luwinkler/fk_steering
ludwigwinkler Jul 22, 2025
6829f31
gitignore test outputs and test case for steering
ludwigwinkler Jul 22, 2025
ec98369
Merge branch 'luwinkler/fk_steering' of os.github.com:microsoft/bioem…
ludwigwinkler Jul 22, 2025
f61a0aa
first prototype
ludwigwinkler Jul 25, 2025
80a1de2
feat: implement Kabsch alignment and enhance steering functionality w…
ludwigwinkler Jul 28, 2025
046aeaf
feat: add wandb to .gitignore to exclude Weights and Biases files
ludwigwinkler Aug 3, 2025
bf583db
Refactor steering module and add new steering run script
ludwigwinkler Aug 19, 2025
7cfaa08
Add BioEMU development rules and enhance load_md script
ludwigwinkler Aug 22, 2025
c8a737b
Enhance BioEMU documentation and introduce new analytical diffusion m…
ludwigwinkler Aug 29, 2025
135b6ef
Update .gitignore to exclude fasta files in notebooks directory
Aug 29, 2025
aea00cd
Update .gitignore to exclude output files from notebooks directory
Aug 29, 2025
af1bcc0
Refactor tqdm imports in analytical_diffusion, run_steering_compariso…
Aug 29, 2025
0d3dd81
Remove .cursor/ from tracking and add to .gitignore
Sep 1, 2025
206860e
Delete .cursor/rules directory
ludwigwinkler Sep 1, 2025
f0fee56
Update .gitignore to exclude all files in .cursor directory
Sep 1, 2025
fc5093d
Merge branch 'luwinkler/fk_steering' of github.com:microsoft/bioemu i…
Sep 1, 2025
4cc3af2
Enhance steering functionality and introduce new potentials
Sep 1, 2025
5452976
first workin prototype of fast_steering
Sep 1, 2025
baeb075
Enhance steering capabilities and update documentation
Sep 3, 2025
40ae947
Refine steering documentation and enhance configuration handling
Sep 3, 2025
a42c5da
Update README to refine steering section and enhance CLI usage instru…
Sep 3, 2025
691dea8
Update README to include Python API example for steering and remove H…
Sep 3, 2025
516046d
Update .gitignore to exclude documentation files
Sep 10, 2025
917c47a
Refactor steering configuration and enhance disulfide bridge potential
Sep 30, 2025
4f92d08
Add guidance steering capabilities and update experiment scripts
Sep 30, 2025
a0bdd14
Enhance guidance steering implementation and update experiment config…
Sep 30, 2025
916f1ba
Enhance guidance steering comparison and update configurations
Oct 1, 2025
eb9fec9
Modify final resampling step in denoiser.py
YuuuXie Oct 17, 2025
a12ffdd
fix x0 type
YuuuXie Oct 20, 2025
389542e
Add physicality steering comparison notebook and update configurations
Nov 4, 2025
467db4a
Merge branch 'luwinkler/cli_steering' of github.com:microsoft/bioemu …
Nov 4, 2025
5bfe137
Update README.md
ludwigwinkler Nov 4, 2025
3072756
Update README.md
ludwigwinkler Nov 4, 2025
9e6c71b
Update README.md
ludwigwinkler Nov 4, 2025
65f44a0
refs for readme on steering and smc
Nov 4, 2025
4c09554
Merge branch 'luwinkler/cli_steering' of github.com:microsoft/bioemu …
Nov 4, 2025
b379b2e
Update notebooks/README_hydra_run.md
ludwigwinkler Nov 4, 2025
ac46972
some renaming files and typos
Nov 4, 2025
da161c3
Merge branch 'luwinkler/cli_steering' of github.com:microsoft/bioemu …
Nov 4, 2025
7aaa635
Remove unused guidance images: deleted `guidance_steering_comparison.…
Nov 4, 2025
9a2a405
Remove profiler module: deleted `src/bioemu/profiler.py` as it is no …
Nov 4, 2025
25b5bfd
Remove unused steering scratch pad: deleted `src/steering_scratch_pad…
Nov 4, 2025
2c97e05
Remove unused denoiser configuration: deleted `src/bioemu/config/deno…
Nov 4, 2025
da475bf
Remove unused guidance steering configuration: deleted `src/bioemu/co…
Nov 4, 2025
16c2da8
Merge branch 'main' into luwinkler/cli_steering
ludwigwinkler Nov 4, 2025
741d69d
Update .gitignore to exclude additional files, enhance pyproject.toml…
Dec 3, 2025
e28c475
simplified steering logic and first prototype for integrated sampling…
Dec 3, 2025
4d6455e
cleaned up notebooks and configs
Dec 8, 2025
60e92ae
Refactor steering configuration and update README for clarity. Remove…
Dec 16, 2025
46161fd
Remove commented TODO regarding embedding file copying in get_embeds.…
Dec 16, 2025
480b405
fix formatting
Dec 16, 2025
9252860
Fix save_pdb_and_xtc function to use the first element of pos_angstro…
Dec 16, 2025
885b4f2
fixing small deviations in the code to reduce mental load of the revi…
Dec 16, 2025
479d84b
fixing diverging code
Dec 16, 2025
0ac2b42
Update steering parameters in README.md to reflect new default values…
Dec 16, 2025
3312cd4
Refactor steering parameters and update configuration for consistency…
Jan 12, 2026
09f0850
Add disulfide bridge steering potential and example scripts for compa…
Jan 12, 2026
7b34735
Merge branch 'main' into luwinkler/cli_steering
ludwigwinkler Jan 12, 2026
cc20170
Add type hints to log_physicality and potential_loss_fn functions for…
Jan 12, 2026
6095451
Refactor logging in log_physicality function and add logger initializ…
Jan 12, 2026
1923313
ReadMe update and typo fix
Jan 12, 2026
9910f82
reverted back to no return sample() function
Jan 12, 2026
1c48ebf
Remove sample result validation from steering tests for cleaner output
Jan 12, 2026
e2360dd
Update steering section in README to clarify the use of steering part…
Jan 12, 2026
5aba79a
black formatter pre-commit hook
Jan 20, 2026
99fe49b
small ReadMe update
Jan 20, 2026
ccaa477
Add integration tests for CLI functionality in BioEMU
Jan 20, 2026
08967a0
Refactor disulfide bridge steering example to encapsulate main logic …
Jan 20, 2026
0e5523c
Refactor denoiser and steering potentials to streamline input paramet…
Jan 20, 2026
dd5d1f7
Merge branch 'main' into luwinkler/cli_steering
ludwigwinkler Jan 20, 2026
13 changes: 13 additions & 0 deletions .gitignore
@@ -4,6 +4,8 @@
#vs code
.vscode
.vscode/
.github/copilot-instructions.md
outputs/

# Jetbrains
.idea
@@ -132,3 +134,14 @@ cython_debug/
*.pth
*.pt

# wandb & various
wandb
wandb/
.amltconfig
*/wandb/*
*.npz
*.pkl
*amlt*
*outputs*
*cache*

51 changes: 51 additions & 0 deletions README.md
@@ -20,6 +20,7 @@ This repository contains inference code and model weights.
## Table of Contents
- [Installation](#installation)
- [Sampling structures](#sampling-structures)
- [Steering to avoid chain breaks and clashes](#steering-to-avoid-chain-breaks-and-clashes)
- [Azure AI Foundry](#azure-ai-foundry)
- [Training data](#training-data)
- [Get in touch](#get-in-touch)
@@ -66,6 +67,56 @@ By default, unphysical structures (steric clashes or chain discontinuities) will

This code only supports sampling structures of monomers. You can try to sample multimers using the [linker trick](https://x.com/ag_smith/status/1417063635000598528), but in our limited experiments, this has not worked well.

## Steering to avoid chain breaks and clashes

BioEmu includes a [steering system](https://arxiv.org/abs/2501.06848) that uses [Sequential Monte Carlo (SMC)](https://www.stats.ox.ac.uk/~doucet/doucet_defreitas_gordon_smcbookintro.pdf) to guide the diffusion process toward more physically plausible protein structures.
Empirically, using three (or up to 10) steering particles per output sample greatly reduces the number of unphysical samples (steric clashes or chain breaks) produced by the model.
Steering applies potential energy functions during denoising to favor conformations that satisfy physical constraints.
Algorithmically, steering simulates multiple candidate samples (*particles*) per desired output sample and resamples among these particles according to how favorably the provided potentials score them.
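
The resampling idea can be illustrated with a minimal, self-contained sketch. It is not BioEmu's implementation: the function name `resample_particles`, the `temperature` parameter, and the example energies are hypothetical, and the weighting `exp(-energy / temperature)` is a common simplification of how SMC methods turn potentials into resampling weights.

```python
import math
import random


def resample_particles(energies, rng, temperature=1.0):
    """Return the indices of the particles kept after one multinomial resampling step.

    energies: one potential energy per particle (lower means more favorable).
    Weights are proportional to exp(-energy / temperature) (an assumed choice).
    """
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    # Draw len(energies) indices with replacement, proportional to the weights.
    return rng.choices(range(len(energies)), weights=weights, k=len(energies))


rng = random.Random(0)
energies = [0.1, 5.0, 0.2, 50.0]  # hypothetical energies for 4 particles
kept = resample_particles(energies, rng)
print(kept)  # low-energy particles are duplicated, high-energy ones dropped
```

Particles with high potential energy receive vanishing weight and are replaced by copies of more favorable ones, which is why a handful of particles per output sample can already suppress unphysical structures.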

### Quick start with steering

Enable steering with physical constraints using the CLI:

```bash
python -m bioemu.sample \
    --sequence GYDPETGTWG \
    --num_samples 100 \
    --output_dir ~/steered-samples \
    --steering_config src/bioemu/config/steering/physical_steering.yaml \
    --denoiser_config src/bioemu/config/denoiser/stochastic_dpm.yaml
```

Or using the Python API:

```python
from bioemu.sample import main as sample

sample(
    sequence='GYDPETGTWG',
    num_samples=100,
    output_dir='~/steered-samples',
    denoiser_config="../src/bioemu/config/denoiser/stochastic_dpm.yaml",  # Use stochastic DPM
    steering_config="../src/bioemu/config/steering/physical_steering.yaml",  # Use physical steering
)
```

### Key steering parameters

- `num_steering_particles`: Number of particles per output sample (1 disables steering; values greater than 1 enable it)
- `steering_start_time`: Diffusion time at which steering starts (0.0-1.0, default: 0.1); sampling runs in reverse, from time 1 to 0
- `steering_end_time`: Diffusion time at which steering stops (0.0-1.0, default: 0.0); sampling runs in reverse, from time 1 to 0
- `resampling_interval`: How often particles are resampled (default: 1)
- `steering_config`: Path to the potentials configuration file (required for steering)
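
How the time window and the resampling interval interact can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper names `steering_is_active` and `resampling_steps` are hypothetical, the window bounds are assumed to be inclusive, and `resampling_interval` is assumed to count denoising steps inside the window.

```python
def steering_is_active(t, steering_start_time=0.1, steering_end_time=0.0):
    """True if diffusion time t lies inside the steering window.

    Sampling runs in reverse from t = 1 to t = 0, so the window is
    steering_end_time <= t <= steering_start_time (bounds assumed inclusive).
    """
    return steering_end_time <= t <= steering_start_time


def resampling_steps(times, resampling_interval=1, **window):
    """Indices of the steps at which particles would be resampled."""
    active = [i for i, t in enumerate(times) if steering_is_active(t, **window)]
    # Resample on every `resampling_interval`-th active step (an assumption).
    return active[::resampling_interval]


num_steps = 10
times = [1.0 - i / num_steps for i in range(num_steps + 1)]  # 1.0, 0.9, ..., 0.0
print(resampling_steps(times, steering_start_time=0.5, resampling_interval=2))
```

With the documented defaults (`steering_start_time=0.1`, `steering_end_time=0.0`), only the final tenth of the reverse trajectory would be steered under this reading.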

### Available potentials

The [`physical_steering.yaml`](./src/bioemu/config/steering/physical_steering.yaml) configuration provides potentials for physical realism:
- **ChainBreak**: Prevents backbone discontinuities
- **ChainClash**: Avoids steric clashes between non-neighboring residues
- **DisulfideBridge**: Encourages disulfide bond formation between specified cysteine pairs
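
To make the first two potentials concrete, here is a rough sketch of what a chain-break and a clash penalty could compute on Cα coordinates given in nm. The function names, the flat-bottom quadratic form, and every threshold except the roughly 0.38 nm spacing of consecutive Cα atoms are assumptions for illustration; the actual BioEmu potentials may be defined differently.

```python
import math


def chain_break_energy(ca, ideal=0.38, tol=0.05):
    """Flat-bottom penalty on consecutive Ca-Ca distances (nm).

    No penalty while the distance stays within ideal +/- tol (tol is assumed).
    """
    energy = 0.0
    for a, b in zip(ca, ca[1:]):
        excess = max(0.0, abs(math.dist(a, b) - ideal) - tol)
        energy += excess**2
    return energy


def chain_clash_energy(ca, min_dist=0.35, min_separation=2):
    """Penalty for non-neighboring Ca atoms closer than min_dist (nm, assumed)."""
    energy = 0.0
    for i in range(len(ca)):
        for j in range(i + min_separation, len(ca)):
            overlap = max(0.0, min_dist - math.dist(ca[i], ca[j]))
            energy += overlap**2
    return energy


# Hypothetical toy chains: a straight chain, one with a gap, one folded onto itself.
straight = [(0.38 * k, 0.0, 0.0) for k in range(5)]
broken = straight[:3] + [(x + 1.0, y, z) for x, y, z in straight[3:]]
folded = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (0.38, 0.38, 0.0), (0.1, 0.1, 0.0)]

print(chain_break_energy(straight), chain_break_energy(broken))
print(chain_clash_energy(straight), chain_clash_energy(folded))
```

Both penalties are zero for the straight chain and positive for the defective ones, which is the property a resampling step needs in order to prefer physically plausible particles.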

To steer the system with your own potentials, create a custom steering configuration YAML file that instantiates them and pass it via `steering_config`.
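
As a loose illustration of what a user-defined potential might look like, the sketch below defines a flat-bottom distance restraint on selected residue pairs. Everything here is hypothetical: the class name `PairDistanceRestraint`, its fields, and the default distances are invented for this example, and it does not use BioEmu's potential base class or YAML schema, neither of which is shown in this excerpt.

```python
import math
from dataclasses import dataclass


@dataclass
class PairDistanceRestraint:
    """Hypothetical potential: flat-bottom restraint on selected Ca-Ca pairs."""

    pairs: list  # 0-indexed residue pairs, e.g. [(2, 39)]
    target: float = 0.55  # nm, assumed target distance
    tol: float = 0.15  # nm, no penalty within target +/- tol
    weight: float = 1.0

    def __call__(self, ca):
        """Return the restraint energy for one structure (list of Ca coordinates)."""
        energy = 0.0
        for i, j in self.pairs:
            excess = max(0.0, abs(math.dist(ca[i], ca[j]) - self.target) - self.tol)
            energy += self.weight * excess**2
        return energy


restraint = PairDistanceRestraint(pairs=[(0, 2)])
near = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (0.6, 0.0, 0.0)]  # pair at 0.6 nm
far = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (2.0, 0.0, 0.0)]  # pair at 2.0 nm
print(restraint(near), restraint(far))
```

A real custom potential would need to follow the interface expected by the steering code and be referenced from the YAML file; consult the shipped steering configurations for the exact format.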

## Azure AI Foundry
BioEmu is also available on [Azure AI Foundry](https://ai.azure.com/). See [How to run BioEmu on Azure AI Foundry](AZURE_AI_FOUNDRY.md) for more details.
111 changes: 111 additions & 0 deletions notebooks/disulfide_bridge_steering_example.py
@@ -0,0 +1,111 @@
"""Script to compare sampling with and without disulfide bridge steering."""

import logging
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import torch

from bioemu.sample import main as sample_main

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def bridge_distances(pos: torch.Tensor, bridge_indices: list[tuple[int, int]]) -> torch.Tensor:
    """Compute Ca-Ca distances for specified disulfide bridge indices.

    Args:
        pos (torch.Tensor): Tensor of shape (N, L, 3) holding Ca positions.
        bridge_indices: 0-indexed residue pairs (i, j), one per bridge.

    Returns:
        torch.Tensor: Distances of shape (N, num_bridges).
    """
    distances = []
    for i, j in bridge_indices:
        dist_ij = torch.norm(pos[:, i, :] - pos[:, j, :], dim=-1)  # (N,)
        distances.append(dist_ij)
    return torch.stack(distances, dim=-1)  # (N, num_bridges)


if __name__ == "__main__":

    # https://www.uniprot.org/uniprotkb/P01542/entry#sequences
    # TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN
    # PTM = [(3,40), (4,32), (16, 26)]
    bridge_indices = [(2, 39), (3, 31), (15, 25)]  # adjusted by -1 to be 0-indexed

    # Sample 128 structures with and without disulfide bridge steering.

    # Configuration
    sequence = "TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN"  # Example sequence
    num_samples = 128
    base_output_dir = Path("comparison_outputs_disulfide")

    # Sample WITHOUT steering
    logger.info("=" * 80)
    logger.info("Sampling WITHOUT steering...")
    logger.info("=" * 80)
    output_dir_no_steering = base_output_dir / "no_steering"
    sample_main(
        sequence=sequence,
        num_samples=num_samples,
        output_dir=output_dir_no_steering,
        batch_size_100=500,
        denoiser_config="../src/bioemu/config/denoiser/stochastic_dpm.yaml",  # Use stochastic DPM
        steering_config=None,  # No steering
    )
    # Only the first batch file is loaded for the comparison.
    pos_unsteered = torch.from_numpy(
        np.load(list(output_dir_no_steering.glob("batch_*.npz"))[0])["pos"]
    )

    unsteered_bridge_distances = bridge_distances(pos_unsteered, bridge_indices)

    # Sample WITH steering
    logger.info("=" * 80)
    logger.info("Sampling WITH disulfide bridge steering...")
    logger.info("=" * 80)
    output_dir_with_steering = base_output_dir / "with_steering"
    sample_main(
        sequence=sequence,
        num_samples=num_samples,
        output_dir=output_dir_with_steering,
        denoiser_config="../src/bioemu/config/denoiser/stochastic_dpm.yaml",  # Use stochastic DPM
        steering_config="../src/bioemu/config/steering/disulfide_bridge_steering.yaml",  # Use disulfide bridge steering
    )

    pos_steered = torch.from_numpy(
        np.load(list(output_dir_with_steering.glob("batch_*.npz"))[0])["pos"]
    )

    steered_bridge_distances = bridge_distances(pos_steered, bridge_indices)
    logger.info("=" * 80)
    logger.info("Comparison complete!")
    logger.info(f"Results without steering: {output_dir_no_steering}")
    logger.info(f"Results with steering: {output_dir_with_steering}")
    logger.info("=" * 80)

    # Distances are assumed to be in nm, consistent with the axis limits below
    fig, ax = plt.subplots(1, 2, figsize=(16, 8))
    # Left panel: full distance range
    ax[0].hist(
        unsteered_bridge_distances.numpy().flatten(), bins=50, alpha=0.5, label="No Steering"
    )
    ax[0].hist(
        steered_bridge_distances.numpy().flatten(), bins=50, alpha=0.5, label="With Steering"
    )
    ax[0].legend()
    ax[0].set_xlim(0, 5)
    ax[0].set_xlabel("Cα-Cα Distance (nm)")
    ax[0].grid()
    # Right panel: zoom on the short-distance region
    ax[1].hist(
        unsteered_bridge_distances.numpy().flatten(), bins=100, alpha=0.5, label="No Steering"
    )
    ax[1].hist(
        steered_bridge_distances.numpy().flatten(), bins=100, alpha=0.5, label="With Steering"
    )
    ax[1].legend()
    ax[1].set_xlim(0.25, 1)
    ax[1].set_xlabel("Cα-Cα Distance (nm)")
    ax[1].grid()
54 changes: 54 additions & 0 deletions notebooks/physical_steering_example.py
@@ -0,0 +1,54 @@
"""Script to compare sampling with and without physicality steering."""

import logging
from pathlib import Path

from bioemu.sample import main as sample_main

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def main():
    """Sample 128 structures with and without physicality steering."""

    # Configuration
    sequence = "MTEIAQKLKESNEPILYLAERYGFESQQTLTRTFKNYFDVPPHKYRMTNMQGESRFLHPL"  # Example sequence
    num_samples = 128
    base_output_dir = Path("comparison_outputs")

    # Sample WITHOUT steering
    logger.info("=" * 80)
    logger.info("Sampling WITHOUT steering...")
    logger.info("=" * 80)
    output_dir_no_steering = base_output_dir / "no_steering"
    sample_main(
        sequence=sequence,
        num_samples=num_samples,
        output_dir=output_dir_no_steering,
        denoiser_config="../src/bioemu/config/denoiser/stochastic_dpm.yaml",  # Use stochastic DPM
        steering_config=None,  # No steering
    )

    # Sample WITH steering
    logger.info("=" * 80)
    logger.info("Sampling WITH physicality steering...")
    logger.info("=" * 80)
    output_dir_with_steering = base_output_dir / "with_steering"
    sample_main(
        sequence=sequence,
        num_samples=num_samples,
        output_dir=output_dir_with_steering,
        denoiser_config="../src/bioemu/config/denoiser/stochastic_dpm.yaml",  # Use stochastic DPM
        steering_config="../src/bioemu/config/steering/physical_steering.yaml",  # Use physicality steering
    )

    logger.info("=" * 80)
    logger.info("Comparison complete!")
    logger.info(f"Results without steering: {output_dir_no_steering}")
    logger.info(f"Results with steering: {output_dir_with_steering}")
    logger.info("=" * 80)


if __name__ == "__main__":
    main()