@ludwigwinkler
Member

Add Steering functionality to BioEmu

ludwigwinkler and others added 28 commits July 21, 2025 14:16
- Enhanced the `steering.py` module with additional plotting functions for Ca-Ca distances and clashes.
- Introduced a new `steering_run.py` script for testing sample generation with steering, utilizing Hydra for configuration management.
- Created a scratch pad script for testing loss functions visually.
- Updated the test suite in `test_steering.py` to include WandB logging and improved configuration handling with Hydra.
- Removed deprecated code and organized potential classes for better clarity and maintainability.
- Introduced a new `bioemu.mdc` file containing development guidelines for the BioEMU project, covering molecular dynamics, blob storage, curated MD data structure, file paths, analysis patterns, and error handling.
- Added a `load_md.py` script demonstrating how to interact with blob storage and load molecular dynamics data, including trajectory analysis using MDTraj.
- Updated the `run_steering_comparison.py` script to iterate over different particle counts for steering experiments, improving configurability and analysis capabilities.
- Enhanced the `denoiser.py` with a new `euler_maruyama_denoiser` function and integrated it into the testing framework.
- Updated configuration files for steering and denoising to reflect new parameters and functionalities.
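For readers unfamiliar with the integration scheme named in the `denoiser.py` item above, here is a generic Euler-Maruyama integrator. It is a minimal sketch and not the `euler_maruyama_denoiser` added in this PR, whose signature is not shown in this conversation. The function name `euler_maruyama`, its arguments, and the Ornstein-Uhlenbeck test SDE are assumptions chosen only for illustration.

```python
import numpy as np


def euler_maruyama(drift, diffusion, x0, t0, t1, n_steps, rng):
    """Integrate dx = drift(x, t) dt + diffusion(t) dW from t0 to t1."""
    x = np.array(x0, dtype=float)
    dt = (t1 - t0) / n_steps  # negative dt integrates backwards in time
    t = t0
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Deterministic drift step plus a Brownian increment of variance |dt|.
        x = x + drift(x, t) * dt + diffusion(t) * np.sqrt(abs(dt)) * noise
        t += dt
    return x


rng = np.random.default_rng(0)
# Illustrative test SDE (an assumption, not from the PR):
# Ornstein-Uhlenbeck, dx = -x dt + sqrt(2) dW, with stationary law N(0, 1).
samples = euler_maruyama(
    drift=lambda x, t: -x,
    diffusion=lambda t: np.sqrt(2.0),
    x0=np.full(20000, 3.0),
    t0=0.0,
    t1=5.0,
    n_steps=500,
    rng=rng,
)
print(samples.mean(), samples.var())
```

Starting every trajectory at 3.0 and integrating to t = 5 should leave the ensemble close to zero mean and unit variance, which is a quick sanity check for any implementation of the scheme.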
…odule

- Updated the `bioemu.mdc` file to provide a comprehensive development guide, including architectural principles, design patterns, and implementation guidelines for the BioEMU project.
- Added a new `analytical_diffusion.py` module that implements a time-dependent Gaussian Mixture Model for analytical diffusion, including functionality for forward and reverse diffusion processes.
- Refactored the `load_md.py` script by removing unused imports to streamline the code.
- Enhanced the `run_steering_comparison.py` script to improve configurability and analysis of steering experiments, including adjustments to plotting and data handling.
- Introduced a new `stratified_sampling.py` module with tests for stratified resampling functionality.
- Added a `sweep_analysis.py` module for analyzing sweep data from Weights and Biases, including visualization of results.
- Updated the `steering.yaml` configuration to reflect changes in potential parameters for steering functionality.
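The stratified resampling mentioned in the `stratified_sampling.py` item can be sketched as follows. This is an illustration of the standard technique under assumed names (`stratified_resample`, `weights`), not the code from that module, which is not reproduced in this conversation.

```python
import numpy as np


def stratified_resample(weights, rng):
    """Return particle indices drawn by stratified resampling."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    n = len(w)
    # One uniform draw inside each of n equal-width strata of [0, 1).
    positions = (np.arange(n) + rng.uniform(size=n)) / n
    cdf = np.cumsum(w)
    cdf[-1] = 1.0  # guard against floating-point round-off
    # Map each position to the particle whose CDF interval contains it.
    return np.searchsorted(cdf, positions, side="right")


rng = np.random.default_rng(0)
indices = stratified_resample([0.0, 0.5, 0.5, 0.0], rng)
print(indices)  # zero-weight particles 0 and 3 are never selected
```

Compared with plain multinomial resampling, drawing one position per stratum lowers the variance of the number of copies each particle receives, which is the usual reason to prefer it in particle methods.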
…n, and denoiser scripts

- Changed import statement for tqdm from `tqdm.auto` to `tqdm` for consistency across modules.
- Added plt.show() in analyze_termini_distribution function to ensure plots are displayed.
- Commented out plt.show() in main function to prevent automatic display during batch processing.
- Refactored steering module to include ChainBreakPotential and ChainClashPotential, replacing previous distance potentials.
- Updated run_steering_comparison.py to refine steering configurations, including adjustments to num_samples and particle counts.
- Implemented fast steering optimization to delay particle creation until steering start time for improved performance.
- Added validation for steering configurations and assertions to ensure expected steering execution.
- Introduced comprehensive tests for new steering features, including physical and fast steering capabilities.
- Updated steering.yaml configuration to reflect new potential parameters and added end time for steering.
- Introduced a new section in the README for "Steering for Enhanced Physical Realism," detailing the use of Sequential Monte Carlo for guiding protein structure diffusion.
- Added example commands for enabling steering via CLI and Python API, including key parameters and potential configurations.
- Created a new `hydra_run.py` script for running BioEMU sampling with Hydra configuration management, allowing for easier experimentation with steering parameters.
- Updated existing scripts to reflect changes in steering configuration, including renaming parameters for clarity and consistency.
- Added a new `README_hydra_run.md` to document the usage of the Hydra-based entry point.
- Implemented tests for CLI integration, ensuring that steering functionality works as expected through command-line parameters.
- Updated the README to clarify the steering process, including the default behavior for steering potentials and the use of multiple particles.
- Removed references to the now-optional `steering_potentials_config` parameter in example commands and clarified its default behavior.
- Enhanced the sample.py script to load default steering potentials when no custom configuration is provided, improving usability.
- Added warnings for missing default configuration files to aid in troubleshooting.
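The commit items above describe steering as Sequential Monte Carlo over several particles scored by potentials. The reweight-and-resample step at the core of such a scheme can be sketched as below. This is a generic illustration, assuming weights proportional to exp(-energy); the helper names are hypothetical and the way `steering.py` actually computes and applies weights is not shown in this conversation.

```python
import numpy as np


def particle_weights(energies):
    """Normalized weights proportional to exp(-energy), computed stably."""
    e = np.asarray(energies, dtype=float)
    w = np.exp(-(e - e.min()))  # shift so the largest weight is exp(0)
    return w / w.sum()


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2): N for uniform weights, 1 if one particle dominates."""
    w = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(w**2)


def resample(particles, weights, rng):
    """Multinomial resampling: draw N particles with replacement by weight."""
    n = len(weights)
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx]


rng = np.random.default_rng(0)
energies = [0.0, 0.0, 50.0, 50.0]  # two physical particles, two heavily penalized
w = particle_weights(energies)
print(effective_sample_size(w))  # close to 2
survivors = resample(np.arange(4), w, rng)
```

A low effective sample size signals that a few particles carry almost all the weight, which is the usual trigger for resampling in SMC implementations.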
…ctions

- Changed the section title from "Steering for Enhanced Physical Realism" to "Steering structures" for clarity.
- Updated CLI instructions to specify the requirement of setting `--num_steering_particles` to greater than 1 for enabling steering.
- Removed Python API example for steering to streamline the documentation and focus on CLI usage.
…ydra configuration section

- Added a Python API example for steering, demonstrating how to use the `bioemu.sample` module.
- Removed the section detailing the Hydra configuration interface to streamline the documentation and focus on the primary usage methods.
- Added an entry to .gitignore to ignore all files in the docs directory, preventing them from being tracked by Git.
- Updated the `run_steering_comparison.py` and `run_guidance_steering_comparison.py` scripts to streamline steering configuration handling.
- Introduced a new `DisulfideBridgePotential` class for guiding disulfide bridge formation, including parameters for specified cysteine pairs.
- Added a new configuration file for disulfide steering and updated existing steering configurations to reflect changes in potential definitions.
- Enhanced the `sample.py` module to support the new steering configuration structure, allowing for better integration of disulfide bridge steering.
- Implemented tests for the `DisulfideBridgePotential` to ensure correct functionality and energy calculations.
- Introduced new guidance steering configuration and potential for enhanced structural constraints.
- Updated `run_guidance_steering_comparison.py` to support three-way comparison: no steering, resampling only, and guidance steering.
- Added new binary images for visualization of steering comparisons.
- Refactored `run_steering_experiment` to accommodate the new experiment type parameter.
- Enhanced analysis functions to compare termini distances and KL divergence across different steering methods.
- Updated existing steering configurations to include guidance steering options and parameters.
- Improved error handling and logging for better user feedback during experiments.
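The KL-divergence comparison of termini distances mentioned above can be approximated from samples with shared histogram bins. The following is a minimal sketch under assumptions: `histogram_kl`, the bin count, and the smoothing constant `eps` are illustrative choices, not the analysis code from the comparison scripts.

```python
import numpy as np


def histogram_kl(samples_p, samples_q, bins=50, eps=1e-10):
    """Estimate KL(P || Q) from two 1-D samples using shared histogram bins."""
    lo = min(np.min(samples_p), np.min(samples_q))
    hi = max(np.max(samples_p), np.max(samples_q))
    p, edges = np.histogram(samples_p, bins=bins, range=(lo, hi))
    q, _ = np.histogram(samples_q, bins=edges)
    # Smooth with eps so empty bins do not produce log(0), then normalize.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


rng = np.random.default_rng(0)
# Synthetic stand-ins for termini distances from two sampling runs.
unsteered = rng.normal(0.0, 1.0, size=5000)
steered = rng.normal(2.0, 1.0, size=5000)
print(histogram_kl(unsteered, steered))
```

The estimate depends on the number of bins and on `eps`, so values are only comparable across methods when both are held fixed.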
…urations

- Updated `run_guidance_steering_comparison.py` to streamline handling of guidance steering alongside resampling.
- Refactored steering configuration to improve clarity and functionality, including adjustments to learning rates and steps for guidance.
- Modified `dpm_solver` to apply gradient guidance more effectively and ensure correct score calculations.
- Increased sample size in main configuration for better statistical analysis during experiments.
- Updated binary image for visualization of steering comparisons.
- Updated `disulfide_steering_example.py` to include comprehensive steering configurations and improved error handling for guidance steering.
- Added functionality to create and demonstrate various steering configurations, including no steering, resampling only, and guidance steering.
- Enhanced statistical analysis and visualization of Cα-Cα distances for disulfide bridge pairs across different steering methods.
- Introduced new binary images for output visualization of steering comparisons.
- Refactored `run_guidance_steering_comparison.py` to support dynamic parameter adjustments for guidance strength and particle counts during experiments.
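To make the Cα-Cα distance analysis for specified cysteine pairs concrete, the sketch below computes pair distances and a flat-bottom penalty on them. It is an illustration only: the function names, the Ångström units, and the bounds of 4.0 and 7.0 are placeholder assumptions, not the parameters of `DisulfideBridgePotential`.

```python
import numpy as np


def pair_distances(ca_pos, pairs):
    """Ca-Ca distance for each (i, j) residue pair; ca_pos has shape (L, 3)."""
    i, j = np.asarray(pairs).T
    return np.linalg.norm(ca_pos[i] - ca_pos[j], axis=-1)


def flat_bottom_energy(d, lower, upper, k=1.0):
    """Zero inside [lower, upper], quadratic penalty outside."""
    below = np.clip(lower - d, 0.0, None)
    above = np.clip(d - upper, 0.0, None)
    return k * (below**2 + above**2)


# Toy coordinates (assumed to be in Angstrom) for four residues on a line.
ca_pos = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [5.0, 0.0, 0.0]])
pairs = [(0, 3), (0, 2)]  # hypothetical cysteine pairs
d = pair_distances(ca_pos, pairs)
energy = flat_bottom_energy(d, lower=4.0, upper=7.0)  # placeholder bounds
print(d, energy)
```

A flat-bottom form leaves structures inside the accepted range unpenalized, so it only acts on pairs that are too close or too far apart.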
@ludwigwinkler ludwigwinkler self-assigned this Oct 15, 2025
@ludwigwinkler ludwigwinkler changed the title from "Luwinkler/cli steering" to "luwinkler: Steering for BioEmu" Oct 15, 2025
@ludwigwinkler ludwigwinkler requested a review from nw13slx January 20, 2026 09:52
This commit introduces a new test suite in `test_integration.py` that verifies the basic command from the README, steering functionality via CLI parameters, and the processing of steering parameters. The tests ensure that output files are generated correctly and that the integration works end-to-end.
@github-actions
Summary
Generated on: 01/20/2026 - 10:38:21
Parser: Cobertura
Assemblies: 6
Classes: 26
Files: 26
Line coverage: 74.2% (2176 of 2929)
Covered lines: 2176
Uncovered lines: 753
Coverable lines: 2929
Total lines: 9631
Covered branches: 0
Total branches: 0

Coverage

Name                          Line    Branch
src.bioemu                    87.4%   ****
  __init__.py                 100%
  chemgraph.py                100%
  convert_chemgraph.py        97.5%
  denoiser.py                 98.2%
  get_embeds.py               68.6%
  md_utils.py                 85.8%
  model_utils.py              78%
  models.py                   94.1%
  run_hpacker.py              0%
  sample.py                   88.3%
  sde_lib.py                  86.6%
  seq_io.py                   100%
  shortcuts.py                100%
  sidechain_relax.py          77.4%
  so3_sde.py                  91.7%
  steering.py                 76.7%
  structure_module.py         84.3%
  utils.py                    65.6%
src.bioemu.colabfold_setup    ****    ****
  __init__.py
src.bioemu.hpacker_setup      58.8%   ****
  __init__.py
  setup_hpacker.py            58.8%
src.bioemu.openfold.np        44%     ****
  protein.py                  31.2%
  residue_constants.py        60.7%
src.bioemu.openfold.utils     50.1%   ****
  rigid_utils.py              50.1%
src.bioemu.training           100%    ****
  foldedness.py               100%
  loss.py                     100%

@ludwigwinkler
Member Author

@microsoft-github-policy-service agree [company="Microsoft Research"]

@ludwigwinkler
Member Author

@microsoft-github-policy-service agree company="Microsoft"

…in a main guard

This commit restructures the `disulfide_bridge_steering_example.py` file by adding a main guard to encapsulate the sampling logic for structures with and without physicality steering. This improves code organization and allows for easier execution as a standalone script.
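The main-guard pattern described in this commit looks like the following in outline. The function names and the dummy sampling body are hypothetical stand-ins; the actual contents of `disulfide_bridge_steering_example.py` are not shown in this conversation.

```python
def run_sampling(num_samples: int) -> list[int]:
    """Placeholder for the sampling logic; returns dummy sample ids."""
    return list(range(num_samples))


def main() -> None:
    samples = run_sampling(num_samples=4)
    print(f"generated {len(samples)} samples")


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported.
    main()
```

With the guard in place, importing the module (for example from a test) defines the functions without triggering a sampling run.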
@github-actions
Summary
Generated on: 01/20/2026 - 15:38:09
Parser: Cobertura
Assemblies: 6
Classes: 26
Files: 26
Line coverage: 74.2% (2172 of 2925)
Covered lines: 2172
Uncovered lines: 753
Coverable lines: 2925
Total lines: 9617
Covered branches: 0
Total branches: 0

Coverage

Name                          Line    Branch
src.bioemu                    87.3%   ****
  __init__.py                 100%
  chemgraph.py                100%
  convert_chemgraph.py        97.5%
  denoiser.py                 98.2%
  get_embeds.py               68.6%
  md_utils.py                 85.8%
  model_utils.py              78%
  models.py                   94.1%
  run_hpacker.py              0%
  sample.py                   88.3%
  sde_lib.py                  86.6%
  seq_io.py                   100%
  shortcuts.py                100%
  sidechain_relax.py          77.4%
  so3_sde.py                  91.7%
  steering.py                 76.7%
  structure_module.py         84.3%
  utils.py                    65.6%
src.bioemu.colabfold_setup    ****    ****
  __init__.py
src.bioemu.hpacker_setup      58.8%   ****
  __init__.py
  setup_hpacker.py            58.8%
src.bioemu.openfold.np        44%     ****
  protein.py                  31.2%
  residue_constants.py        60.7%
src.bioemu.openfold.utils     50.1%   ****
  rigid_utils.py              50.1%
src.bioemu.training           100%    ****
  foldedness.py               100%
  loss.py                     100%

@github-actions
Summary
Generated on: 01/20/2026 - 16:07:40
Parser: Cobertura
Assemblies: 6
Classes: 26
Files: 26
Line coverage: 74.1% (2165 of 2918)
Covered lines: 2165
Uncovered lines: 753
Coverable lines: 2918
Total lines: 9615
Covered branches: 0
Total branches: 0

Coverage

Name                          Line    Branch
src.bioemu                    87.3%   ****
  __init__.py                 100%
  chemgraph.py                100%
  convert_chemgraph.py        97.5%
  denoiser.py                 98.1%
  get_embeds.py               68.6%
  md_utils.py                 85.8%
  model_utils.py              78%
  models.py                   94.1%
  run_hpacker.py              0%
  sample.py                   88.3%
  sde_lib.py                  86.6%
  seq_io.py                   100%
  shortcuts.py                100%
  sidechain_relax.py          77.2%
  so3_sde.py                  91.7%
  steering.py                 76.7%
  structure_module.py         84.3%
  utils.py                    65.6%
src.bioemu.colabfold_setup    ****    ****
  __init__.py
src.bioemu.hpacker_setup      58.8%   ****
  __init__.py
  setup_hpacker.py            58.8%
src.bioemu.openfold.np        44%     ****
  protein.py                  31.2%
  residue_constants.py        60.7%
src.bioemu.openfold.utils     50.1%   ****
  rigid_utils.py              50.1%
src.bioemu.training           100%    ****
  foldedness.py               100%
  loss.py                     100%

@github-actions
Summary
Generated on: 01/20/2026 - 16:24:10
Parser: Cobertura
Assemblies: 6
Classes: 26
Files: 26
Line coverage: 74.1% (2165 of 2918)
Covered lines: 2165
Uncovered lines: 753
Coverable lines: 2918
Total lines: 9615
Covered branches: 0
Total branches: 0

Coverage

Name                          Line    Branch
src.bioemu                    87.3%   ****
  __init__.py                 100%
  chemgraph.py                100%
  convert_chemgraph.py        97.5%
  denoiser.py                 98.1%
  get_embeds.py               68.6%
  md_utils.py                 85.8%
  model_utils.py              78%
  models.py                   94.1%
  run_hpacker.py              0%
  sample.py                   88.3%
  sde_lib.py                  86.6%
  seq_io.py                   100%
  shortcuts.py                100%
  sidechain_relax.py          77.2%
  so3_sde.py                  91.7%
  steering.py                 76.7%
  structure_module.py         84.3%
  utils.py                    65.6%
src.bioemu.colabfold_setup    ****    ****
  __init__.py
src.bioemu.hpacker_setup      58.8%   ****
  __init__.py
  setup_hpacker.py            58.8%
src.bioemu.openfold.np        44%     ****
  protein.py                  31.2%
  residue_constants.py        60.7%
src.bioemu.openfold.utils     50.1%   ****
  rigid_utils.py              50.1%
src.bioemu.training           100%    ****
  foldedness.py               100%
  loss.py                     100%

@nw13slx nw13slx left a comment

LGTM

@nw13slx nw13slx merged commit 8d65c86 into main Jan 21, 2026
4 checks passed
@nw13slx nw13slx deleted the luwinkler/cli_steering branch January 21, 2026 17:57
5 participants