DeepFoldProtein/af3-input

AlphaFold 3 Input Generator

This repository provides Python scripts that augment AlphaFold 3 input JSON files with multiple sequence alignments (MSAs) generated by either MMseqs2 or plmMSA. The scripts automate adding the MSA data to the input JSON, so it does not have to be inserted by hand.

Installation

To set up the environment, run the following command from the repository root to install the package and its Python dependencies:

pip install .

Usage

These scripts are designed to process existing AlphaFold 3 input JSON files. An example of a basic input JSON structure is shown below:

{
  "name": "2PV7",
  "sequences": [
    {
      "protein": {
        "id": ["A", "B"],
        "sequence": "GMRESYANENQFGFKTINSDIHKIVIVGGYGKLGGLFARYLRASGYPISILDREDWAVAESILANADVVIVSVPINLTLETIERLKPYLTENMLLADLTSVKREPLAKMLEVHTGAVLGLHPMFGADIASMAKQVVVRCDGRFPERYEWLLEQIQIWGAKIYQTNATEHDHNMTYIQALRHFSTFANGLHLSKQPINLANLLALSSPIYRLELAMIGRLFAQDAELYADIIMDKSENLAVIETLKQTYDEALTFFENNDRQGFIDAFHKVRDWFGDYSEQFLKESRQLLQQANDLKQG"
      }
    }
  ],
  "modelSeeds": [1],
  "dialect": "alphafold3",
  "version": 1
}
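
As a rough illustration of the structure above, the following sketch checks that the top-level keys from the example are present and collects the protein sequences. The helper name `check_af3_input` and the truncated sequence are hypothetical and are not part of this repository.

```python
import json

# Top-level keys that appear in the example input above.
REQUIRED_KEYS = ("name", "sequences", "modelSeeds", "dialect", "version")


def check_af3_input(data: dict) -> list[str]:
    """Return the protein sequences in an AlphaFold 3 input dict.

    Raises ValueError if a top-level key from the example is missing
    or if the dialect is not "alphafold3".
    """
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing top-level keys: {missing}")
    if data["dialect"] != "alphafold3":
        raise ValueError(f"unexpected dialect: {data['dialect']!r}")
    # Only protein entries are collected; other entry types are skipped.
    return [
        entry["protein"]["sequence"]
        for entry in data["sequences"]
        if "protein" in entry
    ]


# Same layout as the example above; the sequence is truncated for brevity.
EXAMPLE = json.loads("""
{
  "name": "2PV7",
  "sequences": [
    {"protein": {"id": ["A", "B"], "sequence": "GMRESYANENQFGFKTINSD"}}
  ],
  "modelSeeds": [1],
  "dialect": "alphafold3",
  "version": 1
}
""")

print(check_af3_input(EXAMPLE))
```

A check of this kind is only a convenience for catching malformed files early; it does not replace the validation that AlphaFold 3 itself performs.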

Adding MMseqs2 MSA

The mmseqs.py script facilitates the integration of MSAs generated using MMseqs2 into your AlphaFold 3 input JSON.

Command-line usage:

af3-mmseqs <input_json> \
    [--output_json <output_json>] \
    [--host_url <host_url>]

Arguments:

  • <input_json>: Specifies the path to the AlphaFold 3 input JSON file you wish to modify.
  • [--output_json <output_json>]: (Optional) Defines the path for the output JSON file with the added MSA. If omitted, the output will be directed to standard output (/dev/stdout).
  • [--host_url <host_url>]: (Optional) Sets the URL for the MMseqs API server. The default value is https://api.colabfold.com/.
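
The argument list above can be mirrored with `argparse` as in the minimal sketch below. It is not the repository's actual parser; the function name `build_parser` and the help texts are assumptions, while the argument names and defaults are those documented above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented af3-mmseqs arguments (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="af3-mmseqs",
        description="Add an MMseqs2 MSA to an AlphaFold 3 input JSON.",
    )
    # Required positional argument: the JSON file to modify.
    parser.add_argument("input_json", help="Path to the AlphaFold 3 input JSON")
    # Optional output path; the documented default is standard output.
    parser.add_argument("--output_json", default="/dev/stdout",
                        help="Path for the output JSON")
    # Optional API server URL with the documented default.
    parser.add_argument("--host_url", default="https://api.colabfold.com/",
                        help="URL of the MMseqs API server")
    return parser


# Passing an explicit list avoids reading the real command line.
args = build_parser().parse_args(["examples/2PV7.json"])
print(args.input_json, args.output_json, args.host_url)
```

With only the positional argument given, `output_json` and `host_url` fall back to the defaults listed above.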

Adding plmMSA MSA

The add_plmmsa_msa.py script enables the addition of MSAs generated using plmMSA to your AlphaFold 3 input JSON.

Command-line usage:

af3-plmmsa --input_json <input_json> \
    [--output_json <output_json>] \
    [--output_a3m <output_a3m>] \
    [--host_url <host_url>] \
    [--use_pairing]

Arguments:

  • --input_json <input_json>: Specifies the path to the AlphaFold 3 input JSON file.
  • [--output_json <output_json>]: (Optional) Sets the output path for the modified JSON file. Defaults to standard output (/dev/stdout).
  • [--output_a3m <output_a3m>]: (Optional) Appears in the usage line above; judging by its name, it sets the path to which the generated MSA is written in A3M format.
  • [--host_url <host_url>]: (Optional) Specifies the URL of the plmMSA API server. The default is https://deepfold.com/api/colab.
  • [--use_pairing]: (Optional) A flag to indicate whether paired MSA data should be used.
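
To make the effect of adding an MSA concrete, the sketch below attaches A3M text to the protein entries of an input dict. The field names `unpairedMsa` and `pairedMsa` come from the AlphaFold 3 input format documentation; the function `attach_msa`, its parameters, and the choice to leave `pairedMsa` empty when no paired alignment is supplied are assumptions for illustration, not a description of how add_plmmsa_msa.py works internally.

```python
import copy
from typing import Optional


def attach_msa(
    data: dict,
    unpaired_a3m: dict[str, str],
    paired_a3m: Optional[dict[str, str]] = None,
) -> dict:
    """Return a copy of an AlphaFold 3 input dict with MSAs attached.

    unpaired_a3m and paired_a3m map a protein sequence to its A3M text.
    paired_a3m would be passed only when pairing is requested
    (for example, when a flag such as --use_pairing is given).
    """
    result = copy.deepcopy(data)  # leave the caller's dict untouched
    for entry in result["sequences"]:
        protein = entry.get("protein")
        if protein is None:
            continue  # skip non-protein entries
        a3m = unpaired_a3m.get(protein["sequence"])
        if a3m is None:
            continue  # no alignment available for this sequence
        protein["unpairedMsa"] = a3m
        # Assumption: an empty string marks "no paired MSA supplied".
        protein["pairedMsa"] = (paired_a3m or {}).get(protein["sequence"], "")
    return result


# Tiny hypothetical input and alignment, for demonstration only.
sample = {"sequences": [{"protein": {"id": ["A"], "sequence": "ACDE"}}]}
msa_text = ">query\nACDE\n>hit1\nAC-E\n"
updated = attach_msa(sample, {"ACDE": msa_text})
print(updated["sequences"][0]["protein"]["unpairedMsa"])
```

Working on a deep copy keeps the original input reusable, which is convenient when the same file is processed with and without pairing.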

Examples

af3-plmmsa --input_json examples/2PV7.json > input.json

Contribution Guidelines

We welcome contributions to enhance this project! If you'd like to contribute, please follow these guidelines:

  1. Fork the repository: Create your own fork of this repository.
  2. Create a branch: Make your changes in a dedicated branch (e.g., feature/new-functionality or bugfix/issue-123).
  3. Follow coding standards: Adhere to the existing Python coding style (PEP 8 is recommended).
  4. Write tests: If you add new features or fix bugs, please include relevant unit tests to ensure the functionality works as expected.
  5. Document your changes: Update the README.md and any relevant documentation to reflect your contributions.
  6. Submit a pull request: Once you've made your changes and are satisfied, submit a pull request to the main repository. Clearly describe the changes you've made and why they are beneficial.

We appreciate your contributions!

License

MIT License
