MPNN AA type bias (#8)
- Initial plan
- Expose bias matrix inputs for MPNN
- Refine bias map assignment
- Optimize bias application
- Wire bias matrix into vanilla MPNN CLI
- Tighten bias docs and errors
- Add bias matrix JSON example
- Fix bias matrix example indices
- Expose MPNN bias
- Update README to remove bias JSON examples: removed examples for bias matrix JSON and bias AA JSON from README.
Co-authored-by: JinyuanSun 33344948+JinyuanSun@users.noreply.github.com
Co-authored-by: copilot-swe-agent[bot] 198982749+Copilot@users.noreply.github.com
PXDesignBench: A Unified Evaluation Suite for Protein Design
This repository provides a comprehensive suite of tools for protein design evaluation, integrating multiple state-of-the-art models with standardized pipelines. It supports both monomer and binder design, enabling thorough assessment across diverse aspects of protein design.
📂 Repository Structure
The codebase is organized into three main components:
- metrics: Scripts for evaluating multiple aspects of protein design, including sequence quality, structure quality, and designability.
- tasks: Pipelines for executing specific protein design evaluations (e.g., monomer, binder).
- tools: Wrappers for external models (e.g., Protenix, ProteinMPNN, AlphaFold2, ESMFold) to streamline integration.

Supported Tasks & Tools
🔹 Protenix
📦 Installation
PXDesignBench supports two installation methods:
✅ One-Click Installation Script (Recommended)
We provide an installation script, install.sh, that sets up an environment and installs all dependencies.
What the installer will do
Supported options
Example:
🐳 Docker-Based Installation
Step 1: Build the Docker Image
Step 2: Start the Container
Step 3: Install PXDesignBench in the Container
Inside the container:
📥 Download Required Model Weights (Required)
PXDesignBench relies on several external pretrained models (e.g., AF2 and ProteinMPNN) for evaluation.
These weights are not bundled with the Python package and must be downloaded manually.
After installing PXDesignBench, run:
This script will automatically download and organize all required pretrained weights for:
Model weights for external tools are expected to be organized in a directory as follows:
Note: Required Protenix files (weights, CCD files, etc.) will be auto-downloaded on the first evaluation run.
🚀 Running the Evaluation
We provide demo scripts for both monomer and binder design evaluation.
Monomer evaluation example:
Binder evaluation example:
Input Formats
PXDesignBench supports multiple input modes, allowing you to evaluate protein designs flexibly.
The basic CLI arguments are:
- --data_dir: Directory containing input structures.
- --dump_dir: Output directory for evaluation results.
- --is_mmcif: Flag indicating whether input files are in mmCIF format (otherwise assumed PDB).

JSON-based Input
One can also provide a JSON configuration file describing the evaluation task. This format allows fine-grained control over task parameters and is particularly useful for batch evaluation.
Example JSON:
Key points:
- For binder evaluation, binder_chains must be explicitly specified.
- pdb_names defines the exact structures to evaluate. If omitted, all files in pdb_dir with valid suffixes will be evaluated.

Directory-based Input
Instead of JSON, one may provide a directory path directly to --data_dir. In this case:
- If file_name_list is provided, only matching files will be evaluated.

ProteinMPNN Bias Control
PXDesignBench exposes ProteinMPNN bias inputs through the task configs. Both monomer and binder configs accept:
- bias_aa_json: JSON file mapping amino acid letters to bias values applied at all positions.

Example usage:
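A minimal sketch of what such a file might contain and how a position-independent bias shifts sampling probabilities. It assumes the values are additive log-odds applied to the per-position logits before the softmax, which is how upstream ProteinMPNN treats its global amino acid bias; how PXDesignBench applies them internally is not shown in this README. The helper names (write_bias_aa_json, apply_global_bias) and the file name bias_aa.json are hypothetical.

```python
import json
import math
import os
import tempfile

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter codes
VALID_LETTERS = set(ALPHABET)


def write_bias_aa_json(bias, path):
    """Validate a {amino acid letter: bias} mapping and write it as JSON.

    Hypothetical helper: it only illustrates the file shape described above.
    """
    for aa, value in bias.items():
        if aa not in VALID_LETTERS:
            raise ValueError(f"unknown amino acid letter: {aa!r}")
        if not isinstance(value, (int, float)):
            raise ValueError(f"bias for {aa!r} must be a number")
    with open(path, "w") as fh:
        json.dump(bias, fh, indent=2)


def apply_global_bias(logits, bias):
    """Add the same per-letter bias at every position, then softmax.

    `logits` is a list with one {letter: logit} dict per position.
    Assumption: the bias is additive in logit space.
    """
    probs = []
    for position in logits:
        shifted = {aa: position[aa] + bias.get(aa, 0.0) for aa in position}
        peak = max(shifted.values())  # subtract the max for numerical stability
        exp = {aa: math.exp(v - peak) for aa, v in shifted.items()}
        total = sum(exp.values())
        probs.append({aa: e / total for aa, e in exp.items()})
    return probs


# Favor alanine and discourage cysteine at all positions.
bias = {"A": 1.0, "C": -2.0}
path = os.path.join(tempfile.mkdtemp(), "bias_aa.json")
write_bias_aa_json(bias, path)

# Three positions with uniform logits, so any difference comes from the bias.
uniform = [{aa: 0.0 for aa in ALPHABET} for _ in range(3)]
biased = apply_global_bias(uniform, bias)
```

Letters missing from the mapping keep a bias of zero in this sketch, so only the listed amino acids are shifted.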
Binder Evaluation with Additional Metadata
Binder evaluation supports passing a JSON file to specify additional metadata beyond the default inputs.
This is useful for advanced scenarios such as:
Evaluating cropped sequences
- The crop field specifies the range used in evaluation, e.g. "1-120,130-150" (comma-separated ranges, 1-based indexing, inclusive).

Providing precomputed MSA for Protenix filter
- The msa field takes:
  - precomputed_msa_dir: Path to the local MSA directory.
  - pairing_db: uniref100.

Example JSON input:
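To make the crop string format concrete, the sketch below parses a value such as "1-120,130-150" into 1-based inclusive ranges and applies it to a sequence. The function names parse_crop and crop_sequence are hypothetical and not part of PXDesignBench; rejecting single-residue entries such as "5" is an assumption of this sketch, since the README only describes ranges.

```python
def parse_crop(spec):
    """Parse "1-120,130-150" into [(1, 120), (130, 150)] (1-based, inclusive)."""
    ranges = []
    for part in spec.split(","):
        start_text, sep, end_text = part.strip().partition("-")
        if not sep:
            # Assumption: every entry must be a start-end pair.
            raise ValueError(f"expected 'start-end', got {part!r}")
        start, end = int(start_text), int(end_text)
        if start < 1 or end < start:
            raise ValueError(f"invalid range {part!r}")
        ranges.append((start, end))
    return ranges


def crop_sequence(sequence, spec):
    """Keep only residues inside the crop ranges.

    A 1-based inclusive range (start, end) maps to the Python slice
    [start - 1:end].
    """
    return "".join(sequence[start - 1:end] for start, end in parse_crop(spec))


ranges = parse_crop("1-120,130-150")
cropped = crop_sequence("ABCDEFGHIJ", "1-3,6-8")
```

With the ten-letter toy sequence above, residues 1 to 3 and 6 to 8 are kept and residues 4, 5, 9, and 10 are dropped.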
Multi-GPU / Distributed Evaluation
PXDesignBench exposes device IDs for each integrated model, enabling:
For example, the following is a pseudocode snippet illustrating online evaluation tracking in the DDP model training pipeline:
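As a rough, framework-free illustration of the per-rank bookkeeping such a pipeline needs, the sketch below splits a list of designs across ranks and picks a device ID for each rank. It does not call any PXDesignBench or PyTorch API; shard_for_rank, device_for_rank, the file names, and the plan dictionary are all hypothetical.

```python
def shard_for_rank(items, rank, world_size):
    """Return the items handled by `rank` under a round-robin split."""
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside world size {world_size}")
    return items[rank::world_size]


def device_for_rank(rank, num_gpus):
    """Map a process rank to a local GPU index (wraps if ranks exceed GPUs)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return rank % num_gpus


# Hypothetical inputs: ten designs evaluated by four ranks on four GPUs.
samples = [f"design_{i}.pdb" for i in range(10)]
world_size, num_gpus = 4, 4

plan = {
    rank: {
        "device_id": device_for_rank(rank, num_gpus),
        "files": shard_for_rank(samples, rank, world_size),
    }
    for rank in range(world_size)
}
```

Each rank would then run evaluation only on its own files, using its own device ID, and the per-rank results would be gathered afterwards.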
Evaluation Process
- If use_gt_seq=True, the sequence from the input structure is used directly.
- If use_gt_seq=False, the tool first runs the assigned sequence generation model (e.g., ProteinMPNN) to generate sequences.

Post-processing
PXDesignBench provides scripts for analyzing the diversity and novelty of generated protein structures. To enable Foldseek-based diversity and novelty calculations, you must first install Foldseek, a structural alignment and similarity search tool.
Foldseek is not bundled with PXDesignBench and must be installed separately.
Please follow the official guide here: Foldseek Installation.
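Because Foldseek is a separate install, it can help to confirm that the executable is on PATH before running the post-processing scripts. The sketch below uses only the Python standard library; find_tool and require_tool are hypothetical helper names, and it assumes the executable is called foldseek, as in the official distribution.

```python
import shutil


def find_tool(name="foldseek"):
    """Return the absolute path of `name` on PATH, or None if it is missing."""
    return shutil.which(name)


def require_tool(name="foldseek"):
    """Return the path of `name`, raising a clear error if it is not installed."""
    path = find_tool(name)
    if path is None:
        raise RuntimeError(
            f"{name} was not found on PATH; install it before running "
            "diversity or novelty post-processing"
        )
    return path


# Report availability without failing, so the check is safe to run anywhere.
print("foldseek found" if find_tool() else "foldseek not found on PATH")
```

Calling require_tool() at the start of a post-processing script fails early with a readable message instead of partway through a run.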
Examples:
📚 Citing Related Work
If you use this repository, please cite the following works:
PXDesign
Protenix
ProteinMPNN
ESMFold
AlphaFold2
Code of Conduct
We are committed to fostering a welcoming and inclusive environment. Please review our Code of Conduct for guidelines on how to participate respectfully.
Security
If you discover a potential security issue in this project, or think you may have discovered a security issue, we ask that you notify Bytedance Security via our security center or vulnerability reporting email.
Please do not create a public GitHub issue.
License
This project is licensed under the Apache 2.0 License. It is free for both academic research and commercial use.