# moPPIt: De Novo Generation of Motif-Specific Peptide Binders via Multi-Objective Discrete Flow Matching

<img src="https://cdn-uploads.huggingface.co/production/uploads/649ef40be56dc456b7a36649/2P-06_gxQ_Dv3x-i1aEWH.png" width="100%">

Targeting specific functional motifs, whether conserved viral epitopes, intrinsically disordered regions (IDRs), or fusion breakpoints, is essential for modulating protein function and protein-protein interactions (PPIs). Current design methods, however, depend on stable tertiary structures, limiting their utility for disordered or dynamic targets. Here, we present a motif-specific PPI targeting algorithm (moPPIt), a framework for the de novo generation of motif-specific peptide binders derived solely from target sequence data. The core of this approach is BindEvaluator, a transformer architecture that interpolates protein language model embeddings to predict peptide-protein binding site interactions with high accuracy (AUC = 0.97). We integrate this predictor into a novel Multi-Objective-Guided Discrete Flow Matching (MOG-DFM) framework, which steers generative trajectories toward peptides that simultaneously maximize binding affinity and motif specificity. After comprehensive in silico validation of binding and motif-specific targeting, we validate moPPIt in vitro by generating binders that strictly discriminate between the FN3 and IgG domains of NCAM1, confirming domain-level specificity, and further demonstrate precise targeting of IDRs by generating binders specific to the N-terminal disordered domain of β-catenin. In functional, disease-relevant assays, moPPIt-designed peptides targeting the GM-CSF receptor effectively block macrophage polarization. Finally, we demonstrate therapeutic utility in cell engineering, where binders directed against the tumor antigen AGR2 drive specific CAR T regulatory cell activation. In total, moPPIt serves as a purely sequence-based paradigm for controllably targeting the "undruggable" and disordered proteome.

---
## 1. Google Colab Notebooks

We provide two Google Colab notebooks to help you run and evaluate moPPIt without any local setup:

- **moPPIt Colab** (generate motif-specific binders while optimizing other therapeutic-related properties): [Link](https://colab.research.google.com/drive/16n8PIwKwAiG-oDLm171BWvv-lQH0dHMg?usp=sharing)
- **PeptiDerive Colab** (compute Relative Interaction Scores (RIS) for residues on the target protein): [Link](https://colab.research.google.com/drive/1aCODZ-WRwhxr-u8nEB6ZrdrhIOTz7-UF?usp=sharing)

---
## 2. Command-line Usage

You can also run **moPPIt** and **BindEvaluator** from the command line.

### 2.1 Run moPPIt

Example command:

```
python -u moo.py \
    --output_file './samples.csv' \
    --length 10 \
    --n_batches 600 \
    --weights 1 1 1 4 4 2 \
    --motifs '16-31,62-79' \
    --motif_penalty \
    --objectives Hemolysis Non-Fouling Half-Life Affinity Motif Specificity \
    --target_protein MHVPSGAQLGLRPDLLARRRLKRCPSRWLCLSAAWSFVQVFSEPDGFTVIFSGLGNNAGGTMHWNDTRPAHFRILKVVLREAVAECLMDSYSLDVHGGRRTAAG
```
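
Before launching a long run, it can help to sanity-check the `--motifs`, `--objectives`, and `--weights` values. The sketch below is not part of `moo.py`; the helper names are hypothetical. It assumes that motif ranges are 1-based and inclusive, and that weights are matched to objectives by position (the example above passes six of each). Neither assumption is confirmed here, so check `moo.py` before relying on it.

```python
def parse_motifs(spec: str) -> list[int]:
    """Expand a spec like '16-31,62-79' into sorted residue positions.

    Assumption: positions are 1-based and ranges are inclusive.
    """
    positions: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # "16-31" -> ("16", "-", "31"); a single position "42" has no separator.
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid motif range: {part!r}")
        positions.update(range(start, end + 1))
    return sorted(positions)


def check_motifs(spec: str, target: str) -> list[int]:
    """Parse the motif spec and verify every position lies on the target."""
    positions = parse_motifs(spec)
    if positions and positions[-1] > len(target):
        raise ValueError(
            f"motif position {positions[-1]} exceeds target length {len(target)}"
        )
    return positions


def pair_objectives(objectives: list[str], weights: list[float]) -> dict[str, float]:
    """Pair each objective with a weight.

    Assumption: weights are positional, one per objective token.
    """
    if len(objectives) != len(weights):
        raise ValueError(
            f"{len(objectives)} objectives but {len(weights)} weights"
        )
    return dict(zip(objectives, weights))
```

For the example command, `check_motifs('16-31,62-79', target)` would return the residue positions covered by the two motif ranges, and `pair_objectives` would raise if the number of weights did not match the number of objective tokens.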
### 2.2 Run BindEvaluator

Given a target protein sequence and a binder sequence, BindEvaluator predicts the binding sites on the target protein.

Example command:

```
python -u bindevaluator.py \
    -target MHVPSGAQLGLRPDLLARRRLKRCPSRWLCLSAAWSFVQVFSEPDGFTVIFSGLGNNAGGTMHWNDTRPAHFRILKVVLREAVAECLMDSYSLDVHGGRRTAAG \
    -binder YVEICRCVVC \
    -sm ./classifier_ckpt/finetuned_BindEvaluator.ckpt \
    -n_layers 8 \
    -d_model 128 \
    -d_hidden 128 \
    -n_head 8 \
    -d_inner 64
```
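
To score several candidate binders against the same target, the command above can be assembled programmatically. This is a minimal sketch with hypothetical helper names; it only reuses the flags and values shown in the example command, and it assumes `bindevaluator.py` is in the working directory and writes its predictions to standard output, which is not confirmed here.

```python
import subprocess


def build_bindevaluator_cmd(
    target: str,
    binder: str,
    checkpoint: str = "./classifier_ckpt/finetuned_BindEvaluator.ckpt",
    script: str = "bindevaluator.py",
) -> list[str]:
    """Build the argument list for one BindEvaluator call.

    Flag names and hyperparameter values are copied from the example command.
    """
    return [
        "python", "-u", script,
        "-target", target,
        "-binder", binder,
        "-sm", checkpoint,
        "-n_layers", "8",
        "-d_model", "128",
        "-d_hidden", "128",
        "-n_head", "8",
        "-d_inner", "64",
    ]


def run_bindevaluator(target: str, binder: str) -> str:
    """Run BindEvaluator once and return whatever it printed.

    Assumption: results are written to stdout; the output format is not
    parsed here because it is not documented in this README.
    """
    cmd = build_bindevaluator_cmd(target, binder)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting issues with long sequences. Looping `run_bindevaluator` over a list of binders reloads the checkpoint on every call, so this is only practical for a small number of candidates.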
## Repository Authors

[Tong Chen](mailto:[email protected]), PhD Student at University of Pennsylvania

[Pranam Chatterjee](mailto:[email protected]), Assistant Professor at University of Pennsylvania

Reach out to us with any questions!