---
license: mit
extra_gated_fields:
  Name: text
  Company: text
  Country: country
  Specific date: date_picker
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
  I agree to include the authors of the code (Tianlai Chen and Pranam Chatterjee) as authors on manuscripts with data from designed peptides: checkbox
  I agree to share generated sequences and associated data with authors before publishing: checkbox
  I agree not to file patents on any sequences generated by this model: checkbox
  I agree to use this model for non-commercial use ONLY: checkbox
---
**PepMLM: Target Sequence-Conditioned Generation of Peptide Binders via Masked Language Modeling**

In this work, we introduce **PepMLM**, a purely target sequence-conditioned *de novo* generator of linear peptide binders.
By employing a novel masking strategy that uniquely positions cognate peptide sequences at the terminus of target protein sequences,
PepMLM tasks the state-of-the-art ESM-2 protein language model (pLM) with fully reconstructing the binder region,
achieving low perplexities that match or improve upon those of previously validated peptide-protein sequence pairs.
After successful *in silico* benchmarking with AlphaFold-Multimer, we experimentally verify PepMLM’s efficacy via fusion of model-derived peptides to E3 ubiquitin ligase domains, demonstrating endogenous degradation of target substrates in cellular models.
In total, PepMLM enables the generative design of candidate binders to any target protein, without requiring the target structure, empowering downstream programmable proteome editing applications.
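
The masking strategy described above can be illustrated with a minimal sketch. It assumes the binder region is represented by ESM-2 `<mask>` tokens appended after the last residue of the target, and that perplexity is the exponential of the mean negative log-likelihood over the binder positions. The helper names `build_masked_input` and `perplexity` and the example target sequence are hypothetical and do not come from the PepMLM code.

```python
import math
from typing import Sequence

MASK_TOKEN = "<mask>"  # mask token of the ESM-2 tokenizer; assumed unchanged in PepMLM


def build_masked_input(target_seq: str, peptide_len: int) -> str:
    """Append `peptide_len` mask tokens after the target sequence (assumed layout)."""
    if peptide_len <= 0:
        raise ValueError("peptide_len must be positive")
    return target_seq + MASK_TOKEN * peptide_len


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Exponential of the mean negative log-likelihood over the binder positions."""
    if not token_log_probs:
        raise ValueError("need at least one log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical 10-residue target with a 4-residue binder slot.
print(build_masked_input("MKTAYIAKQR", 4))
# -> MKTAYIAKQR<mask><mask><mask><mask>
```

Under this definition, lower perplexity means the model assigns higher probability to the binder residues given the target.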
- Demo: HuggingFace Space Demo [Link](https://huggingface.co/spaces/TianlaiChen/PepMLM)
- Colab Notebook: [Link](https://colab.research.google.com/drive/1u0i-LBog_lvQ5YRKs7QLKh_RtI-tV8qM?usp=sharing)
- Preprint: [Link](https://arxiv.org/abs/2310.03842)
- Nature Biotechnology: [Link](https://www.nature.com/articles/s41587-025-02761-2)
```
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("TianlaiChen/PepMLM-650M")
model = AutoModelForMaskedLM.from_pretrained("TianlaiChen/PepMLM-650M")
```
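
Once the tokenizer and model are loaded, one plausible way to generate a binder is to append mask tokens to the target, run a single forward pass, and sample each masked position from its top-k logits. The sketch below assumes that decoding scheme; the authors' notebook may differ. The function names `sample_top_k` and `generate_binder` and the default `top_k=3` are illustrative choices, not part of the released code.

```python
import math
import random


def sample_top_k(logits, k, rng):
    """Sample one index from the k highest logits, weighted by their softmax."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = logits[top[0]]
    # Subtract the peak before exponentiating for numerical stability.
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


def generate_binder(model, tokenizer, target_seq, peptide_len, top_k=3, seed=0):
    """Fill `peptide_len` masked positions appended to `target_seq` in one pass."""
    import torch  # imported lazily so the sampling helper stays dependency-free

    rng = random.Random(seed)
    text = target_seq + tokenizer.mask_token * peptide_len
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]  # shape: (sequence_length, vocab_size)
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    rows = logits[is_mask].tolist()  # one row of vocabulary logits per masked position
    token_ids = [sample_top_k(row, top_k, rng) for row in rows]
    # The ESM tokenizer joins decoded tokens with spaces, so remove them.
    return tokenizer.decode(token_ids, skip_special_tokens=True).replace(" ", "")
```

With the `model` and `tokenizer` objects from the loading snippet, `generate_binder(model, tokenizer, target_seq, 15)` would return one candidate peptide for a target sequence string `target_seq`. Changing `seed` yields different candidates, which could then be ranked by perplexity.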