Why Peptides Became the Hottest Drug Category, and What AI Had to Do With It

David Borish
Apr 14

For most of the last century, peptides sat in an awkward middle ground of drug development. They were too large and fragile to behave like small-molecule pills, and too short and unstable to carry the commercial weight of antibody biologics. Insulin worked, and a handful of others found niches, but peptides were mostly a pharmacology of special cases. That has changed fast, and the change is visible in two places at once. It is visible in Novo Nordisk's revenue, where semaglutide alone is tracking toward roughly $36 billion in 2026. And it is visible in a paper just published in Nature Communications, where a hybrid AI framework designed a 12-amino-acid peptide that kills the dormant form of MRSA that conventional antibiotics cannot touch.

These two stories belong together. The cultural moment for peptides and the technical moment for AI-driven peptide design are the same moment.

The Market Caught Up First

The GLP-1 class did the cultural work. By early 2026, roughly 30 million people were using drugs in this family, up from about 4 million at the start of the decade. Novo Nordisk and Eli Lilly dominate the category with semaglutide and tirzepatide, and the global GLP-1 market is valued between $58 billion and $74 billion in 2026 depending on the analyst, with forecasts stretching toward $250 billion or more by the mid-2030s. The Wegovy pill launched in January 2026, removing the injection barrier that had kept a large share of the patient population on the sidelines. Novo sold 50,000 weekly subscriptions in the first three weeks.

What the GLP-1 story did for the broader peptide category is more important than the revenue. It proved that a peptide could become a mass-market chronic-disease medication rather than a specialty product. It also created the commercial gravity to pull investment into every adjacent problem peptides might solve. Cancer vaccines. Antimicrobial resistance. Metabolic disease beyond diabetes. Peptides targeting the protein-protein interactions that small molecules historically called undruggable. The momentum behind one category became momentum behind a whole modality.

This is where AI enters the picture, because the bottleneck in exploiting that momentum has always been the same: peptide sequence space is impossibly large, and traditional screening cannot cover it.

The Numbers That Make AI Necessary

A peptide ten amino acids long has 20^10 possible sequences, which is roughly 10 trillion. A fifteen-mer has 20^15, or around 33 quintillion. Experimentalists have never been able to search more than a tiny corner of this space, and rational design based on known scaffolds has historically relied on chemical intuition and a lot of failed batches. Peptide drug discovery was slow for the same reason early protein structure prediction was slow. The search space swallowed every available technique.
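The arithmetic is easy to check. The short Python sketch below counts linear peptides built from the 20 standard amino acids; the function name is illustrative, and the count ignores modified residues and cyclization, which would only make the space larger.

```python
AMINO_ACIDS = 20  # the 20 standard proteinogenic amino acids


def sequence_space(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct linear peptides of the given length."""
    return alphabet ** length


for n in (10, 12, 15):
    # Scientific notation keeps the magnitudes comparable at a glance.
    print(f"{n}-mer: {sequence_space(n):.2e} possible sequences")
```

Each added residue multiplies the space by 20, which is why even a modest increase in length puts exhaustive screening out of reach.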

Machine learning changed the economics of that search in several ways at once. Predictive models trained on databases of peptides with measured properties can screen millions of candidates in silico before any are synthesized. Generative models, including variational autoencoders, diffusion models, and protein language models adapted from the same architectures that power chatbots, can propose novel sequences with targeted properties. Structure prediction tools descended from AlphaFold can model the three-dimensional shape of a proposed peptide, which matters because peptides often work by folding into specific conformations at the right moment. And optimization loops that couple prediction, generation, and ranking can iterate faster than any human chemist.

The result is a shift from library screening to rational, data-driven design. This is roughly the same shift AI produced in protein structure prediction between 2018 and 2022, and it is happening on a compressed timeline because the tools and the training data were already in place.

What a Working AI Peptide Pipeline Looks Like

The CAMPER paper in Nature Communications is useful here because it makes the architecture concrete. A team led by researchers at Houston Methodist Hospital built a two-stage system. The first stage is a random forest classifier trained on 3,660 peptide sequences from a public antimicrobial peptide database, each labeled active or inactive based on measured activity against Staphylococcus aureus. The classifier learns statistical patterns linking sequence features to activity. This part is standard machine learning.

The second stage is what the authors call a biophysical scoring function. It re-ranks the classifier's top candidates using four physical parameters that govern how a peptide disrupts a bacterial membrane: net charge, hydrophobicity, amphipathicity, and helical structure. The scoring function is not learned from data. It encodes what biophysicists already know about how membrane-active peptides work. The two stages together produced a candidate called WP-CAMPER1, a 12-amino-acid peptide that killed MRSA at a minimum inhibitory concentration of 4 micrograms per milliliter, about eight times more potent than the natural mastoparan peptides the library was derived from. In a mouse skin infection model, a 2% topical formulation reduced S. aureus burden by 2.5 log10.
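To make the idea of a non-learned scoring function concrete, here is a minimal Python sketch of three of those descriptors. It is not the CAMPER scoring function: the Kyte-Doolittle hydropathy scale, the 100-degree helical step, the equal weights, and the example sequences are all assumptions chosen for illustration. Helicity is left out because estimating it properly needs a structure predictor; the hydrophobic moment here simply assumes an ideal helix.

```python
import math

# Kyte-Doolittle hydropathy scale (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

HELIX_ANGLE_DEG = 100.0  # angular step per residue in an ideal alpha helix


def net_charge(seq: str) -> int:
    """Rough net charge near pH 7: Lys/Arg are +1, Asp/Glu are -1.
    Histidine and the termini are ignored (a simplification)."""
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")


def mean_hydrophobicity(seq: str) -> float:
    """Average hydropathy over all residues."""
    return sum(KD[a] for a in seq) / len(seq)


def hydrophobic_moment(seq: str, angle_deg: float = HELIX_ANGLE_DEG) -> float:
    """Eisenberg-style helical hydrophobic moment, normalized by length.
    Large values mean hydrophobic residues cluster on one helix face."""
    step = math.radians(angle_deg)
    x = sum(KD[a] * math.cos(i * step) for i, a in enumerate(seq))
    y = sum(KD[a] * math.sin(i * step) for i, a in enumerate(seq))
    return math.hypot(x, y) / len(seq)


def toy_score(seq: str, w_charge: float = 1.0,
              w_hydro: float = 1.0, w_moment: float = 1.0) -> float:
    """Hypothetical weighted sum; the weights are placeholders, not fitted."""
    return (w_charge * net_charge(seq)
            + w_hydro * mean_hydrophobicity(seq)
            + w_moment * hydrophobic_moment(seq))


# Made-up 12-mers, not sequences from the paper.
candidates = ["KLLKKLLKKLLK", "GSGSGSGSGSGS", "LLLLLLLLLLLL"]
for seq in sorted(candidates, key=toy_score, reverse=True):
    print(seq, round(toy_score(seq), 2))
```

Nothing in this function is trained. Every term comes from textbook physical chemistry, which is the design choice the authors are making: the classifier proposes, and prior knowledge about membrane disruption decides the final ranking.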

The reason this matters beyond one antibiotic candidate is that it illustrates the direction the whole field is moving. Pure end-to-end machine learning identifies candidates that look active statistically. Hybrid approaches that combine learned classifiers with domain knowledge produce candidates that are also physically plausible in the context the drug has to work in. The CAMPER authors reported a 77% drop in false positives when the biophysical scorer was added to their classifier. That is the kind of efficiency gain that makes experimental pipelines affordable.
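A figure like that is a relative reduction: false positives with the second stage compared against false positives from the classifier alone. The Python sketch below shows the bookkeeping on toy data. The `Candidate` fields, the sequences, and the counts are hypothetical and do not reproduce the paper's numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    sequence: str
    classifier_hit: bool  # stage 1 predicted "active"
    biophys_pass: bool    # stage 2 kept it after re-ranking
    truly_active: bool    # what the wet-lab assay found


def false_positives(cands: list[Candidate], use_stage2: bool) -> int:
    """Count candidates the pipeline advanced that the assay found inactive."""
    return sum(
        1 for c in cands
        if c.classifier_hit
        and (c.biophys_pass or not use_stage2)
        and not c.truly_active
    )


def relative_reduction(before: int, after: int) -> float:
    """Fractional drop in false positives after adding the second stage."""
    if before == 0:
        raise ValueError("no false positives to reduce")
    return (before - after) / before


# Toy data: four inactive peptides fool the classifier, one survives stage 2.
TOY = [
    Candidate("PEPTIDEA", True, True, True),
    Candidate("PEPTIDEC", True, True, False),
    Candidate("PEPTIDED", True, False, False),
    Candidate("PEPTIDEF", True, False, False),
    Candidate("PEPTIDEG", True, False, False),
    Candidate("PEPTIDEH", False, False, False),
]

before = false_positives(TOY, use_stage2=False)
after = false_positives(TOY, use_stage2=True)
print(f"false positives: {before} -> {after} "
      f"({relative_reduction(before, after):.0%} reduction)")
```

Every false positive removed before synthesis is a peptide nobody has to make and assay, which is where the savings come from.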

The Broader AI Peptide Ecosystem

CAMPER is one entry in a wave of similar work. A team published in Nature Microbiology in 2025 built a protein large language model called ProteoGPT and used it to screen hundreds of millions of sequences for activity against carbapenem-resistant bacteria. A Nature Materials study late in 2025 described self-assembling antimicrobial peptides designed by deep learning that cleared drug-resistant infections in mouse models while leaving human cells intact. Academic groups including David Baker's lab at the Institute for Protein Design have released tools like RFpeptides for designing cyclic peptides that bind specific disease targets. On the commercial side, AstraZeneca has built a peptide version of its generative chemistry platform called PepINVENT, and smaller companies like Peptilogics, Nuritas, and Profluent are running AI-native peptide discovery at scale.

The metabolic-disease side has its own AI pipeline. ImmunoPrecise Antibodies reported in mid-2025 that AI-designed GLP-1 receptor agonist sequences matched or exceeded semaglutide in laboratory receptor activation assays. Multi-agonist peptides that simultaneously hit GLP-1, GIP, and glucagon receptors are in active development, and the design of these multi-target molecules is exactly the kind of problem classical medicinal chemistry handles slowly and generative models handle well. Tirzepatide, already a dual agonist, is showing roughly 20% weight loss in trials compared with 13.7% for semaglutide, and the triple-agonist candidates now entering the pipeline were mostly designed with machine-learning support at some stage.

None of these tools has produced a fully AI-designed peptide that has cleared FDA approval yet. The regulatory milestone is still ahead. But the pipeline feeding into clinical trials over the next several years has been reshaped by machine learning in a way that earlier drug generations were not.

The Challenges That Remain

AI has not eliminated the hard parts of peptide drug development. It has moved the bottleneck. The sequence-space problem is mostly solved. The synthesis problem is partly solved, with deep learning now helping predict aggregation risks and failed deprotection steps in solid-phase synthesis. The property-prediction problem is improving quickly, with models that can forecast stability, toxicity, cell penetration, and pharmacokinetics from sequence.

What remains difficult is everything downstream. Peptides are still fragile in serum. Oral bioavailability is still the exception rather than the rule, which is why the oral Wegovy launch counts as a major event. Manufacturing at mass-market scale creates its own set of constraints, and the supply-chain crisis that hit the GLP-1 category in 2023 and 2024 was a demonstration of how quickly demand can outrun production capacity for complex molecules. Regulatory frameworks for AI-generated therapeutics are still being written, and the questions about data quality, model bias, and reproducibility that apply to AI in every other domain apply here too. A "hallucinated" peptide that looks plausible in silico and fails in the body is a real and common outcome.

The field is also still calibrating how much of the discovery process AI should run end-to-end versus how much should be human-directed. The CAMPER authors explicitly chose to encode biophysical knowledge into their scoring function rather than trust a classifier to discover it from data. Other groups are taking the opposite bet, arguing that sufficiently large generative models will eventually absorb the physics without being told. Both approaches are producing interesting results. The answer may depend on the target.

The Broader Pattern

Peptides are following a trajectory already visible in several other areas of computational biology. Protein structure prediction spent decades as a specialist problem before AlphaFold collapsed the timeline between input sequence and three-dimensional model. Antibody design has moved along a similar curve. mRNA vaccine design during the pandemic showed what happens when a biological modality meets mature computational tools at the right moment. Peptides are at that moment now. The ingredients have been in place for a few years, and the results are starting to appear in papers, pipelines, and quarterly earnings.

For anyone trying to understand where drug discovery is going in the second half of this decade, the peptide category is the clearest view available. It has a mass-market product driving commercial investment, a technical pipeline that AI has materially accelerated, a list of adjacent disease areas where the same tools can be applied, and a set of unsolved problems that are concrete enough to measure progress against. Watching how it develops will tell us something about the other biological modalities AI has started to work on.

The GLP-1 patient filling a Wegovy prescription and the CAMPER-designed peptide killing dormant MRSA are on the same curve. They arrived at different points on it, but the slope that produced them is the same slope, and it is steepening.
