Microproteins and the Dark Proteome: Tiny Molecules with Big Potential
January 22, 2025 • Meredith Carpenter, PhD

A major goal in modern biology is fully understanding the protein-coding genome, which remains a challenge more than 20 years after the human genome was first sequenced. In recent years, researchers have turned their attention to non-canonical open reading frames (ncORFs), regions of the genome that were once thought to be largely irrelevant. Emerging evidence shows that these regions are translated into proteins across various human cell types and disease states, though their full impact on biology has remained unclear due to a lack of large-scale data.
One intriguing class of proteins emerging from ncORFs is microproteins, also known as miniproteins or micropeptides. These are small proteins, typically fewer than 100 amino acids, that are translated from independent small open reading frames (sORFs or smORFs). Once dismissed because of their size, microproteins are now recognized as key players in regulating cellular functions, influencing everything from gene expression to cellular pathways and disease mechanisms. Recent studies are also beginning to reveal their therapeutic potential, especially in fields like cancer, neurodegenerative diseases, and genetic disorders, where their ability to interact with other proteins in unique ways opens up exciting possibilities for new treatments.
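To make the size definition concrete, here is a minimal sketch of how candidate sORFs might be flagged in a DNA sequence. It is an illustration only, not the method used in the research discussed here: the function name `find_sorfs`, the forward-strand-only scan, the ATG-only start codon, and the 100-codon cutoff are all simplifying assumptions. Real pipelines also consider the reverse strand, non-ATG starts, and experimental evidence of translation.

```python
# Minimal sketch: scan one DNA strand for small open reading frames (sORFs).
# Assumptions (not from the article): ATG start codon, standard stop codons,
# forward strand only, and a 100-codon cutoff for "small".

STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_CODONS = 100  # illustrative cutoff: fewer than 100 amino acids


def find_sorfs(dna, max_codons=MAX_CODONS):
    """Return (start, end, codon_count) for each ATG..stop ORF under max_codons.

    start and end are 0-based positions; end is exclusive and includes the
    stop codon. codon_count excludes the stop codon.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                codons = (i - start) // 3
                if codons < max_codons:
                    hits.append((start, i + 3, codons))
                start = None
    return sorted(hits)
```

For example, `find_sorfs("CCATGAAATTTTAGGG")` returns `[(2, 14, 3)]`: a three-codon ORF (ATG AAA TTT) followed by a TAG stop. A sequence-only scan like this yields candidates, not confirmed microproteins; evidence of translation still has to come from experiments.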
A recent preprint reinforces the importance of this “dark proteome.” The study found that at least 25% of 7,264 ncORFs across human cells produce translated gene products. By combining proteomics, immunopeptidomics, and Ribo-seq, the researchers identified over 3,000 peptides from these previously overlooked regions. With data from more than 95,000 experiments, the study offers a comprehensive view of the ncORF landscape. The scale of this analysis underscores the untapped potential of ncORFs and sets the stage for similar research in a wide range of species, from humans to other animals and plants.
Understanding the dark proteome is increasingly vital for deciphering the molecular basis of disease. Many microproteins encoded by ncORFs play regulatory roles that are not yet fully understood, but emerging research suggests they may be implicated in cancer progression, immune evasion, and neurodegenerative disease mechanisms. Because these proteins are often overlooked by traditional gene annotation and detection methods, they represent a blind spot in clinical diagnostics and biomarker discovery. For example, microproteins derived from the dark proteome can modulate signaling pathways or interact with known disease-associated proteins, subtly altering cellular behavior in ways that elude conventional assays.
Detecting proteins from the dark proteome remains a major challenge due to their small size, low abundance, and lack of conserved sequences, which make them elusive to standard mass spectrometry methods. Traditional proteomics often misses these short-lived or unannotated peptides. This is where Next-Gen Protein Sequencing™ (NGPS™) is poised to make a transformative impact. By directly sequencing intact peptides without relying on prior knowledge or abundance thresholds, NGPS bridges the gap between genomic insights and functional protein detection. Its ability to illuminate these hidden protein landscapes enhances our understanding of disease biology and opens the door to discovering entirely new classes of biomarkers and therapeutic targets.
Dr. John Prensner, one of the lead authors on the study, envisions broad applications of NGPS in this area. “We are starting to learn that the human ‘dark proteome’ may be vast,” he says. “We are finding thousands of HLA-bound peptides from these unannotated protein entities. Yet, in the overwhelming majority of cases, we don’t see them by traditional mass spectrometry. I’m excited that Next-Gen Protein Sequencing might offer another avenue to look deeper into this dark proteome in a way that could complement traditional mass spectrometry.”
In the years ahead, the study of microproteins and ncORFs could transform fields like drug design, gene therapy, and molecular biology. As we continue to unravel the hidden complexities of the proteome, these small but mighty molecules could play a crucial role in advancing precision medicine, helping to illuminate the dark proteome and offer novel therapeutic options for diseases that have long evaded effective treatments.

Meredith Carpenter, PhD, Head of Scientific Affairs, Quantum-Si
Meredith L. Carpenter, PhD, is head of scientific affairs at Quantum-Si, where she manages external collaborations and publication strategy. Dr. Carpenter has over 10 years of experience in developing and deploying novel genomics and multi-omics tools in the biotech industry. Prior to Quantum-Si, Dr. Carpenter held roles as director of assay development at Arc Bio and senior director of strategic alliances at Cantata Bio. She earned a BS in Biology from Emory University and a PhD in Molecular and Cell Biology from UC Berkeley, and she performed postdoctoral research in the Department of Genetics at Stanford University.