Capturing Proteus with Next-Generation Protein Sequencing™

By Quantum-Si Marketing Team

Famed for knowing both the past and the future, Proteus captivated ancient Greek sailors who longed to know more about their destiny. In Proteus, Poseidon’s firstborn son, sailors saw the promise of wisdom and the ability to avoid ship-destroying tempests. But collecting information from Proteus was no small task: stories held that he could change his physical form at will and would speak only to those who could catch him. And so wisdom remained out of reach for all but those who had the tools and tenacity to capture a god [ref].

It is appropriate, then, that in 1838 Dutch chemist Gerardus Johannes Mulder used Proteus as inspiration when naming a new and diverse class of biomolecules, one he called proteins [ref, ref]. Like their namesake, proteins have a captivating quality that draws researchers in with the promise of invaluable information. Proteins represent the last step of life’s central dogma and are responsible for carrying out many of life’s essential functions. Studying the sequence, structure, and function of these complex molecules can greatly advance our understanding of human health and disease, potentially guiding researchers in biomarker identification, discovery of therapeutic targets, and more.

However, as with Proteus, extracting information from proteins is a challenging task that requires the right tools. Mass spectrometry (MS) has long been the gold standard for high-resolution protein biology. Though powerfully informative, MS demands technical complexity and a substantial resource investment that have greatly reduced its availability to the broader research community, leaving many to rely on overbooked core facilities. And even the best MS workflows may struggle to differentiate between proteoforms [ref]. Therefore, our understanding of protein variation and how it contributes to both health and disease has been significantly limited by a technological bottleneck.

Fortunately, the recent advent of Next-Generation Protein Sequencing™ (NGPS) has the potential to open this bottleneck and unleash a flood of proteomic advancements [ref, ref]. In the following sections, we dive into NGPS technology and how its arrival is likely to have immediate impacts across the scientific community, with a particular focus on academic laboratories and core facilities. Now more than ever, the information locked within a protein’s sequence is in reach, and we may finally have a grasp on this molecular Proteus.

Next-Generation Protein Sequencing

NGPS technology enables researchers to annotate a sequential string of amino acids in much the same way that next-generation DNA sequencing technology helps researchers study strands of nucleic acids [ref, ref]. Briefly, presently available protein sequencing technology works by incubating peptides with reporter-labeled N-terminal amino acid recognizers [ref]. Upon binding, reporter signals are recorded, and the N-terminal amino acid is then enzymatically cleaved. The cycle of binding and cleavage repeats as each new position is exposed, allowing researchers to methodically annotate each amino acid in a peptide chain. For the identification of post-translational modifications, researchers at Quantum-Si have devised an elegant system that differentiates chemically similar amino acids based on reporter binding kinetics. (For a more thorough breakdown of single-molecule protein sequencing technology, read our recent publication in Science.)
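
The bind, record, and cleave cycle described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the recognizer names (rec_A, rec_B, rec_C) and the residues each one binds are hypothetical placeholders, not the actual recognizer panel, and binding kinetics are not modeled at all.

```python
from collections import deque

# Hypothetical recognizer panel (an assumption for illustration only):
# each recognizer emits its own reporter signal when it binds one of
# its target residues at the N-terminus.
RECOGNIZERS = {
    "rec_A": {"L", "I", "V"},
    "rec_B": {"F", "Y", "W"},
    "rec_C": {"R"},
}


def recognize(residue):
    """Return the name of the recognizer that binds this residue, or None."""
    for name, targets in RECOGNIZERS.items():
        if residue in targets:
            return name
    return None


def sequence_peptide(peptide):
    """Simulate repeated N-terminal recognition followed by cleavage.

    Returns one recorded signal per position, in N-to-C order. A position
    that no recognizer binds is recorded as None.
    """
    remaining = deque(peptide)
    signals = []
    while remaining:
        # Step 1: a recognizer (if any) binds the exposed N-terminal residue
        # and its reporter signal is recorded.
        signals.append(recognize(remaining[0]))
        # Step 2: the N-terminal residue is cleaved, exposing the next one.
        remaining.popleft()
    return signals


if __name__ == "__main__":
    for position, signal in enumerate(sequence_peptide("LAFR"), start=1):
        print(position, signal)
```

In this sketch a single signal can correspond to several residues (rec_A covers L, I, and V), which is why a real system needs additional information, such as the binding kinetics mentioned above, to tell similar residues apart.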

Notably, there are several important differences between NGPS and MS technology. MS works by breaking peptides into small, ionized fragments whose mass and charge are subsequently recorded. These data are then used to build an estimate of which protein(s) are likely to have produced these fragments. Though capable of differentiating unrelated proteins, MS can be unable to resolve those that have a similar weight and charge, such as proteins affected by post-translational modifications [ref, ref]. Because NGPS reads amino acids one position at a time rather than inferring identity from fragment masses, it may offer a significant advantage over MS for researchers in need of high-resolution protein identification.
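
To make the mass ambiguity concrete, the short sketch below computes peptide masses from standard monoisotopic residue masses. The peptides GLK and GIK are hypothetical examples chosen for illustration; leucine and isoleucine are isomers with the same mass, so the two peptides are indistinguishable by intact mass alone. The helper name peptide_mass is an assumption, and the table covers only the residues used here.

```python
# Monoisotopic residue masses in daltons (standard values; only the
# residues needed for this illustration are listed).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "L": 113.08406,
    "I": 113.08406,  # isoleucine is an isomer of leucine: same mass
    "K": 128.09496,
}
WATER = 18.010565  # one water is added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


if __name__ == "__main__":
    for peptide in ("GLK", "GIK", "GAK"):
        print(peptide, round(peptide_mass(peptide), 5))
```

GLK and GIK yield the same mass, whereas GAK differs, so a mass measurement alone separates the third peptide from the first two but cannot separate the first two from each other.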

MS is also a costly and experimentally complex methodology that few researchers are properly equipped and trained to use. Consequently, many academic laboratories must forgo detailed protein studies or else rely on the services of core MS facilities. In contrast, NGPS can be carried out using a relatively simple workflow and a desktop machine that costs nearly one-tenth the price of an MS instrument. As such, NGPS is poised to substantially improve access to the proteome.

Protein Sequencing in Core Facilities and Academic Laboratories

Among those who are likely to see an immediate benefit from NGPS are scientists working in core research facilities and academic laboratories.

Benefits of NGPS for Core Facilities

Core facilities grew to prominence during the genomics revolution when the high demand for DNA sequencing was paralleled only by the high cost of DNA sequencing machines [ref]. Now, core facilities typically offer an assortment of specialized services that range from DNA sequencing to protein analytics.

In this environment, NGPS is likely to expand the depth of the core’s offering by:

  1. Complementing genomic, transcriptomic, and other multi-omics data sets with high-resolution protein sequencing. Many genomics cores are unable to dedicate the funding and training needed to build out an MS offering. The affordability and simplicity of NGPS, on the other hand, provide an opportunity for these cores to offer more comprehensive molecular analyses that will allow researchers to follow mutations from DNA to protein.
  2. Supplementing MS data with finer-resolution information about a specific protein’s amino acid composition. MS can be a valuable and high-throughput approach to proteomics, but when deeper characterization of a protein is needed, labs have few resources at their disposal. Cores offering NGPS may prove invaluable for their ability to carry out both high-throughput protein analyses with MS and follow-up, high-resolution characterization studies with NGPS.

Benefits of NGPS for Academic Laboratories

Academic laboratories are likely to see significant benefit from NGPS as well. At present, most laboratories rely on proteomics cores or on low-resolution immunoassay methods to analyze protein dynamics. Immunolabeling, for example, can be a quick and simple approach to protein identification; however, this approach only measures the presence of a specific epitope and says little about any variations or modifications beyond the antibody binding site. And as noted above, relying on core facilities can lead to prolonged timelines, added expenses, and a general reduction in experimental freedom. Most researchers want the flexibility to test hypotheses and vary experimental conditions, neither of which is easy to do when waiting for core services.

Capturing Proteus

Sailors sought Proteus in the hope of learning about their past and their future. Proteins are not nearly as predictive, but delving into protein sequence variation does have the potential to be transformative in the molecular and clinical sciences. Realizing this potential will not be easy, but with NGPS, we have a better chance than ever to capture our Proteus and tap into a rich well of proteomic information.

References

  1. PROTEUS – Greek Sea-God, Herdsman of Seals, Old Man of the Sea
  2. Origin of the Word ‘Protein’ | Nature
  3. The Energy Costs of Protein Metabolism: Lean and Mean on Uncle Sam’s Team – The Role of Protein and Amino Acids in Sustaining and Enhancing Performance – NCBI Bookshelf
  4. Comparison of Protein Sequencing Analysis of CDNF, IL6, and FGF2 on Platinum® and Mass Spectrometry
  5. Real-time dynamic single-molecule protein sequencing on an integrated semiconductor device | Science
  6. The emerging landscape of single-molecule protein sequencing technologies | Nature Methods
  7. Challenges for proteomics core facilities
  8. Challenges and Opportunities for Biological Mass Spectrometry Core Facilities in the Developing World – PMC