Discovery Proteomics

In discovery proteomics, the most abundant proteins in a sample are detected and quantified by LC-MS/MS without prior selection of targets. Bottom-up approaches are most commonly used: proteins are extracted from a sample and digested into peptides, which are then separated by liquid chromatography and analysed by mass spectrometry. The mass spectrometer measures the mass-to-charge (m/z) ratio and the intensity of each peptide ion.

In Data-Dependent Analysis (DDA), the most abundant peptides in each survey scan are selected for fragmentation. The resulting fragmentation spectra are recorded and used for peptide spectral matching (PSM), and the identified peptides are then used to infer protein identity and abundance. Only peptides that were selected for fragmentation can be identified and used to identify proteins.
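The selection step in DDA can be sketched as a simple "top N" pick from a survey scan. This is a minimal illustration only, not the acquisition logic of any specific instrument; the function name `select_top_n`, the exclusion list and the peak values are all hypothetical.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

def select_top_n(ms1_peaks: List[Peak], n: int,
                 excluded_mz: Sequence[float] = (),
                 tol: float = 0.01) -> List[Peak]:
    """Return the n most intense peaks that are not within tol of an excluded m/z."""
    candidates = [
        peak for peak in ms1_peaks
        if all(abs(peak[0] - ex) > tol for ex in excluded_mz)
    ]
    # Most intense first, then keep only the first n
    return sorted(candidates, key=lambda peak: peak[1], reverse=True)[:n]

# Made-up survey scan, for illustration only
survey_scan = [(400.2, 1.0e5), (523.8, 8.0e6), (615.3, 2.5e6),
               (702.4, 4.0e4), (845.9, 9.0e5)]

selected = select_top_n(survey_scan, n=3)
print(selected)

# Excluding a precursor that was already fragmented lets a weaker peptide in
reselected = select_top_n(survey_scan, n=3, excluded_mz=[523.8])
print(reselected)
```

Peptides that never make it into the top N are not fragmented, which is why they cannot be identified in a DDA run.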

In Data-Independent Analysis (DIA), all peptides within defined mass-to-charge ranges are fragmented together. Using a spectral library, the fragments of selected peptides are extracted from the fragmentation spectra. These fragments are used to determine peptide abundance, which is then translated into protein abundance. If additional information becomes available later, the spectra can be searched again for additional proteins.
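The "defined mass ranges" used in DIA can be pictured as a series of isolation windows that together cover the precursor range. The sketch below assumes fixed-width, non-overlapping windows from 400 to 1000 m/z; real methods may use overlapping or variable-width windows, and the function names and values here are hypothetical.

```python
from typing import List, Optional, Tuple

Window = Tuple[float, float]  # (lower m/z, upper m/z)

def make_windows(start: float, end: float, width: float) -> List[Window]:
    """Build consecutive fixed-width isolation windows covering [start, end)."""
    windows = []
    low = start
    while low < end:
        windows.append((low, min(low + width, end)))
        low += width
    return windows

def window_index(mz: float, windows: List[Window]) -> Optional[int]:
    """Return the index of the window containing mz, or None if outside the range."""
    for i, (low, high) in enumerate(windows):
        if low <= mz < high:
            return i
    return None

windows = make_windows(400.0, 1000.0, 25.0)

# Made-up precursor m/z values; those sharing a window are fragmented together
precursors = [523.8, 510.2, 615.3, 1200.0]
for mz in precursors:
    print(mz, window_index(mz, windows))
```

Because every precursor inside a window is fragmented regardless of its abundance, the resulting spectra are mixed, which is why a spectral library is needed to extract the fragments of each peptide.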

Differences in relative protein abundance between sample groups are then functionally annotated to identify the mechanisms that may be responsible for the differences observed in the biological systems.


Sample Preparation

Protein Extraction

Extract and enrich proteins from biological samples


Desalt and clean up peptide sample before LC-MS

Protein Quantitation

Determine the quantity of protein in a sample

Sample Fractionation

Fractionate samples before mass spectrometry analysis to increase the number of proteins identified

Protein Digestion

Digest proteins into peptides for detection by mass spectrometry
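Digestion can also be simulated in silico, which is how search software predicts the peptides it expects to see. The sketch below assumes trypsin and the commonly used rule that it cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name and the example sequence are hypothetical.

```python
from typing import List

def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> List[str]:
    """Cleave after K or R (not before P), allowing a number of missed cleavages."""
    if not sequence:
        return []
    # Positions at which the chain is cut, including both termini
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for i in range(len(sites) - 1):
        # Join up to `missed_cleavages` extra neighbouring fragments
        last = min(i + 2 + missed_cleavages, len(sites))
        for j in range(i + 1, last):
            peptides.append(sequence[sites[i]:sites[j]])
    return peptides

# Made-up sequence: the K before P is not cleaved
fully_cleaved = tryptic_digest("AAKPGRTTKCC")
print(fully_cleaved)

with_missed = tryptic_digest("AAKPGRTTKCC", missed_cleavages=1)
print(with_missed)
```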

Liquid Chromatography

Liquid chromatography coupled to mass spectrometry separates the sample over time, so that fewer analytes enter the mass spectrometer at once and more of them can be detected.

Isotopic Labelling

Label samples with isotopic or isobaric labels, thereby enabling multiplexing of samples and increased accuracy of relative abundance comparisons.

Mass Spectrometry

Detect the mass/charge ratio and abundance of ionised analytes.


Data Generation

Mass spectrometers consist of three parts: a source, a mass analyser and a detector. Peptides are ionised and introduced into the mass spectrometer via the source. The mass analyser separates the peptide ions according to their mass/charge ratio, and the detector then measures them. The resulting data are a mass/charge ratio and an intensity value for each analyte.
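Because the instrument reports a mass/charge ratio rather than a mass, the same peptide appears at different positions depending on how many protons it carries. A minimal sketch of the conversion, assuming positive-mode protonation and a neutral monoisotopic mass of about 799.36 Da for the example peptide PEPTIDE; the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da

def mass_to_mz(neutral_mass: float, charge: int) -> float:
    """m/z of a peptide of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge

def mz_to_mass(mz: float, charge: int) -> float:
    """Neutral mass recovered from an observed m/z and charge state."""
    return mz * charge - charge * PROTON_MASS

peptide_mass = 799.35997  # approximate monoisotopic mass of PEPTIDE

for z in (1, 2, 3):
    print(z, round(mass_to_mz(peptide_mass, z), 4))
```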


A variety of state-of-the-art mass spectrometers is available locally. Different configurations of sources, analysers and detectors enable slightly different analysis methodologies. The local machines were selected for specific functions to address a diversity of local needs. It is important to consult an expert when choosing the mass spectrometer for a specific project.


Thermo Fisher LTQ Velos

Combines the proven mass accuracy and ultra-high resolution of the Orbitrap mass analyzer with the increased sensitivity and improved cycle time of the LTQ Velos.

Thermo Fisher Orbitrap Fusion

Orbitrap Fusion™ Tribrid™ Mass Spectrometer. This instrument combines the best of quadrupole, ion trap and Orbitrap mass analysis in a revolutionary Tribrid architecture to provide unprecedented depth of analysis and ease of use.

Thermo Fisher Q Exactive

Hybrid Quadrupole-Orbitrap Mass Spectrometer. This benchtop LC-MS/MS system combines quadrupole precursor ion selection with high-resolution, accurate-mass (HRAM) Orbitrap detection to deliver exceptional performance and versatility.



Proteomics databases which contain protein sequences for peptide spectral matching


UniProt proteomes contain protein sequences from organisms whose genomes have been sequenced. Some of these protein sequences are manually curated, while others are automatically annotated.

Joint Genome Institute

Genomes of microbes, fungi, algae and plants


Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation.


NCBI Reference Sequence Database:
"A comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein."

Peptide Spectral Matching

Tandem mass spectra are matched to theoretical spectra generated from a protein sequence database
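The matching idea can be sketched by computing the theoretical singly charged b and y fragment ions of each candidate peptide and counting how many are found in the observed spectrum. This shared-peak count is a deliberate simplification; the search engines listed below use more elaborate scoring. The function names, tolerance and observed peaks are hypothetical.

```python
from typing import List

# Monoisotopic residue masses in Da
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def fragment_mz(peptide: str) -> List[float]:
    """Singly charged b ions (N-terminal) followed by y ions (C-terminal)."""
    masses = [MONO[aa] for aa in peptide]
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b_ions + y_ions

def shared_peak_count(observed: List[float], peptide: str, tol: float = 0.02) -> int:
    """Number of theoretical fragments with an observed peak within tol Da."""
    return sum(
        any(abs(obs - theo) <= tol for obs in observed)
        for theo in fragment_mz(peptide)
    )

def best_match(observed: List[float], candidates: List[str], tol: float = 0.02) -> str:
    """Candidate peptide explaining the most observed fragment peaks."""
    return max(candidates, key=lambda pep: shared_peak_count(observed, pep, tol))

# Made-up spectrum containing b1-b3 and y1-y3 of PEPTIDE
observed_peaks = [98.060, 227.103, 324.155, 148.060, 263.087, 376.171]
candidates = ["PETPIDE", "PEPTIDE", "AAAAAAA"]

for peptide in candidates:
    print(peptide, shared_peak_count(observed_peaks, peptide))
print("best:", best_match(observed_peaks, candidates))
```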


"Progenesis QI for proteomics is discovery analysis software for your LC-MS data; a revolutionary ‘difference engine’ that works in a unique way to help you to answer your biological question."


Peaks Studio

"PEAKS Studio is a software platform with complete solutions for discovery proteomics, including protein identification and quantification, analysis of post-translational modifications (PTMs) and sequence variants (mutations), and peptide/protein de novo sequencing."


"SearchGUI is a highly adaptable open-source common interface for configuring and running proteomics search and de novo engines, currently supporting X! Tandem, MS-GF+, MS Amanda, MyriMatch, Comet, Tide, Andromeda, OMSSA, Novor and DirecTag."


"Byonic™ is our full MS/MS search engine providing unequalled sensitivity for comprehensive peptide and protein identification. Byonic™ results can be input into Byologic™ and/or Byomap™ along with the raw mass spec data and any HPLC data."



"The Crux mass spectrometry analysis toolkit is an open source project that aims to provide users with a cross-platform suite of analysis tools for interpreting protein mass spectrometry data"


"MaxQuant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data."


"ProteinPilot™ Software is used for protein identification and relative protein expression analysis for protein research. ProteinPilot Software is compatible with all proteomics MS/MS systems via the *.mgf format."


Trans-Proteomic Pipeline

"The Trans-Proteomic Pipeline (TPP) is a collection of integrated tools for MS/MS proteomics"

Quantitation Analysis

Once peptides have been identified and relative protein abundances determined, statistical tests determine which proteins are significantly differentially expressed between biological conditions and warrant further study.
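Because thousands of proteins are tested at once, the raw p-values are usually corrected for multiple testing before proteins are called significant. The sketch below assumes p-values have already been produced by a statistical test and applies the Benjamini-Hochberg procedure alongside a log2 fold change. The protein names, abundances and p-values are made up for illustration.

```python
import math
from statistics import mean
from typing import List

def log2_fold_change(control: List[float], treated: List[float]) -> float:
    """log2 of the ratio of mean treated abundance to mean control abundance."""
    return math.log2(mean(treated) / mean(control))

def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up example values
names = ["protein_A", "protein_B", "protein_C", "protein_D"]
raw_p = [0.001, 0.02, 0.04, 0.6]

adjusted_p = benjamini_hochberg(raw_p)
significant = [n for n, q in zip(names, adjusted_p) if q < 0.05]
print(significant)

fold_change = log2_fold_change([100.0, 110.0, 90.0], [400.0, 420.0, 380.0])
print(fold_change)
```

In this toy example protein_C has a raw p-value below 0.05 but is no longer significant after correction, which is the behaviour the correction is meant to provide.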



"Perseus contains a comprehensive portfolio of statistical tools for high-dimensional omics data analysis covering normalization, pattern recognition, time-series analysis, cross-omics comparisons and multiple-hypothesis testing"



"Visualize and validate complex MS/MS proteomics experiments"




"Progenesis QI for proteomics is discovery analysis software for your LC-MS data; a revolutionary ‘difference engine’ that works in a unique way to help you to answer your biological question."

Functional Annotation

Once a list of significantly differentially expressed proteins has been determined, these proteins need to be functionally annotated to determine what roles they may be playing in the biological system. Using over-representation analysis and gene set enrichment analysis, proteins can be assigned to functional units. These functional units are classified in gene ontology, metabolic pathway and signalling pathway databases.
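Over-representation analysis is commonly based on a hypergeometric test: given how many background proteins carry an annotation, how surprising is the number of annotated proteins in the significant list? The sketch below is a minimal version of that calculation and is not the implementation of any of the tools listed here. The function name and the counts are hypothetical.

```python
from math import comb

def enrichment_p_value(hits: int, list_size: int,
                       annotated: int, background: int) -> float:
    """P(X >= hits) when drawing list_size proteins from a background
    in which `annotated` proteins belong to the functional unit."""
    upper = min(list_size, annotated)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(background, list_size)

# Made-up counts: 20 quantified proteins, 5 annotated to a pathway,
# 4 significant proteins of which 2 are in the pathway
p = enrichment_p_value(hits=2, list_size=4, annotated=5, background=20)
print(round(p, 4))
```

When many functional units are tested, these p-values also need a multiple-testing correction before they are interpreted.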


"statistical analysis and visualization of functional profiles for genes and gene clusters"


"REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database."


"STRING is a database of known and predicted protein-protein interactions."


"topGO package provides tools for testing GO terms while accounting for the topology of the GO graph."


"KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information."


"Provide a rapid means to reduce large lists of genes into functionally related groups of genes to help unravel the biological content captured by high throughput technologies."


Bioinformaticians are available to assist you with your project.


The earlier you contact them the more assistance they will be able to offer. In particular, the experimental design is critical in ensuring the success of any project. Contacting a statistician and ensuring your experiment has enough statistical power will go a long way to ensuring its success.
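As a rough illustration of what "enough statistical power" means, the sketch below estimates the number of samples per group for a two-group comparison using the normal approximation. It assumes a two-sided test, equal group sizes and a known standard deviation, and it tends to underestimate the requirement for small groups, so it is no substitute for a statistician's advice. The function name and input values are hypothetical.

```python
import math
from statistics import NormalDist

def samples_per_group(effect: float, sd: float,
                      alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate samples per group needed to detect a mean difference
    `effect` given standard deviation `sd` (same units, e.g. log2 abundance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)

# Detecting a two-fold change (1 log2 unit) with sd of 0.5 log2 units
print(samples_per_group(effect=1.0, sd=0.5))

# A smaller effect needs considerably more samples
print(samples_per_group(effect=0.5, sd=0.5))
```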


Selecting the best technology for your project will ensure you get the best results. Omics research is costly, so choosing the most appropriate technology for your experiment and budget is critical.


It is best to first run a pilot study and have an expert check the quality of the results before continuing with the bulk of the analysis. The pilot project will also allow you to familiarise yourself with the sample analysis process, the data generated and the means of analysis before embarking on the main project.


Once you have produced the data, you will realise that omics technologies produce mountains of data. Handling these volumes often requires some expertise in big data. Fortunately, we have tools and resources to store and process your data, making it easier for you to understand. Contact our team of expert bioinformaticians for assistance at all levels of your project.

Dr Shaun Garnett

Post-Doctoral Fellow at University of Cape Town


  • Transcriptomics

  • Proteomics

  • Differential Abundance Statistics



  • Liquid Chromatography

  • Mass Spectrometry

  • Discovery Proteomics

  • Statistics

  • Expression Data Functional Annotation


  • MaxQuant

  • Skyline

  • TPP

  • clusterProfiler

  • topGO

  • STRINGdb

  • ReactomePA


©2018 by SA-DIPLOMICS.