Discovery proteomics aims to identify as much of the proteome as possible in a particular cellular context. Bottom-up proteomics digests the proteins and analyses the resulting peptides. A bottom-up experiment involves the extraction of proteins, digestion of the proteins into peptides, separation by liquid chromatography and analysis by mass spectrometry. The mass spectrometer first detects the peptide masses and intensities. In a data-dependent analysis, the most intense peptides are isolated, fragmented and detected. These fragmentation spectra allow the sequence of each peptide to be determined through peptide spectral matching. Identifying the peptide sequences allows the proteins of origin to be inferred. Using the relative intensities of the peptides, it is then possible to determine the relative abundance of the proteins. Comparing the relative abundances of proteins between different cellular states allows inference of the processes at work within those states.
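As a rough illustration of the last two steps, the sketch below rolls peptide intensities up to proteins and compares two cellular states. It assumes the simplest possible rollup (summing peptide intensities per protein); real software often uses more robust summaries. The function names and example values are hypothetical.

```python
from collections import defaultdict
from math import log2


def protein_intensities(peptides):
    """Sum peptide intensities per protein.

    `peptides` is an iterable of (protein_id, intensity) pairs.
    Summation is an assumed, deliberately simple rollup strategy.
    """
    totals = defaultdict(float)
    for protein_id, intensity in peptides:
        totals[protein_id] += intensity
    return dict(totals)


def log2_fold_changes(state_a, state_b):
    """Return log2(state_b / state_a) for proteins quantified in both states."""
    shared = state_a.keys() & state_b.keys()
    return {p: log2(state_b[p] / state_a[p]) for p in shared}


# Hypothetical peptide-level measurements for two cellular states.
control = protein_intensities([("P1", 1.0e6), ("P1", 3.0e6), ("P2", 2.0e6)])
treated = protein_intensities([("P1", 4.0e6), ("P1", 4.0e6), ("P2", 2.0e6)])
changes = log2_fold_changes(control, treated)
```

A positive value in `changes` means the protein is more abundant in the second state, and a value near zero means no change.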
Once the samples have been prepared, the peptides are analysed by mass spectrometry. There are a variety of types of mass spectrometer, but their general function is fairly similar. Mass spectrometers consist of three parts: a source, an analyser and a detector. The source is where the peptides are ionised and introduced into the mass spectrometer. The analyser separates the peptide ions based on their mass/charge ratio. The detector detects the analytes once they have been separated. The resultant data are a mass/charge ratio and an intensity value for each analyte. In tandem mass spectrometry, samples are measured twice. In the first round (MS1), peptide mass/charge ratios and intensities are measured. In the second round (MS2), highly abundant peptides are individually isolated and fragmented by colliding them with an inert gas. A spectrum of the fragments is then collected. The differences in mass between successive fragments correspond to the masses of amino acids. Using peptide spectral matching, it is possible to determine the sequence of the peptides and the identity of the proteins they originate from.
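To make the link between fragment masses and amino acids concrete, here is a minimal sketch that computes the mass/charge ratio of a peptide and its singly charged b- and y-type fragment ions from standard monoisotopic residue masses. It assumes unmodified peptides; the function names are illustrative and not taken from any particular software package.

```python
# Monoisotopic residue masses in daltons (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mass(sequence):
    """Neutral monoisotopic mass: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence, charge):
    """m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


def b_ions(sequence):
    """Singly charged b ions (N-terminal fragments), lengths 1..n-1."""
    return [sum(RESIDUE_MASS[aa] for aa in sequence[:i]) + PROTON
            for i in range(1, len(sequence))]


def y_ions(sequence):
    """Singly charged y ions (C-terminal fragments), lengths 1..n-1."""
    return [sum(RESIDUE_MASS[aa] for aa in sequence[-i:]) + WATER + PROTON
            for i in range(1, len(sequence))]


b = b_ions("PEPTIDE")
# The gap between consecutive b ions is the mass of one residue
# (here glutamic acid, the second residue of the sequence).
gap = b[1] - b[0]
```

Reading the gaps along a series of b or y ions is what allows the amino acid sequence to be reconstructed or matched against a candidate peptide.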
Mass spectrometer raw data consist of a list of masses and intensities detected at different times or scans. These need to be matched to known analytes in order to be useful. In bottom-up data-dependent proteomics, the mass spectra are matched to the theoretical peptides from a protein database. A FASTA file is downloaded from a protein database for the organism being studied. The proteins in the FASTA file are digested in silico using the same enzyme as in the experiment. The measured spectra are then matched to the theoretical spectra generated for the in silico peptide sequences, and the matches are scored. Using a decoy database consisting of random or reversed sequences, decoy matches are also generated. From the decoy matches a false discovery rate can be estimated and used to identify the peptides that are most likely true matches. Each peptide is associated with the intensity of the MS1 peaks for that peptide.
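The digestion and decoy steps can be sketched in a few lines. The example below assumes trypsin as the enzyme (cleaving after K or R, but not before P), reversed protein sequences as decoys, and a simple decoys/targets estimate of the false discovery rate. Search engines differ in the details, so treat the function names, parameters and scores as hypothetical.

```python
def digest_trypsin(protein, missed_cleavages=0, min_length=6):
    """In silico tryptic digest: cleave after K or R unless followed by P."""
    pieces, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])

    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


def make_decoy(protein):
    """Reversed sequence, one common way to build a decoy entry."""
    return protein[::-1]


def score_threshold_at_fdr(matches, max_fdr=0.01):
    """Lowest score at which estimated FDR (decoys / targets) <= max_fdr.

    `matches` is a list of (score, is_decoy) pairs, higher score = better.
    Returns None if no threshold satisfies the requested FDR.
    Tied scores are not treated specially in this sketch.
    """
    targets = decoys = 0
    threshold = None
    for score, is_decoy in sorted(matches, key=lambda m: m[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            threshold = score
    return threshold


# Hypothetical scored matches: (score, is_decoy).
matches = [(10.0, False), (9.0, False), (8.0, False),
           (7.0, True), (6.0, False), (5.0, True)]
cutoff = score_threshold_at_fdr(matches, max_fdr=0.25)
```

Target matches scoring at or above `cutoff` would be kept as confident identifications; everything below it is discarded.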
Proteomics databases which contain protein sequences for peptide spectral matching
Peptide Spectral Matching
Tandem mass spectra matched to theoretical spectra from protein sequence database
Once peptides have been identified and relative protein abundances determined, statistical tests determine which of the proteins are significantly differentially expressed between biological conditions and require further study.
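The text does not prescribe a particular test, so the following is only a sketch under stated assumptions: an exact permutation test on the difference in mean abundance for each protein, followed by a Benjamini-Hochberg adjustment because many proteins are tested at once. Both choices, and all names and values, are illustrative; moderated t-tests and other methods are widely used in practice.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = total = 0
    for indices in combinations(range(len(pooled)), len(group_a)):
        chosen = set(indices)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical log2 abundances for one protein, three replicates per condition.
p = permutation_p_value([1.0, 2.0, 3.0], [10.0, 11.0, 12.0])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

With three replicates per group the smallest achievable permutation p-value is 2/20, which is one reason replicate number matters when designing the experiment.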
Once a list of significantly differentially expressed proteins has been determined, these proteins need to be functionally annotated in order to determine what roles they may be playing in the biological system. Using over-representation analysis and gene set enrichment analysis, proteins can be assigned to functional units. The classifications of these functional units are found within gene ontologies and in metabolic and signalling pathway databases.
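Over-representation analysis is commonly framed as a hypergeometric test: given how many background proteins belong to a functional set, how surprising is the number of significant proteins that fall into it? The sketch below assumes that framing and uses only the Python standard library (`math.comb`, Python 3.8 or later). The function name and the example counts are hypothetical.

```python
from math import comb


def over_representation_p_value(hits_in_set, set_size, hits_total, background_size):
    """P(X >= hits_in_set) under a hypergeometric null.

    background_size: all proteins quantified in the experiment
    set_size:        background proteins annotated to the functional set
    hits_total:      significantly changed proteins
    hits_in_set:     significant proteins annotated to the functional set
    """
    max_overlap = min(set_size, hits_total)
    total = comb(background_size, hits_total)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, hits_total - k)
        for k in range(hits_in_set, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 3000 quantified proteins, 50 in the pathway,
# 100 significant proteins, 8 of which are in the pathway.
p_value = over_representation_p_value(8, 50, 100, 3000)
```

In a real analysis this test is repeated for every functional set, so the resulting p-values also need a multiple-testing adjustment.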
Bioinformaticians are available to assist you with your project.
The earlier you contact them the more assistance they will be able to offer. In particular, the experimental design is critical in ensuring the success of any project. Contacting a statistician and ensuring your experiment has enough statistical power will go a long way to ensuring its success.
Selecting the best technology will ensure you get the best results for your project. Omics research is costly, so choosing the most appropriate technology for your experiment and budget is critical.
It is best to first run a pilot study and have an expert check the quality of the results before continuing with the bulk of the analysis. The pilot project will also allow you to familiarise yourself with the sample analysis process, the data generated and the means of analysis before embarking on the main project.
Once you have produced the data, you will realise that omics technologies produce mountains of data. Dealing with this volume of data often requires some expertise in handling big data. Fortunately, we have tools and resources to store and process your data, making it easier for you to understand. Contact our team of expert bioinformaticians for assistance at all levels of your project.