Next Generation Sequencing

A variety of Next Generation Sequencing (NGS) techniques are routinely used in the lab. They all sequence many small pieces of DNA in parallel. These short reads are then aligned to a reference genome using bioinformatics tools. Every region of the genome is sequenced multiple times; the number of times is described as the depth of coverage, and it helps ensure the reliability of the data. With good-quality coverage it is possible to identify individual nucleotides in the sequences that differ from the reference genome. These variants are identified and annotated. The whole genome can be sequenced, or just areas of interest such as the exome. Information from sequencing can be used to identify microbes, or variants causing inherited disorders or cancers. Identifying the variants aids the development of treatments for genetic disorders.
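As a rough illustration of depth of coverage, the expected mean depth can be estimated as the total number of sequenced bases divided by the genome size. The sketch below assumes reads of uniform length spread evenly over the genome, which real runs only approximate; the function names and the example numbers are illustrative only.

```python
import math


def mean_depth_of_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def reads_needed(target_depth: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that gives at least target_depth on average."""
    return math.ceil(target_depth * genome_size / read_length)


# Illustrative numbers: a 5 Mb genome sequenced with one million 150 bp reads.
depth = mean_depth_of_coverage(1_000_000, 150, 5_000_000)
print(f"mean depth: {depth:.1f}x")
```

Because coverage is never perfectly even, a run planned this way will still leave some regions below the mean depth.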


Thermo Fisher IonProton

ChIP sequencing
Exome sequencing
Gene expression sequencing
De novo sequencing
Small RNA sequencing
Whole transcriptome sequencing

Illumina NextSeq

"The NextSeq 550 System brings the power of a high-throughput sequencing system to your benchtop. With tunable output and high data quality, it provides the flexible power you need for whole-genome, transcriptome, and targeted resequencing plus the ability to scan microarrays."

Thermo Fisher Ion S5

The Ion S5™ next-generation sequencing system enables a simple targeted sequencing workflow.

Illumina MiSeq

Small genome sequencing provides comprehensive analysis of microbial or viral genomes for public health, epidemiology, and disease studies. Sequence up to 24 small genomes per MiSeq run.

Illumina HiSeq

The HiSeq 2500 System is a powerful high-throughput sequencing system. High-quality data using proven Illumina SBS chemistry has made it the instrument of choice for major genome centers and research institutions throughout the world.

Data Analysis

NGS data is produced as either FASTQ or CSFASTA files. These files contain sequence reads with a quality value associated with each base. The reads need to be aligned and mapped to a reference genome if one is available, or assembled de novo. Modern short-read algorithms are faster than traditional sequence alignment algorithms. Once the reads are aligned, an assessment needs to be made of the quality of the reads and the depth of coverage. If the quality of the data is sufficient, variants that differ from the reference genome can be identified. The variants are then annotated based on existing knowledge and visualised using genome browsers.
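To make the per-base quality values concrete, here is a minimal sketch of reading FASTQ records in plain Python. It assumes the common four-lines-per-read layout and Phred+33 quality encoding; older data sets may use a different offset. The `FastqRecord` type and `read_fastq` function are names made up for this example, not part of any library.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]


def read_fastq(handle: TextIO, offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a four-lines-per-read FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line)
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        quality_line = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality_line):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        # Each quality character encodes a Phred score as ASCII code minus offset.
        qualities = [ord(ch) - offset for ch in quality_line]
        yield FastqRecord(header[1:], sequence, qualities)


# A single made-up read, held in memory instead of a file.
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
print(records[0])
```

A Phred score Q corresponds to an error probability of 10^(-Q/10), so a score of 30 means roughly one expected error per 1000 base calls.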

Short-read aligners

Align reads to the reference genome to determine their position.
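The idea of placing a read on the reference can be sketched with a toy seed-and-extend mapper: index every k-mer of the reference, look up the first k bases of the read, then count mismatches at each candidate position. This is a simplified illustration only. It ignores insertions, deletions, reverse-complement reads and base qualities, and it misses reads with an error inside the seed; production aligners use far more efficient indexes and scoring. All names and sequences here are made up.

```python
from collections import defaultdict


def build_kmer_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, reference: str, index: dict, k: int, max_mismatches: int = 1):
    """Return (position, mismatches) for the best ungapped placement, or None."""
    best = None
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # read would run off the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best


reference = "ACGTTGCAAGGCTTACGGAT"
index = build_kmer_index(reference, k=4)
print(map_read("GCAAGGCT", reference, index, k=4))  # exact match
print(map_read("GCAAGTCT", reference, index, k=4))  # one mismatch
```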



Visualisation of genomics data using genome browsers.

Quality Assessment

Check the quality of the NGS reads and alignment.
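Two simple checks illustrate what quality assessment measures: the fraction of base calls at or above a Phred threshold, and how much of the reference is covered by aligned reads. The sketch below assumes ungapped alignments given as (start, length) pairs with 0-based starts. The threshold of 30 is a commonly used default rather than a rule, and the function names are illustrative.

```python
def fraction_at_least(qualities, threshold=30):
    """Fraction of base calls whose Phred score is >= threshold."""
    if not qualities:
        return 0.0
    return sum(q >= threshold for q in qualities) / len(qualities)


def depth_per_position(alignments, reference_length):
    """Count how many reads cover each reference position."""
    depth = [0] * reference_length
    for start, length in alignments:
        for pos in range(max(start, 0), min(start + length, reference_length)):
            depth[pos] += 1
    return depth


def breadth_of_coverage(depth, min_depth=1):
    """Fraction of reference positions covered by at least min_depth reads."""
    return sum(d >= min_depth for d in depth) / len(depth)


# Made-up inputs: four base qualities and three 4 bp reads on a 10 bp reference.
print(fraction_at_least([40, 40, 20, 0]))
depth = depth_per_position([(0, 4), (2, 4), (3, 4)], reference_length=10)
print(depth, breadth_of_coverage(depth), breadth_of_coverage(depth, min_depth=2))
```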

Variant Identification

Identify areas of the newly sequenced genome that differ from the reference genome.
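In its simplest form, variant identification piles up the aligned bases at each reference position and reports positions where a well-supported base differs from the reference. The sketch below is a toy majority-vote caller for single-nucleotide differences, assuming ungapped alignments and a haploid sample; the depth and allele-fraction thresholds are arbitrary illustrative values. Real variant callers use statistical models and also handle insertions, deletions and diploid genotypes.

```python
from collections import Counter


def call_variants(reference, alignments, min_depth=3, min_alt_fraction=0.8):
    """alignments: iterable of (start, read_sequence), ungapped, 0-based starts."""
    pileup = [Counter() for _ in reference]
    for start, read in alignments:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    variants = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little evidence to make a call
        base, count = counts.most_common(1)[0]
        if base != reference[pos] and count / depth >= min_alt_fraction:
            variants.append((pos, reference[pos], base, depth))
    return variants


# Made-up data: three overlapping reads that all carry an A at position 3.
reference = "ACGTACGT"
reads = [(0, "ACGAAC"), (1, "CGAACG"), (2, "GAACGT")]
print(call_variants(reference, reads))
```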

Variant Annotation

Annotation of variants that differ from the reference genome.
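At its core, annotation is a lookup of each variant position against known features. The sketch below assumes a small in-memory list of (start, end, name) intervals with exclusive ends; the feature names and coordinates are invented for illustration. Real annotation tools draw on curated gene models and variant databases and also predict the effect of each change.

```python
def annotate_variants(variants, features):
    """variants: (pos, ref, alt) tuples; features: (start, end, name), end exclusive."""
    annotated = []
    for pos, ref, alt in variants:
        hits = [name for start, end, name in features if start <= pos < end]
        annotated.append(
            {"pos": pos, "ref": ref, "alt": alt, "features": hits or ["intergenic"]}
        )
    return annotated


# Hypothetical features and variants, for illustration only.
features = [(0, 100, "geneA_exon1"), (250, 400, "geneA_exon2")]
variants = [(42, "T", "A"), (180, "G", "C")]
for row in annotate_variants(variants, features):
    print(row)
```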


Next Generation Sequencing analysis consists of multiple steps. To help ensure that data is processed in a consistent manner, H3ABioNet has compiled a number of workflows.

Variant calling

Outlines the essential steps in calling short germline variants, and recommends tools that have gained community acceptance for this purpose.

16S rDNA diversity analysis

16S rDNA diversity analysis of bacteria and archaea enables their identification and the determination of their relative abundance.
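Relative abundance is each taxon's share of the assigned reads, and a diversity index summarises how evenly those shares are spread. The sketch below computes both from a table of read counts per taxon, using the Shannon index as one common choice; the taxon labels and counts are invented for illustration.

```python
import math


def relative_abundance(counts: dict) -> dict:
    """Convert read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts: dict) -> float:
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    return -sum(p * math.log(p) for p in relative_abundance(counts).values() if p > 0)


# Made-up read counts for three taxa in one sample.
sample = {"taxon_A": 50, "taxon_B": 30, "taxon_C": 20}
print(relative_abundance(sample))
print(round(shannon_index(sample), 3))
```

Read counts depend on sequencing depth, so samples are usually normalised or rarefied before diversity values are compared between them.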

Genome Wide Association Studies

This is the key H3Africa workflow, designed for bioinformaticians doing GWAS.

Genome Analysis Toolkit

The primary focus of the toolkit is variant discovery and genotyping.

Bioinformaticians are available to assist you with your project.


The earlier you contact them the more assistance they will be able to offer. In particular, the experimental design is critical in ensuring the success of any project. Contacting a statistician and ensuring your experiment has enough statistical power will go a long way to ensuring its success.
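As a first feel for what statistical power means in practice, the sketch below estimates the number of samples per group needed to detect a given standardised effect size in a two-sided comparison of two group means, using the normal approximation. It is a simplification under stated assumptions: one comparison, equal group sizes and roughly normal data. Omics experiments test many features at once and often need count-based models, so treat it as a starting point for a conversation with a statistician. The function name is made up.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Normal-approximation sample size for a two-sided, two-group comparison of means.

    effect_size is the difference in means divided by the common standard deviation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


for d in (0.5, 1.0):
    print(f"effect size {d}: {samples_per_group(d)} samples per group")
```

Halving the effect size roughly quadruples the required sample size, which is why a pilot estimate of the expected effect is so valuable.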


Selecting the best technology for your project will ensure you get the best results. Omics research is costly, so choosing the most appropriate technology for your experiment and budget is critical.


It is best to run a pilot study first and have an expert check the quality of the results before continuing with the bulk of the analysis. The pilot project will also allow you to familiarise yourself with the sample analysis process, the data generated and the means of analysis before embarking on the main project.


Omics technologies produce mountains of data, and handling that volume often requires some expertise in big data. Fortunately, we have tools and resources to store and process your data, making it easier for you to understand. Contact our team of expert bioinformaticians for assistance at all levels of your project.

Dr Katie Lennard

Bioinformatician at the Institute of Infectious Diseases & Molecular Medicine


  • Genomics

  • Transcriptomics

  • Differential Abundance Statistics



  • 16S rRNA gene amplicon sequencing

  • WGS metagenomics sequencing

  • RNAseq

  • Pathogen isolate profiling


  • Multivariate analyses: PCA, NMDS, MDS, PERMANOVA, PLSDA, RDA

  • Machine learning techniques; Random forests

  • Statistical tools for differential abundance testing

  • Nextflow

  • edgeR

  • metagenomeSeq (R)

©2018 by SA-DIPLOMICS.