Bioinformatics Portal

Bioinformatics is the analysis and functional annotation of biological datasets. As the omics fields have developed, they have generated ever larger quantities of data. Data on this scale cannot be analysed, processed and viewed as text files or even in spreadsheets. The field of bioinformatics emerged to address the complexity and size of the data generated by omics research. Researchers have developed tools to manage and analyse omics data. These tools are often open source and freely available, but some have also been bundled into commercial software packages. While it is possible to learn how to analyse your own data, the process may be very challenging, and it is often best to recruit a dedicated bioinformatician to assist with data management. This portal catalogues the bioinformatics tools, resources and experts available locally to assist with data analysis.

Workflow Software

Bioinformatics analysis, especially of NGS data, may require a number of steps involving different software and conversions between file formats to reach the desired result. To avoid re-inventing the wheel every time an analysis needs to be performed, data analysis workflows have been constructed from recognised and accepted software. Furthermore, workflow software has been developed to make it easier to construct workflows from existing software.
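As a rough illustration of what workflow software automates, the sketch below chains two processing steps in plain Python: a file-format conversion (FASTQ to FASTA) followed by a summary step. The function names and the toy reads are assumptions made for this example only. A real workflow would call established tools through a workflow manager rather than hand-written functions.

```python
from functools import reduce


def fastq_to_fasta(fastq_text):
    """Convert FASTQ text (4 lines per record) to FASTA text."""
    lines = fastq_text.strip().splitlines()
    records = []
    # Each FASTQ record is: @header, sequence, '+', quality string.
    for i in range(0, len(lines), 4):
        header, sequence = lines[i], lines[i + 1]
        records.append(">" + header[1:] + "\n" + sequence)
    return "\n".join(records) + "\n"


def sequence_lengths(fasta_text):
    """Return {read name: sequence length} from FASTA text."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            name = line[1:]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def run_workflow(data, steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda value, step: step(value), steps, data)


# Toy input, invented for illustration only.
EXAMPLE_FASTQ = "@read1\nACGTAC\n+\nIIIIII\n@read2\nGGCA\n+\nIIII\n"

result = run_workflow(EXAMPLE_FASTQ, [fastq_to_fasta, sequence_lengths])
print(result)
```

Keeping each step as a separate function with a plain input and output is what lets steps be reordered, replaced or reused, which is the same idea workflow managers apply to whole programs.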


"Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages"


"Galaxy is an open source, web-based platform for data intensive biomedical research"



Tool Collections

Once peptides have been identified and relative protein abundances determined, statistical tests determine which of the proteins are significantly differentially expressed between biological conditions and require further study.
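When many proteins are tested at once, the resulting p-values are commonly adjusted for multiple testing before calling proteins significant. The sketch below shows one widely used adjustment, the Benjamini-Hochberg procedure. The choice of this procedure, the 0.05 threshold and the protein names and p-values are all assumptions made for illustration, not something prescribed by the tools listed here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-protein p-values, invented for illustration.
P_VALUES = {"protA": 0.001, "protB": 0.02, "protC": 0.03, "protD": 0.5}

names = list(P_VALUES)
q_values = benjamini_hochberg([P_VALUES[n] for n in names])
significant = [n for n, q in zip(names, q_values) if q < 0.05]
print(significant)
```

In practice the p-values would come from a statistical test applied to each protein, and an established statistics package would normally be used for both the test and the adjustment.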

Seattle Proteome Center Proteomics Tools

"The Seattle Proteome Center (SPC) is committed to providing free, open-source, software projects in support of cutting-edge proteomics research."

EBI Tools

"The European Bioinformatics Institute (EMBL-EBI) maintains the world’s most comprehensive range of freely available and up-to-date molecular data resources."


"MaxQuant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data."


"ProteoWizard provides a set of open-source, cross-platform software libraries and tools (e.g. msconvert, Skyline, IDPicker, SeeMS) that facilitate proteomics data analysis"

Bio Tools

"Essential scientific and technical information about software tools, databases and services for bioinformatics and the life sciences."


"H3ABioNet has developed a number of tools and services for H3Africa and the broader bioinformatics community"

Broad Institute

"Broad Institute of MIT and Harvard, collection of genomics tools"

High-Performance Computer Clusters

Due to the large volume and complexity of omics data, it is not always possible to analyse these data on standard desktop computers. Large computer clusters are often required, which are able to store large volumes of data and process them rapidly in parallel. A number of local HPC clusters are available through academic institutions.


"KRISP operates high-productivity computational and storage resources for life and biomedical sciences and maintain high expertise over computational infrastructure, software development, biological data analysis and web development"


"A consortium of universities and research organisations have established a data-intensive research cloud, Ilifu – which means cloud in isiXhosa – and are inviting researchers in these two strategic science domains to start using the infrastructure"


"CHPC is mandated to provide high performance computing (HPC) resources and domain specific support to both public and private sector users"

Public cloud computing services are available, where compute resources can be requested on demand.

Microsoft Azure

"Azure is a complete cloud platform that can host your existing applications and streamline new application development."

Amazon AWS

"Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform, offering over 175 fully featured services from data centers globally."

Google Cloud

"Cloud infrastructure solutions for SAP, VMware, Windows, Oracle, data center migration, and other enterprise workloads"


Bioinformatics Training

Picking up the Tabb

"David Tabb teaches a fairly broad range of coursework. This catalog of topics may be useful for student to get the training they need."

Training eSupport System

"Goals of ELIXIR is to train research scientists to better use available computational infrastructures to address critical research questions"

H3ABioNet Training

"H3ABioNet delivers high quality bioinformatics training in a variety of formats"



Bioinformaticians are available to assist you with your project.


The earlier you contact them, the more assistance they will be able to offer. In particular, experimental design is critical to the success of any project. Contacting a statistician and ensuring your experiment has enough statistical power will go a long way towards ensuring its success.
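To give a feel for what statistical power means for experimental design, the sketch below estimates how many samples per group a simple two-group comparison would need. It uses the textbook normal approximation n = 2 * ((z_(1 - alpha/2) + z_power) / d)^2, where d is the expected difference in means divided by the standard deviation. The function name and the default values (5% significance, 80% power) are assumptions for this example. Omics experiments usually need a statistician's input, since they involve many simultaneous tests.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided, two-sample comparison.

    effect_size is the expected difference in means divided by the
    standard deviation. Uses the normal approximation
    n = 2 * ((z_(1 - alpha/2) + z_power) / effect_size) ** 2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(samples_per_group(1.0))  # large expected effect
print(samples_per_group(0.5))  # smaller expected effect
```

Halving the expected effect size roughly quadruples the number of samples required, which is why an estimate of the effect size, for example from a pilot study, matters before committing a budget.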


Selecting the best technology for your project will ensure you get the best results. Omics research is costly, so choosing the most appropriate technology for your experiment and budget is critical.


It is best to first run a pilot study and have an expert check the quality of the results before continuing with the bulk of the analysis. The pilot project will also allow you to familiarise yourself with the sample analysis process, the data generated and the means of analysis before embarking on the main project.


Once you have produced the data, you will realise that omics technologies produce mountains of data. Dealing with this amount of data often requires some expertise in handling big data. Fortunately, we have tools and resources to store and process your data, making it easier for you to understand. Contact our team of expert bioinformaticians for assistance at all levels of your project.

Dr Katie Lennard

Bioinformatician at the Institute of Infectious Diseases & Molecular Medicine


  • Genomics

  • Transcriptomics

  • Differential Abundance Statistics



  • 16S rRNA gene amplicon sequencing

  • WGS metagenomics sequencing

  • RNAseq

  • Pathogen isolate profiling


  • Multivariate analyses: PCA, NMDS, MDS, PERMANOVA, PLSDA, RDA

  • Machine learning techniques: random forests

  • Statistical tools for differential abundance testing

  • Nextflow

  • edgeR

  • metagenomeSeq (R)

Dr Shaun Garnett

Post-Doctoral Fellow at University of Cape Town


  • Transcriptomics

  • Proteomics

  • Differential Abundance Statistics



  • Liquid Chromatography

  • Mass Spectrometry

  • Discovery Proteomics

  • Statistics

  • Expression Data Functional Annotation


  • MaxQuant

  • Skyline

  • TPP

  • clusterProfiler

  • topGO

  • STRINGdb

  • ReactomePA


©2018 by SA-DIPLOMICS.