Bioinformatics Skills for Microbial Genomics – 02 February 2022

Bioinformatics relies on a vast number of tools (packages, electronic notebooks, programming languages and their libraries) that bioinformaticians need to be able to install, manage and run. Organising data inputs and outputs is a growing challenge, particularly as genomic datasets continue to expand.

This one-day training workshop will introduce key concepts and working practices that address these challenges and are rapidly being adopted in the industry, including:

  • Using containers (such as Docker and Singularity) – currently the easiest method for managing and deploying software, which also makes code easier to share and pipelines more reproducible.
  • Workflow languages (Nextflow DSL2) – workflow managers provide a framework for running analyses. They intrinsically provide a degree of data provenance and make it easy to re-run analyses with different datasets or parameters in a range of computing environments.
  • GNU/Linux command-line

Prerequisites 

  • You will need a basic understanding of navigating the GNU/Linux command line. You should be able to use commands such as cd, ls, cat and grep.
  • You will need a basic understanding of microbial genomics.
  • You will need a stable internet connection and a web browser.

Outcomes

By the end of the workshop, you will have:

  • Learned how bioinformaticians organise their data and analysis.
  • Learned how to deploy bioinformatics software through Linux containers.
  • Been introduced to chaining bioinformatics software to run in a “pipeline” via Nextflow.
  • Been introduced to writing your own workflows using existing Nextflow modules.
  • Learned how to use these frameworks to run regular bioinformatics analyses such as assembling a microbial genome, creating a phylogenetic tree, and running basic genotyping.

Programme (GMT)

09:00 | Orientation and Testing VMs – participants will be given credentials to access their VMs during this Orientation Session.

10:00 | Welcome – Organising Committee

10:10 | How does a modern bioinformatician organise their work? – (Slides) – Nabil-Fareed Alikhan, Quadram Institute Bioscience

10:50 | Getting things done with Conda and Snakemake – Anna Price, Cardiff University

11:30 | The value and use of containers – Anna Price, Cardiff University

12:00 | Lunch Break

13:00 | Practical session 1 – Assemble and examine a microbial genome using containers – Anna Price, Cardiff University

14:30 | Provenance and portability through Nextflow – (Slides) – Andrea Telatin, Quadram Institute Bioscience

15:00 | Practical session 2 – Basic bioinformatics using Nextflow – Andrea Telatin & Nabil-Fareed Alikhan, Quadram Institute Bioscience

16:20 | Afternoon Break

16:50 | Working with Nextflow, DSL2 modules and Bactopia – (Slides) – Robert Petit, Wyoming Public Health Laboratory

17:20 | Discussion Panel and Q&A

18:00 | Final Remarks

Other resources

Organising committee