Bioinformatics Skills for Microbial Genomics – 02 February 2022

Bioinformatics relies on a vast number of tools (packages, electronic notebooks, programming languages and their libraries) that bioinformaticians need to be able to install, manage and run. Organising data inputs and outputs is a growing challenge, particularly as genomic datasets continue to expand.

This one-day training workshop will introduce key concepts and ways of working that address these challenges and are rapidly being adopted in the industry, including:

  • Containers (such as Docker and Singularity) – currently the easiest way to manage and deploy software; they also make code easier to share and pipelines more reproducible.
  • Workflow languages (Nextflow DSL2) – workflow managers provide a framework for running analyses. They intrinsically provide a degree of data provenance and make it easy to re-run analyses with different datasets or parameters in a range of computing environments.
  • GNU/Linux command-line

Registration

Registration is free, but you need to register for a place. A maximum of 70 participants will be allowed, with preference given to CLIMB-BIG-DATA users. If the workshop is oversubscribed, selection will be necessary and we’ll let participants know by 26 Jan 2022.

Participants will work in pairs (or small groups) during the practical sessions, on a cloud virtual machine. 

Prerequisites 

  • You will need a basic understanding of navigating the GNU/Linux command line. You should be able to use commands such as cd, ls, cat and grep.
  • You will need a basic understanding of microbial genomics.
  • You will need a stable internet connection and a web browser.

Outcomes

By the end of the workshop:

  • You will learn how bioinformaticians organise their data and analysis. 
  • You will learn how to deploy bioinformatics software through Linux containers. 
  • You will be introduced to chaining bioinformatics software to run in a “pipeline” via Nextflow.
  • You will be introduced to writing your own workflows using existing Nextflow modules.
  • You will learn how to use these frameworks to run regular bioinformatics analyses such as assembling a microbial genome, creating a phylogenetic tree, and running basic genotyping.

Programme

Join via Zoom

9:00 | Orientation and testing virtual machines | Organising committee

10:00 | Formal welcome | Nabil-Fareed Alikhan, Quadram Institute Bioscience

10:10 | Lecture: How does a modern bioinformatician organise their work? | Anna Price, Cardiff University

10:50 | Lecture: Getting things done with Conda and Snakemake | Anna Price, Cardiff University

11:30 | Lecture: The value and use of containers | Anna Price, Cardiff University

12:00 | Lunch Break

13:00 | Practical session 1 – Assemble and examine a microbial genome using containers | Anna Price, Cardiff University

14:30 | Lecture: Provenance and portability through Nextflow | Andrea Telatin, Quadram Institute

15:00 | Practical session 2 – Basic bioinformatics using Nextflow | Andrea Telatin & Nabil-Fareed Alikhan

16:20 | Afternoon Break

16:50 | Lecture: Working with Nextflow, DSL2 modules and Bactopia | Robert Petit, Wyoming Public Health Laboratory

17:20 | Discussion Panel | All

18:00 | Final Remarks | Organising committee

18:10 | End of workshop