Bioinformatics relies on a vast number of tools (packages, electronic notebooks, programming languages and their libraries) that bioinformaticians need to be able to install, manage and run. Organising data inputs and outputs is a growing challenge, particularly as genomic datasets continue to expand.
This one-day training workshop will introduce key concepts and ways of working that address these challenges and are rapidly being adopted in industry, including:
- Using containers (such as Docker and Singularity) – currently the easiest method for managing and deploying software, which also makes code easier to share and pipelines more reproducible.
- Workflow languages (Nextflow DSL2) – workflow managers provide a framework for running analyses. They intrinsically provide a degree of data provenance and make it easy to re-run analyses with different datasets or parameters in a range of computing environments.
- GNU/Linux command-line
- You will need a basic understanding of navigating the GNU/Linux command line. You should be able to use commands such as cd, ls, cat and grep.
- You will need a basic understanding of microbial genomics.
- You will need a stable internet connection and a web browser.
By the end of the workshop,
- You will learn how bioinformaticians organise their data and analysis.
- You will learn how to deploy bioinformatics software through Linux containers.
- You will be introduced to chaining bioinformatics software to run in a “pipeline” via Nextflow.
- You will be introduced to writing your own workflows using existing Nextflow modules.
- You will learn how to use these frameworks to run regular bioinformatics analyses such as assembling a microbial genome, creating a phylogenetic tree, and running basic genotyping.
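To give a flavour of what deploying software through a container looks like, here is a minimal sketch in Python that only builds (and prints) a `docker run` command line. It is not taken from the workshop material: the image name `example/assembler:1.0`, the tool name `assembler` with its options, and the project path are hypothetical placeholders. Only the Docker flags `--rm`, `-v` and `-w` are real.

```python
import shlex


def container_command(image, tool_args, data_dir, mount_point="/data"):
    """Build a `docker run` argument list that runs a tool inside a container.

    The host directory `data_dir` is mounted at `mount_point` inside the
    container and used as the working directory, so inputs and outputs
    stay on the host file system.
    """
    return [
        "docker", "run",
        "--rm",                              # remove the container when it exits
        "-v", f"{data_dir}:{mount_point}",   # mount the host data directory
        "-w", mount_point,                   # run the tool from the mounted directory
        image,
        *tool_args,
    ]


# Hypothetical image, tool and paths, for illustration only.
cmd = container_command(
    image="example/assembler:1.0",
    tool_args=["assembler", "--reads", "reads.fastq", "--out", "assembly"],
    data_dir="/home/user/project",
)

# Print the command as it would be typed in a shell; nothing is executed here.
print(shlex.join(cmd))
```

Because the tool and its dependencies live in the image, the same command should behave the same way on any machine with Docker installed. That is the property workflow managers such as Nextflow build on when they run each pipeline step in its own container.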
09:00 | Orientation and Testing VMs – participants will be given credentials to access their VMs during this Orientation Session.
10:00 | Welcome – Organising Committee
10:10 | How does a modern bioinformatician organise their work? – (slides) – Nabil-Fareed Alikhan, Quadram Institute Bioscience
10:50 | Getting things done with Conda and Snakemake – Anna Price, Cardiff University
11:30 | The value and use of containers – Anna Price, Cardiff University
12:00 | Lunch Break
13:00 | Practical session 1 – Assemble and examine a microbial genome using containers – Anna Price, Cardiff University
15:00 | Practical session 2 – Basic bioinformatics using Nextflow – Andrea Telatin & Nabil-Fareed Alikhan, Quadram Institute Bioscience
16:20 | Afternoon Break
17:20 | Discussion Panel and Q&A
18:00 | Final Remarks
- If you haven’t used the shell before: https://swcarpentry.github.io/shell-novice/
- Nextflow and Snakemake head-to-head comparison: https://github.com/fmaguire/amr_training_workshop_practical