Tips for delivering training on CLIMB-BIG-DATA Jupyter Notebooks

Here are some key tips and useful information for trainers delivering workshops on CLIMB-BIG-DATA Jupyter Notebooks:

Registration and Access

  • The CLIMB-BIG-DATA team will create a new, dedicated group on Bryn for the lead trainers. The lead trainers then invite trainees to join the group. This keeps research data separate from training data.
  • Each trainee gets their own Jupyter Notebook server, which is created when they launch it.
  • Notebooks shut down automatically after 24 hours of inactivity. Any running jobs are interrupted at that point, but the environment and data are preserved and will be found intact when the notebook is restarted.

Storage

  • There is a small 20GB workspace on the notebook itself. Do not use this space to store data.
  • Each group has a 1TB shared team space for collaboration and sharing files. It is good practice to treat this as scratch space, that is, temporary storage for data, code and environments. Do not use this space to store data long-term.
  • There is also a shared public space with databases maintained by the CLIMB-BIG-DATA team. This space is read-only.
  • Datasets can (and should) be stored in S3 object storage buckets.
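
Because the notebook workspace described above is small, it can help to check how full it is from a notebook cell. The following is a minimal sketch using only the Python standard library. The helper name workspace_usage and the choice of the home directory as the path to check are assumptions made for illustration; they are not part of CLIMB-BIG-DATA.

```python
import shutil
from pathlib import Path


def workspace_usage(path="~"):
    """Return (used_gib, total_gib, percent_used) for the filesystem holding `path`."""
    usage = shutil.disk_usage(Path(path).expanduser())
    gib = 1024 ** 3  # bytes per GiB
    percent = 100 * usage.used / usage.total
    return usage.used / gib, usage.total / gib, percent


# Assumption: the home directory lives on the notebook's own workspace.
used_gib, total_gib, percent = workspace_usage()
print(f"{used_gib:.1f} GiB used of {total_gib:.1f} GiB ({percent:.0f}%)")
```

Passing a different path (for example, wherever the shared team space is mounted on your notebook) reports the usage of that filesystem instead.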

Software and Environments

  • Conda is available, but the base environment is read-only. Trainees need to create named environments to install packages.
  • Trainers can pre-create environments in the shared team space. Trainees can then use these environments straight away, without installing anything themselves.
  • Nextflow pipelines run in separate containers with access to more resources than the notebook itself.
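
As an illustration of pre-creating an environment outside the read-only base, the sketch below builds a conda create command that installs into an explicit directory using conda's --prefix option. The directory path and the package list are hypothetical placeholders; substitute the real location of your shared team space and the packages your course needs.

```python
import shlex


def conda_create_command(env_path, packages):
    """Build a `conda create` command that installs into an explicit directory."""
    return ["conda", "create", "--yes", "--prefix", str(env_path), *packages]


# Hypothetical location and package list; replace with your own.
cmd = conda_create_command(
    "/path/to/team-space/envs/workshop",
    ["python=3.11", "pandas"],
)

# Print the command so it can be reviewed and pasted into the terminal.
print(shlex.join(cmd))
```

Printing the command rather than running it keeps the step easy to review. Once the environment exists, trainees should be able to activate it by path, for example with conda activate followed by the same directory.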

CLIMB-BIG-DATA Notebooks – not just Jupyter: Python, RStudio, Linux Terminal, File Uploader and Downloader

  • The CLIMB-BIG-DATA Jupyter Notebook is not a simple Python Notebook. There’s more in there.
  • There is a terminal, which is immediately accessible without SSH keys or passwords. Yes, a full Linux terminal.
  • In Python notebooks, a line starting with “!” is passed to the shell, so it runs as a bash command (for example, “!ls”).
  • The text editor can be used to edit files without using command-line editors. 
  • Files like images and CSVs have built-in viewers.
  • Notebooks can be exported to PDF or Markdown for course materials.
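
The “!” shorthand mentioned above only works inside notebook cells. When the same step needs to live in a plain Python script or function, the standard subprocess module can be used instead. The following is a minimal sketch; the helper name run_shell is an assumption made for illustration.

```python
import subprocess


def run_shell(command):
    """Run a command through bash and return its standard output as text."""
    result = subprocess.run(
        ["bash", "-c", command],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


print(run_shell("echo hello from the shell"))
```

In a notebook cell, the shorter form would simply be !echo hello from the shell. The function form is mainly useful when trainees need to capture the output or stop on errors.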

Other Tips

  • Test tutorials thoroughly before delivery.
  • Create a backup of the material before starting the workshop. Remember that the shared team storage can be accessed and used by all the students, so material kept there may be changed during the session.
  • Consider how to structure tutorials for effective learning.
  • Use notebook sharing to help with troubleshooting.
  • Ask for feedback to improve tutorials over time.