• Wellcome Connecting Science

Bioinformatics for Biologists: Analysing and Interpreting Genomics Datasets

Boost your career by gaining the skills to install and modify the elements of a bioinformatics workflow to suit your needs.

15,585 enrolled on this course

  • Duration

    3 weeks
  • Weekly study

    6 hours
  • 100% online

  • Digital upgrade

    Free

Gain practical data science skills

Bioinformatics is crucial in helping to understand large and complex sets of biological data.

On this three-week course, you’ll develop your knowledge of bioinformatics and gain practical experience in scaling up analysis and working with large and complex datasets effectively.

With this knowledge, you’ll have the skills to tackle real-world genomics research challenges, enhancing your career prospects in fields such as genomic medicine, bioinformatics, and data science.

Develop your understanding of DNA sequencing

You’ll start by gaining an introduction to sequencing technologies and current sequencing outputs.

Dive into the rich history of DNA sequencing and how it has evolved in the Next-Generation Sequencing (NGS) era before gaining practical skills in mapping sequencing data to a reference genome.

Delve into workflow management and analysis

Next, you’ll unpack bioinformatics workflows as you learn how to use workflow management software such as Nextflow.

With this knowledge, you’ll gain skills to adapt existing workflows to your specific needs and optimise your analysis processes.

Explore data analysis with R

Finally, you’ll explore downstream analysis with the versatile R programming language. You’ll learn how to work with pipeline outputs in R to conduct in-depth analysis and visualisation.

Guided by the experts at Wellcome Connecting Science and using hands-on exercises, you’ll finish the course with both the knowledge and practical skills to effectively handle and analyse sequence datasets.


Welcome to the Bioinformatics for Biologists: Analysing and Interpreting Genomics Datasets course. We will take you on an exciting bioinformatics journey. The course starts with installing the most widely used bioinformatics tools and an overview of next-generation sequencing technologies and their outputs. You will then master essential commands for quality control, mapping, and variant calling. After that, you will discover the power of workflow managers such as Nextflow to transform your ability to build reproducible bioinformatics pipelines. You’ll also be introduced to pre-existing pipelines and learn how to adapt them to automate your own analysis. This will enable you to explore your own data and run bioinformatics pipelines more efficiently.

Finally, you will learn how to perform downstream analysis of pipeline outputs in R by using powerful R packages such as ggplot2 to create informative figures and uncover valuable insights from your results. Join us on this course to learn some transformative ways of analysing genomics data.

Syllabus

  • Week 1

    Introduction to sequence quality control, mapping and variant calling

    • Introduction to Week 1

      Welcome to the course and to Week 1. How we learn on this course, meeting your educators and general introductions. Introductory steps: glossary, file formats explained, software setup for the course.

    • DNA sequencing, history and present

      Introduction to and history of DNA sequencing, Next Generation Sequencing and key steps of the analysis pipeline explained.

    • Sequence quality control

      What a FASTQ file is and why sequence quality control matters. Running FastQC and interpreting FastQC results.

    • Mapping

      Mapping of the sample sequence, exercise and discussion

    • Variant calling

      Identifying and characterising genetic variants, exercise and discussion

    • Summary of Week 1

      Short summary of Week 1 learning and help area.

  • Week 2

    Workflow Management and Analysis

    • Introduction to workflow management

      Introduction to Week 2, covering workflows and workflow management systems. Introducing the nf-core pipeline to be used in the course.

    • Workflow management and analysis

      Installing and using the Nextflow workflow manager, running the viralrecon pipeline and analysing the results.

    • Summary of Week 2

      Tidy up and reflect while watching the interview with Dr Phelelani T. Mpangase, then access a short summary of the week's learning and the help area for learners.

  • Week 3

    Downstream analysis with R

    • Introduction to visualisation using RStudio

      Introductory information for Week 3: what is covered and what prerequisite knowledge is required. Best practices for setting up your work environment.

    • Exploring the data

      Exploration of the data structures used for later visualisation and analysis, subsetting the data, and summary statistics on the data.

    • Data visualisation and analysis

      Data visualisation and plot transformation using ggplot2, variant exploration, exercise and discussion.

    • Summary of Week 3

      Short summary of the week's learning and help area, followed by the final test for this course.

When would you like to start?

Start straight away and join a global classroom of learners. If the course hasn’t started yet you’ll see the future date listed below.

  • Available now

Learning on this course

If you'd like to take part while our educators are leading the course, they'll be joining the discussions in the comments between these dates:

  • 17 Mar 2025 - 6 Apr 2025

On every step of the course you can meet other learners, share your ideas and join in with active discussions in the comments.

What will you achieve?

By the end of the course, you’ll be able to...

  • Use software managers to install and run reproducible bioinformatics tools
  • Handle and analyse sequence datasets through hands-on exercises
  • Analyse quality control metrics for sequencing data
  • Modify existing workflows to suit specific task requirements and optimise analysis processes
  • Interrogate and interpret the results obtained from running bioinformatics pipelines
  • Perform downstream analyses of pipeline outputs using R, enabling data visualisation and further exploration

Who is the course for?

This course is designed for those with some prior knowledge of bioinformatics. We recommend starting with our Bioinformatics for Biologists: An Introduction to Linux, Bash Scripting, and R course.

You will need access to a Linux-based operating system and some prior knowledge of the Linux command line to benefit fully from this course.

This course will be particularly useful for genomics researchers, molecular biologists, bioinformatics practitioners, and anyone looking to pursue a career in data science.

Who will you learn with?

Andries van Tonder

I'm a researcher with extensive experience analysing large bacterial genome datasets. My specific research areas include using WGS to study transmission in different bacterial species.

Ruth Nanjala

Ruth is a bioinformatician who’s previously led the Bioinformatics Mentorship and Incubation program at icipe, Kenya. She is also a Carpentries instructor.

Fatma Guerfali

Researcher at Institut Pasteur in Tunis and trainer in bioinformatics. Passionate about data analysis and visualization for pathogen-related genomics and transcriptomics.

Who developed the course?

Wellcome Connecting Science


Wellcome Connecting Science develops and delivers open postgraduate courses and conferences focused on biomedicine.

What's included?

Wellcome Connecting Science are offering everyone who joins this course a free digital upgrade, so that you can experience the full benefits of studying online for free. This means that you get:

  • Unlimited access to this course
  • Includes any articles, videos, peer reviews and quizzes
  • Tests to validate your learning
  • A PDF Certificate of Achievement to prove your success when you’re eligible
  • Learning on FutureLearn

    Your learning, your rules

    • Courses are split into weeks, activities, and steps to help you keep track of your learning
    • Learn through a mix of bite-sized videos, long- and short-form articles, audio, and practical activities
    • Stay motivated by using the Progress page to keep track of your step completion and assessment scores

    Join a global classroom

    • Experience the power of social learning, and get inspired by an international network of learners
    • Share ideas with your peers and course educators on every step of the course
    • Join the conversation by reading, @ing, liking, bookmarking, and replying to comments from others

    Map your progress

    • As you work through the course, use notifications and the Progress page to guide your learning
    • Whenever you’re ready, mark each step as complete; you’re in control
    • Complete 90% of course steps and all of the assessments to earn your certificate


    You can use the hashtag #FLAdvancedBioinfomatics to talk about this course on social media.