Assembling Genomes and Finding Disease-Causing Mutations

Description

Carsonella ruddii is a bacterium that lives symbiotically inside some insects. Its sheltered life has allowed it to reduce its genome to only about 160,000 base pairs. With only about 200 genes, it lacks some genes necessary for survival, but its insect host supplies them. In fact, Carsonella has such a small genome that biologists have conjectured that it is losing its “bacterial” identity and turning into an organelle, a permanent component of the host's cells. This transition from bacterium to organelle has happened many times during evolutionary history; the mitochondrion responsible for energy production in human cells, for example, was once a free-roaming bacterium that an ancestral cell assimilated in the distant past.

Given a collection of simulated error-free read-pairs, use the paired de Bruijn graph to reconstruct the Carsonella ruddii genome. Compare this assembly to the one obtained from the classic de Bruijn graph (i.e., when only the reads themselves are known, not the distance between paired reads) to better appreciate the benefits of read-pairs. For each k, what is the minimum value of d needed to reconstruct the entire Carsonella ruddii genome from its (k, d)-mer composition?
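To make the task concrete, here is a minimal Python sketch of reconstruction from read-pairs. It takes a (k, d)-mer to be a pair of k-mers separated by a gap of d nucleotides. The function names are hypothetical, the genome is treated as a linear string, and the input is a short toy sequence rather than the Carsonella data. The sketch follows a single Eulerian path, so when the paired de Bruijn graph has several such paths it may return None or a wrong string; that is the situation a larger d is meant to avoid.

```python
from collections import defaultdict

def paired_composition(text, k, d):
    """All (k, d)-mers of text: two k-mers separated by a gap of d nucleotides."""
    return [(text[i:i + k], text[i + k + d:i + 2 * k + d])
            for i in range(len(text) - 2 * k - d + 1)]

def paired_de_bruijn(pairs):
    """Paired de Bruijn graph: each read-pair is an edge from its pair of
    (k-1)-mer prefixes to its pair of (k-1)-mer suffixes."""
    graph = defaultdict(list)
    for first, second in pairs:
        graph[(first[:-1], second[:-1])].append((first[1:], second[1:]))
    return graph

def eulerian_path(graph):
    """One Eulerian path (Hierholzer's algorithm); assumes such a path exists."""
    out_edges = {node: list(targets) for node, targets in graph.items()}
    balance = defaultdict(int)  # out-degree minus in-degree
    for node, targets in out_edges.items():
        balance[node] += len(targets)
        for target in targets:
            balance[target] -= 1
    # Start at the node with one surplus outgoing edge, if there is one.
    start = next((n for n, b in balance.items() if b == 1), next(iter(out_edges)))
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if out_edges.get(node):
            stack.append(out_edges[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]

def spell(path, k, d):
    """Spell the string along a path of paired (k-1)-mers.
    Returns None if the two halves disagree or are too short to overlap."""
    first = path[0][0] + "".join(node[0][-1] for node in path[1:])
    second = path[0][1] + "".join(node[1][-1] for node in path[1:])
    if len(first) < k + d:
        return None
    overlap = first[k + d:]  # must coincide with the start of `second`
    if second[:len(overlap)] != overlap:
        return None
    return first + second[len(overlap):]

def reconstruct(pairs, k, d):
    """Reconstruct a string from its (k, d)-mer composition."""
    return spell(eulerian_path(paired_de_bruijn(pairs)), k, d)

genome = "TAATGCCATGGGATGTT"  # toy sequence, not Carsonella data
pairs = paired_composition(genome, k=3, d=1)
print(reconstruct(pairs, k=3, d=1))  # prints TAATGCCATGGGATGTT
```

For the comparison with the classic de Bruijn graph, the same graph-building and path-finding steps can be run on the plain k-mers of the reads. Repeats longer than k then create branching nodes that the paired information often resolves.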


