Max-Planck-Institut für Informatik

MPI-INF or MPI-SWS or Local Campus Event Calendar

What and Who
Title:Fast methylation calling on mammalian bisulfite sequencing data
Speaker:Jonas Fischer
coming from:Fachrichtung Informatik - Saarbrücken
Speakers Bio:Graduate Student Informatics, UdS
Event Type:PhD Application Talk
Visibility:D1, D2, D3, D4, D5, SWS, RG1, MMCI
Level:Public Audience
Language:English
Date, Time and Location
Date:Tuesday, 10 October 2017
Time:11:00
Duration:60 Minutes
Location:Saarbrücken
Building:E1 4
Room:024
Abstract
Major advances in Next Generation Sequencing (NGS) have improved accuracy while drastically reducing the time required to sequence large genomic libraries. One such library type is whole genome bisulfite sequencing (WGBS), which measures DNA methylation, a covalent modification of the DNA. DNA methylation is one of the most studied epigenetic marks and is known to be responsible for developmental changes in the genomic landscape such as X-chromosome inactivation, genomic imprinting, and silencing of pluripotency-associated genes. Furthermore, DNA methylation is associated with neurodegenerative diseases and cancer, where affected cells show aberrant methylation patterns.

However, while advances in NGS have reduced the time for WGBS library sequencing, the analysis algorithms remain slow and take days for a large data set, a serious bottleneck in current applications. In this talk I will explain the main computational problems arising from the WGBS protocol, abstract them to a string matching problem, and present how state-of-the-art approaches tackle this problem. Then I will outline the main ideas of our solution for overcoming the limitations of current software. Our approach revolves around a succinct index representation of the reference genome that uses a fast cyclic rolling hash function tailored to k-mers of genomic sequences. To align bisulfite reads to the reference genome, we use this index to find candidate regions in the genome, which are then filtered by several heuristics to drastically reduce the search space. The remaining candidates are validated by a modified Shift-And automaton that allows for asymmetric C/T mapping. The overview will be rounded off by a benchmark on sampled and real data showing that our approach is one order of magnitude faster than competing algorithms while maintaining similar or even better performance.
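
The validation step can be illustrated with a minimal sketch of a Shift-And automaton with asymmetric C/T matching. This is not the speaker's implementation. It assumes forward-strand reads only, exact matching apart from the C/T rule, and the usual bisulfite convention that an unmethylated reference C is read as T, so a T in the read may match C or T in the reference while a C in the read matches only C. The function names `build_masks` and `shift_and_bisulfite` are hypothetical.

```python
def build_masks(read):
    """Build one bit mask per reference base for the Shift-And automaton.

    Bit i of masks[b] is set if read[i] is allowed to match reference base b.
    Assumption: a T in the read may originate from an unmethylated C in the
    reference, so T positions are also entered into the mask for C.
    """
    masks = {base: 0 for base in "ACGT"}
    for i, base in enumerate(read):
        bit = 1 << i
        masks[base] = masks.get(base, 0) | bit
        if base == "T":
            masks["C"] |= bit  # asymmetric rule: read T matches reference C
    return masks


def shift_and_bisulfite(reference, read):
    """Return all start positions where the read matches the reference."""
    if not read:
        return []
    masks = build_masks(read)
    accept = 1 << (len(read) - 1)  # bit marking a fully matched read
    state = 0
    hits = []
    for pos, base in enumerate(reference):
        # Extend every active prefix by one base and start a new prefix.
        state = ((state << 1) | 1) & masks.get(base, 0)
        if state & accept:
            hits.append(pos - len(read) + 1)
    return hits


print(shift_and_bisulfite("ACGTCCGA", "TTG"))  # [4]
```

In this example the read `TTG` matches the reference substring `CCG` at position 4 because both T's may stem from converted C's, whereas a read `CCG` would not match a reference `TTG`. A real aligner would additionally handle the reverse strand (G/A) and mismatches.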

Contact
Name(s):IMPRS Office Team
Phone:0681 93251800
EMail:--email address not disclosed on the web
Video Broadcast
Video Broadcast:No

Created by:Aaron Alsancak/MPI-INF, 10/09/2017 01:41 PM
Last modified by:Uwe Brahm/MPII/DE, 10/10/2017 07:01 AM
  • Aaron Alsancak, 10/09/2017 01:46 PM -- Created document.