CS700: Graduate Seminar in Computer Science & Informatics

Using a diploid method to improve read mapping and genotype calling based on next-generation sequencing data
Shuai Yuan, Department of Mathematics and Computer Science

Next-generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem in analyzing NGS data is mapping short sequencing reads back to the reference genome. Most existing software packages rely on a single uniform reference genome and do not automatically take genetic variants into consideration. At the same time, large proportions of incorrectly mapped reads compromise the interpretation of NGS experimental results. In this presentation, I will describe a method that produces a personalized diploid reference genome based on all known genetic variants of a particular individual. Using such a reference genome can improve mapping accuracy and significantly reduce the bias toward the reference allele in allele-specific expression analysis. Incorporating imputation into the diploid method also improves genotype calling results.
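
To make the idea of a personalized diploid reference concrete, the following is a minimal sketch and not the speaker's implementation. It assumes the individual's variants are phased single-nucleotide substitutions only (no indels), given as a hypothetical mapping from 0-based position to the allele carried on each of the two haplotypes. The function name `build_diploid_reference` and the toy sequence are illustrative assumptions.

```python
from typing import Dict, Tuple


def build_diploid_reference(
    reference: str,
    phased_snps: Dict[int, Tuple[str, str]],
) -> Tuple[str, str]:
    """Return two haplotype sequences derived from one reference sequence.

    Assumption: phased_snps maps a 0-based position to a pair
    (allele on haplotype 1, allele on haplotype 2). Only single-base
    substitutions are handled; indels would shift coordinates and are
    outside the scope of this sketch.
    """
    hap1 = list(reference)
    hap2 = list(reference)
    for pos, (allele1, allele2) in phased_snps.items():
        if not 0 <= pos < len(reference):
            raise ValueError(f"SNP position {pos} is outside the reference")
        hap1[pos] = allele1
        hap2[pos] = allele2
    return "".join(hap1), "".join(hap2)


# Toy example: two heterozygous sites on an 8-base reference.
reference = "ACGTACGT"
phased_snps = {2: ("G", "T"), 5: ("A", "C")}
hap1, hap2 = build_diploid_reference(reference, phased_snps)
print(hap1)
print(hap2)
```

In a setup like this, reads would be aligned against both haplotype sequences rather than the single reference, so a read carrying the non-reference allele at a heterozygous site has an exact match available and is not penalized for a mismatch.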