Defense: Identification and Mixture Deconvolution of Ancient and Forensic DNA using Population Genomic Data

Speaker Name: 
Samuel Vohr
Speaker Title: 
PhD Candidate
Speaker Organization: 
Biomolecular Engineering & Bioinformatics
Start Time: 
Monday, December 5, 2016 - 11:00am
End Time: 
Monday, December 5, 2016 - 12:30pm
Biomed 200
Richard Green
Forensic scientists routinely use DNA for identification and to attribute samples to individuals. Although standard approaches are effective on a wide variety of samples in various conditions, issues such as low-template DNA samples and mixtures of DNA from multiple individuals pose significant challenges. Extreme examples of these challenges can be found in the field of ancient DNA, where DNA recovered from ancient remains is highly fragmented and marked by patterns of DNA damage. Additionally, ancient libraries are often characterized by low endogenous DNA content and contaminating DNA from outside sources. As a result, standard forensic approaches, such as amplification of short tandem repeats, are not effective on ancient samples. Instead, ancient DNA is routinely sequenced directly using high-throughput sequencing to survey the molecules that are present within a library. However, the resulting sequences are not easily compared for the purposes of identification, as each data set represents a random sample of the genome.
In this dissertation, I present two approaches for interpreting shotgun sequences that address two common issues in forensic and ancient DNA: extremely low nuclear genome coverage and mixtures of sequences from multiple individuals. First, I present an approach to test for a common source individual between extremely low-coverage sequence data sets that makes use of the vast number of single-nucleotide polymorphisms (SNPs) discovered by surveys of human genetic diversity. As almost no SNP positions will be observed in both samples, our method uses patterns of linkage disequilibrium as modeled by a panel of haplotypes to determine whether observations made across samples are consistent with originating from a single individual. I demonstrate the power of this approach using coalescent simulations, down-sampled high-throughput sequencing data, and published ancient DNA data. Second, I present an approach for interpreting mixtures of mitochondrial DNA sequences from multiple individuals. Mixed DNA samples are common in forensic investigations, either from the direct nature of a case (e.g., a sample containing DNA from both a victim and a perpetrator) or from outside contamination. I describe an expectation-maximization approach for detecting the mitochondrial haplogroups contributing to a mixture and partitioning fragments by haplogroup to reconstruct the underlying haplotypes. I demonstrate the approach’s feasibility, accuracy, and sensitivity on both in silico and in vitro sequence mixtures. Finally, I present the results of applying our mixture interpretation approach to ancient contact DNA recovered from ~700-year-old moccasin and cordage samples.
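For readers unfamiliar with expectation maximization, the general idea behind estimating mixture proportions and partitioning fragments can be sketched as below. This is a generic textbook-style illustration, not the method presented in the dissertation. It assumes that a likelihood of each fragment under each candidate haplogroup has already been computed; the function names (em_mixture_proportions, partition_fragments) and the toy numbers are hypothetical.

```python
def em_mixture_proportions(likelihoods, n_iter=200, tol=1e-9):
    """Estimate mixture proportions with a basic EM loop.

    likelihoods[i][k] is an assumed, precomputed value for
    P(fragment i | haplogroup k). Returns (proportions, responsibilities).
    """
    n = len(likelihoods)
    k = len(likelihoods[0])
    props = [1.0 / k] * k  # start from equal proportions
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each fragment came from
        # each haplogroup, given the current proportions.
        resp = []
        for row in likelihoods:
            weighted = [p * l for p, l in zip(props, row)]
            total = sum(weighted)
            if total == 0.0:
                raise ValueError("fragment has zero likelihood under all haplogroups")
            resp.append([w / total for w in weighted])
        # M-step: new proportions are the average responsibilities.
        new_props = [sum(r[j] for r in resp) / n for j in range(k)]
        converged = max(abs(a - b) for a, b in zip(new_props, props)) < tol
        props = new_props
        if converged:
            break
    return props, resp


def partition_fragments(resp):
    """Assign each fragment to its most probable haplogroup."""
    return [max(range(len(r)), key=r.__getitem__) for r in resp]


# Toy input (made up for illustration): three fragments that fit
# haplogroup 0 better and one fragment that fits haplogroup 1 better.
toy_likelihoods = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
proportions, responsibilities = em_mixture_proportions(toy_likelihoods)
assignments = partition_fragments(responsibilities)
```

In this sketch the fragments assigned to each haplogroup could then be pooled to call a consensus sequence per contributor. A real analysis would also need to model sequencing error and DNA damage in the likelihoods, which is omitted here.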