Quality Control Procedures Required for the Generation of Forensic Quality Mitogenome Reference Data
Kimberly Sturk-Andreaggi* & Charla Marshall | Armed Forces Medical Examiner System – Armed Forces DNA Identification Laboratory (AFMES-AFDIL)/SNA International
Walther Parson | Innsbruck Medical University
Abstract:
Mitochondrial DNA (mtDNA) analysis plays a specialized role in forensic applications, overcoming certain limitations of autosomal DNA markers. The high copy number and uniparental inheritance pattern are advantageous characteristics of mtDNA in cases involving shed hairs and aged skeletal elements, such as decades-old missing persons cases. Though the discriminatory power of mtDNA is limited by common haplotypes, next-generation sequencing (NGS) makes entire mitochondrial genome (mitogenome) data practically accessible, and these data can resolve common haplotypes into unique sequences. The primary implementation challenge of mitogenome analysis is a lack of forensic-quality reference data, which are required to determine the evidentiary weight of a match. To meet this need, the Armed Forces Medical Examiner System – Armed Forces DNA Identification Laboratory (AFMES-AFDIL) proposed to generate 5,000 mitogenomes as part of a National Institute of Justice (NIJ)-funded project. Mitogenome data were produced using robust laboratory procedures and automated processing, followed by independent data reviews by two experienced analysts and a multi-step quality control (QC) process. During review of the data, analysts assessed haplotypes for misalignment of homopolymer regions, nuclear mtDNA segment (NUMT) interference, sequencing errors, and other artifacts. The mtDNA haplogroup, which was predicted as part of the NGS analysis workflow, provided the analyst with phylogenetic nomenclature guidance and an invaluable QC check of the haplotype during data review. Replicate processing of samples with questionable variants, often with an alternate enrichment method, was performed to confirm the authenticity of the mtDNA haplotype. Once data review was completed for a population, samples with shared mitogenome haplotypes were subjected to nuclear DNA testing to assess their relatedness (i.e., to identify nuclear family members and potential second-degree relatives).
If maternal relatives were identified, only one sample from the lineage was included to ensure that the mitogenome data represented a random sampling of the population. Lastly, a series of QC checks was performed during submission to the EMPOP database. In addition to the confirmation of the haplogroup prediction, haplotypes were compared against a curated mitogenome dataset to identify any “abnormal” variants or combinations of variants (e.g., variants never observed before, known phantom mutations, or other haplotype irregularities). Haplotypes that did not pass the EMPOP QC checks were flagged for further review, resulting in the reanalysis of the existing NGS data or replicate processing to confirm the reported variants. Haplotypes that ultimately could not meet the QC criteria were excluded from the reference dataset. In the end, the AFMES-AFDIL produced over 6,000 forensic-quality mitogenome haplotypes for use by the forensic community.
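Two of the checks described above, grouping samples with shared haplotypes for relatedness testing and flagging variants never observed in a curated dataset, can be illustrated with a minimal sketch. This is not the AFMES-AFDIL or EMPOP software: the function names, the representation of a haplotype as a list of variant strings, and the toy data are all assumptions made for illustration only.

```python
from collections import defaultdict


def group_shared_haplotypes(samples):
    """Return lists of sample IDs whose variant sets are identical.

    `samples` maps a hypothetical sample ID to its list of variants.
    Only groups with more than one sample are returned, since those are
    the candidates for follow-up relatedness testing.
    """
    groups = defaultdict(list)
    for sample_id, variants in samples.items():
        # frozenset makes the haplotype hashable and order-independent
        groups[frozenset(variants)].append(sample_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


def flag_unobserved_variants(variants, curated_variants):
    """Return the variants absent from the curated set, sorted."""
    return sorted(set(variants) - set(curated_variants))


# Toy data (assumed, not real casework or database content)
samples = {
    "S1": ["263G", "750G", "1438G"],
    "S2": ["263G", "750G", "1438G"],
    "S3": ["73G", "263G", "16519C"],
}
curated = {"73G", "263G", "750G", "1438G"}

shared = group_shared_haplotypes(samples)
flags = {sid: flag_unobserved_variants(v, curated) for sid, v in samples.items()}
```

Here `shared` identifies S1 and S2 as a pair needing relatedness assessment, and `flags` marks the one variant in S3 that is missing from the toy curated set. In practice a flag would only prompt further review, such as reanalysis or replicate processing, rather than automatic exclusion.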