Population Genetics

Blischak, Paul [1], Kubatko, Laura [2], Wolfe, Andrea [3].

Estimating allele frequencies in non-model polyploids using high throughput sequencing data.

Despite the ever-increasing opportunity to collect large-scale datasets for population genomic analyses, high-throughput sequencing has rarely been applied to the study of polyploid populations. This is due in large part to the difficulty of determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty, ADU), which complicates the calculation of important quantities such as allele frequencies. This well-known problem has hindered population genetic studies in polyploids for decades, though several tools exist for analyzing genetic data from polyploids that address particular aspects of ADU. Additional complications arise from the mixed inheritance patterns and variable reproductive modes that are characteristic of many polyploid taxa, making the development of population genetic models for polyploids especially difficult. Here we describe a statistical model to estimate biallelic SNP frequencies in a population of polyploids using high-throughput sequencing data in the form of read counts. Uncertainty in the number of copies of an allele in an individual's genotype is accounted for by treating genotypes as an intermediate parameter in a hierarchical model. By summing over genotype uncertainty, we bridge the gap from data collection (using techniques such as restriction-site associated DNA sequencing) to allele frequency estimation in a unified inferential framework. Simulated datasets were generated under various conditions for both tetraploid and hexaploid populations to evaluate the model's performance and to help guide the collection of empirical data. We also discuss potential sources of bias that could influence results and propose model extensions to ameliorate some of these biases.
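
As an illustration of what summing over genotype uncertainty can look like, the following is a minimal sketch and not necessarily the formulation used in this work. It assumes that an individual's genotype (the number of reference-allele copies, 0 to the ploidy level) is a binomial draw given the allele frequency, that reference read counts are binomial given the genotype, and that a fixed sequencing error rate applies. The function names, the error rate, and the grid-search maximum-likelihood estimator are all hypothetical choices made for this example.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def log_likelihood(p, ref_reads, total_reads, ploidy, error=0.01):
    """Log-likelihood of allele frequency p at one biallelic SNP.

    Genotypes are never observed directly: for each individual the
    likelihood is summed over every possible reference-allele dosage g.
    """
    ll = 0.0
    for r, n in zip(ref_reads, total_reads):
        marginal = 0.0
        for g in range(ploidy + 1):
            # Assumed genotype model: g ~ Binomial(ploidy, p).
            genotype_prob = binom_pmf(g, ploidy, p)
            # Probability that a single read shows the reference allele,
            # allowing for an assumed symmetric sequencing error rate.
            q = (g / ploidy) * (1 - error) + (1 - g / ploidy) * error
            # Assumed read model: r ~ Binomial(n, q).
            marginal += genotype_prob * binom_pmf(r, n, q)
        ll += math.log(marginal)
    return ll


def estimate_frequency(ref_reads, total_reads, ploidy, error=0.01, grid=999):
    """Grid-search maximum-likelihood estimate of p on the open interval (0, 1)."""
    candidates = [(i + 1) / (grid + 1) for i in range(grid)]
    return max(
        candidates,
        key=lambda p: log_likelihood(p, ref_reads, total_reads, ploidy, error),
    )


if __name__ == "__main__":
    # Toy read counts for five tetraploid individuals (made up for illustration).
    ref = [12, 5, 0, 9, 20]
    tot = [20, 20, 15, 18, 20]
    print(estimate_frequency(ref, tot, ploidy=4))
```

Because the sum over g weights every dosage by its probability, low-coverage individuals contribute less information rather than being forced into a single called genotype. A Bayesian treatment would place a prior on the allele frequency and sample genotypes instead of maximizing over a grid.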

1 - Ohio State University, Evolution, Ecology and Organismal Biology, 456 Aronoff Laboratory, 318 W 12th Avenue, Columbus, OH, 43210, USA
2 - Ohio State University, Statistics, Cockins Hall, 1958 Neil Avenue, Columbus, OH, 43210, USA
3 - Ohio State University, Department of Ecology, Evolution, and Organismal Biology, 318 W. 12th Avenue, Columbus, OH, 43210-1293, USA

statistical modeling
SNP data
allele frequencies

Presentation Type: Oral Paper: Papers for Topics
Session: 71
Location: Salon 6/The Shaw Conference Centre
Date: Wednesday, July 29th, 2015
Time: 3:45 PM
Number: 71009
Abstract ID: 569
Candidate for Awards: Margaret Menzel Award

