Journal article · Open Access · Jul 17, 2018

Whole-genome amplification in double-digest RADseq results in adequate libraries but fewer sequenced loci

PeerJ · Vol. 6 · e5089
DOI: 10.7717/peerj.5089
Abstract
Whole-genome amplification by multiple displacement amplification (MDA) is a promising technique to enable the use of samples with only a limited amount of DNA for the construction of RAD-seq libraries. Previous work has shown that, when the amount of DNA used in the MDA reaction is large, double-digest RAD-seq (ddRAD) libraries prepared with amplified genomic DNA result in data that are indistinguishable from libraries prepared directly from genomic DNA. Based on this observation, here we evaluate the quality of ddRAD libraries prepared from MDA-amplified genomic DNA when the amount of input genomic DNA and the coverage obtained for samples are variable. By simultaneously preparing libraries for five species of weevils (Coleoptera, Curculionidae), we also evaluate the likelihood that potential contaminants will be encountered in the assembled dataset. Overall, our results indicate that MDA may not be able to rescue all samples with small amounts of DNA, but it does produce ddRAD libraries adequate for studies of phylogeography and population genetics even when conditions are not optimal. We find that MDA makes it harder to predict the number of loci that will be obtained for a given sequencing effort, with some samples behaving like traditional libraries and others yielding fewer loci than expected. This seems to be caused by both stochastic and deterministic effects during amplification. Further, the reduction in loci is stronger in libraries with lower amounts of template DNA for the MDA reaction. Even though a few samples exhibit substantial levels of contamination in raw reads, the effect is very small in the final dataset, suggesting that filters imposed during dataset assembly are important in removing contamination.
Importantly, samples with strong signs of contamination and biases in heterozygosity were also those with fewer loci shared in the final dataset, suggesting that stringent filtering of samples with significant amounts of missing data is important when assembling data derived from MDA-amplified genomic DNA. Overall, we find that the combination of MDA and ddRAD results in high-quality datasets for population genetics as long as the sequence data are properly filtered during assembly.
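The recommendation to filter out samples with large amounts of missing data can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the input layout (a mapping from sample name to the set of locus IDs recovered for that sample), and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
def filter_samples_by_missing_data(locus_presence, max_missing=0.5):
    """Return {sample: missing_fraction} for samples at or below max_missing.

    locus_presence: dict mapping sample name -> set of locus IDs recovered
    for that sample (hypothetical input layout).
    max_missing: hypothetical cutoff; the abstract does not state a value.
    """
    # The reference set is every locus seen in at least one sample.
    all_loci = set().union(*locus_presence.values())
    if not all_loci:
        return {}

    kept = {}
    for sample, loci in locus_presence.items():
        # Fraction of reference loci that this sample lacks.
        missing_fraction = 1 - len(loci) / len(all_loci)
        if missing_fraction <= max_missing:
            kept[sample] = missing_fraction
    return kept


# Toy data: sample_C recovers only one of four loci and is dropped.
example = {
    "sample_A": {"loc1", "loc2", "loc3", "loc4"},
    "sample_B": {"loc1", "loc2", "loc3"},
    "sample_C": {"loc1"},
}
retained = filter_samples_by_missing_data(example, max_missing=0.5)
print(sorted(retained))
```

In practice the cutoff would be chosen by inspecting how the number of shared loci varies across samples rather than fixed in advance.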
Details
Published: Jul 17, 2018
Vol/Issue: 6
Pages: e5089
Funding
Harvard University William F. Milton Fund and the Harvard University Department of Organismic and Evolutionary Biology Graduate Research Fund
Museum of Comparative Zoology Putnam Expedition Grant and David Rockefeller Center for Latin American Studies Research Travel Grant
Bruno de Medeiros received a Jorge Paulo Lemann Fellowship for Research in Brazil
Cite This Article
Bruno A. S. de Medeiros, Brian D. Farrell (2018). Whole-genome amplification in double-digest RADseq results in adequate libraries but fewer sequenced loci. PeerJ, 6, e5089. https://doi.org/10.7717/peerj.5089