Item title:

The choices we make and the impacts they have: Machine learning and species delimitation in North American box turtles (Terrapene spp.).

Authors:
Martin BT; Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA.
Chafin TK; Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA.
Douglas MR; Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA.
Placyk JS Jr; Department of Biology, University of Texas, Tyler, TX, USA.; Science Division, Trinity Valley Community College, Athens, Texas, USA.
Birkhead RD; Alabama Science in Motion, Auburn University, Auburn, AL, USA.
Phillips CA; Illinois Natural History Survey, Prairie Research Institute, University of Illinois, Champaign, IL, USA.
Douglas ME; Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA.
Source:
Molecular ecology resources [Mol Ecol Resour] 2021 Nov; Vol. 21 (8), pp. 2801-2817. Date of Electronic Publication: 2021 Mar 06.
Publication type:
Journal Article
Language:
English
Imprint Name(s):
Original Publication: Oxford, England : Blackwell
MeSH Terms:
Turtles*/genetics
Animals ; Gene Flow ; Machine Learning ; North America ; Phylogeny
Grant Information:
American Turtle Observatory; Endowment: 21st Century Chair in Global Change; TG-BIO160065 NSF-XSEDE Research Allocation; Endowment: Bruker Professorship in Life Sciences; Lucille F. Stickle Fund of the North American Box Turtle Committee; North American Box Turtle Committee
Contributed Indexing:
Keywords: VAE; ddRAD; discordance; filtering; missing data; species tree
Entry Date(s):
Date Created: 20210210 Date Completed: 20211108 Latest Revision: 20211108
Update Code:
20240105
DOI:
10.1111/1755-0998.13350
PMID:
33566450
Scientific journal
Model-based approaches that attempt to delimit species are hampered by computational limitations as well as the unfortunate tendency by users to disregard algorithmic assumptions. Alternatives are clearly needed, and machine-learning (M-L) is attractive in this regard as it functions without the need to explicitly define a species concept. Unfortunately, its performance will vary according to which (of several) bioinformatic parameters are invoked. Herein, we gauge the effectiveness of M-L-based species-delimitation algorithms by parsing 64 variably-filtered versions of a ddRAD-derived SNP data set collected from North American box turtles (Terrapene spp.). Our filtering strategies included: (i) minor allele frequencies (MAF) of 5%, 3%, 1%, and 0% (= none), and (ii) maximum missing data per-individual/per-population at 25%, 50%, 75%, and 100% (= no filtering). We found that species-delimitation via unsupervised M-L impacted the signal-to-noise ratio in our data, as well as the discordance among resolved clades. The latter may also reflect biogeographic history, gene flow, incomplete lineage sorting, or combinations thereof (as corroborated from previously observed patterns of differential introgression). Our results substantiate M-L as a viable species-delimitation method, but also demonstrate how commonly observed patterns of phylogenetic discordance can seriously impact M-L-classification.
(© 2021 John Wiley & Sons Ltd.)
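
As an illustration only, the filtering design described in the abstract can be sketched in Python. The sketch assumes that the 64 data sets arise from crossing the four MAF thresholds with four per-individual and four per-population missing-data thresholds (4 x 4 x 4 = 64), which is one reading consistent with the stated count. The function names, the 0/1/2 genotype encoding, and the use of `None` for missing calls are hypothetical and are not taken from the authors' pipeline.

```python
from itertools import product

# Thresholds quoted in the abstract; 0.0 means no MAF filter,
# 1.00 means no missing-data filter.
MAF_THRESHOLDS = [0.05, 0.03, 0.01, 0.0]
MISSING_THRESHOLDS = [0.25, 0.50, 0.75, 1.00]


def filter_grid():
    """Enumerate every filter combination (assumed 4 x 4 x 4 design)."""
    return [
        {"maf": maf, "max_missing_ind": ind, "max_missing_pop": pop}
        for maf, ind, pop in product(
            MAF_THRESHOLDS, MISSING_THRESHOLDS, MISSING_THRESHOLDS
        )
    ]


def minor_allele_frequency(genotypes):
    """Return the MAF at one SNP.

    genotypes: alternate-allele counts (0, 1 or 2) per diploid
    individual, with None marking a missing call (assumed encoding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def passes_maf(genotypes, maf):
    """Keep a SNP when its MAF reaches the threshold (0.0 keeps all)."""
    return minor_allele_frequency(genotypes) >= maf


def missing_fraction(genotypes):
    """Proportion of missing calls in a list of genotypes."""
    if not genotypes:
        return 0.0
    return sum(g is None for g in genotypes) / len(genotypes)


if __name__ == "__main__":
    grid = filter_grid()
    print(len(grid), "filter combinations")
    print(grid[0])
```

Each dictionary in the grid describes one filtered version of the SNP data set; a real workflow would apply the thresholds to an actual genotype file rather than to Python lists.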