Title:
Sample size determination for training set optimization in genomic prediction.
Authors:
Wu PY; Department of Agronomy, National Taiwan University, Taipei, Taiwan.; Institute for Quantitative Genetics and Genomics of Plants, Heinrich Heine University, Düsseldorf, Germany.
Ou JH; Department of Agronomy, National Taiwan University, Taipei, Taiwan.; Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden.
Liao CT; Department of Agronomy, National Taiwan University, Taipei, Taiwan.
Source:
TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik [Theor Appl Genet] 2023 Mar 13; Vol. 136 (3), pp. 57. Date of Electronic Publication: 2023 Mar 13.
Publication type:
Journal Article
Language:
English
Imprint Name(s):
Original Publication: Berlin, New York, Springer
MeSH Terms:
Models, Genetic*
Quantitative Trait Loci*
Animals ; Sample Size ; Plant Breeding/methods ; Genotype ; Phenotype ; Genomics/methods
Grant Information:
MOST 110-2118-M-002-002-MY2 Ministry of Science and Technology, Taiwan
Entry Date(s):
Date Created: 20230313 Date Completed: 20230315 Latest Revision: 20230402
Update Code:
20240104
PubMed Central ID:
PMC10011335
DOI:
10.1007/s00122-023-04254-9
PMID:
36912999
Scientific journal
Key Message: A practical approach is developed to determine a cost-effective optimal training set for selective phenotyping in a genomic prediction study. An R function is provided to facilitate the application of the approach.

Genomic prediction (GP) is a statistical method used to select quantitative traits in animal or plant breeding. For this purpose, a statistical prediction model is first built that uses phenotypic and genotypic data in a training set. The trained model is then used to predict genomic estimated breeding values (GEBVs) for individuals within a breeding population. Setting the sample size of the training set usually takes into account time and space constraints that are inevitable in an agricultural experiment. However, the determination of the sample size remains an unresolved issue for a GP study. By applying the logistic growth curve to identify prediction accuracy for the GEBVs and the training set size, a practical approach was developed to determine a cost-effective optimal training set for a given genome dataset with known genotypic data. Three real genome datasets were used to illustrate the proposed approach. An R function is provided to facilitate widespread application of this approach to sample size determination, which can help breeders to identify a set of genotypes with an economical sample size for selective phenotyping.
(© 2023. The Author(s).)
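
The abstract's idea of relating prediction accuracy to training set size through a logistic growth curve can be illustrated with a minimal sketch. This is not the authors' R function. The three-parameter form a / (1 + b·exp(−c·n)), the grid-search fitting strategy, the rule of stopping at 95% of the asymptotic accuracy, and all function names (`logistic`, `fit_logistic`, `size_for_fraction`) are assumptions made for illustration, and the data are synthetic.

```python
import math


def logistic(n, a, b, c):
    """Three-parameter logistic curve; a is the asymptotic accuracy."""
    return a / (1.0 + b * math.exp(-c * n))


def fit_logistic(sizes, accs, grid_steps=400):
    """Fit (a, b, c) by grid search over a (an assumed fitting strategy).

    For a fixed a, ln(a / y - 1) = ln(b) - c * n is linear in n, so b and c
    come from ordinary least squares. Requires 0 < y < 1 for every accuracy.
    """
    lo, hi = max(accs) + 1e-6, 1.0
    n_mean = sum(sizes) / len(sizes)
    sxx = sum((n - n_mean) ** 2 for n in sizes)
    best = None
    for i in range(grid_steps + 1):
        a = lo + (hi - lo) * i / grid_steps
        z = [math.log(a / y - 1.0) for y in accs]
        z_mean = sum(z) / len(z)
        slope = sum((n - n_mean) * (zi - z_mean) for n, zi in zip(sizes, z)) / sxx
        b, c = math.exp(z_mean - slope * n_mean), -slope
        # Score each candidate on the original accuracy scale.
        sse = sum((logistic(n, a, b, c) - y) ** 2 for n, y in zip(sizes, accs))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1], best[2], best[3]


def size_for_fraction(b, c, fraction=0.95):
    """Real-valued n at which the curve reaches `fraction` of its asymptote."""
    return math.log(b * fraction / (1.0 - fraction)) / c


# Synthetic, noise-free accuracies (illustrative only, not data from the paper).
sizes = list(range(25, 301, 25))
accs = [logistic(n, 0.8, 4.0, 0.02) for n in sizes]
a_hat, b_hat, c_hat = fit_logistic(sizes, accs)
n_star = math.ceil(size_for_fraction(b_hat, c_hat))
print(f"asymptote={a_hat:.3f}, suggested training set size={n_star}")
```

In practice the accuracies would come from cross-validated or simulated prediction accuracy at each candidate training set size, and the stopping fraction would reflect the phenotyping budget. A dedicated nonlinear least-squares routine would likely be preferable to the coarse grid search used here.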