2011, http://dx.doi.org/10.1080/15427528.2011.558767
Genomic selection (GS) has been implemented in animal and plant species, and is regarded as a useful tool for accelerating genetic gains. Varying levels of genomic prediction accuracy have been obtained in plants, depending on the prediction problem assessed and on several other factors, such as trait heritability, the relationship between the individuals to be predicted and those used to train the models for prediction, number of markers, sample size and genotype × environment interaction (GE). The main objective of this article is to describe the results of genomic prediction in the International Maize and Wheat Improvement Center's (CIMMYT's) maize and wheat breeding programs, from the initial assessment of the predictive ability of different models using pedigree and marker information to the present, when methods for implementing GS in practical global maize and wheat breeding programs are being investigated. Results show that pedigree (population structure) accounts for a sizeable proportion of the prediction accuracy when a global population is the prediction problem to be assessed. However, when the prediction uses unrelated populations to train the prediction equations, prediction accuracy becomes negligible. When genomic prediction includes modeling GE, an increase in prediction accuracy can be achieved by borrowing information from correlated environments. Several questions on how to incorporate GS into CIMMYT's maize and wheat programs remain unanswered and subject to further investigation, for example, prediction within and between related biparental crosses. Further research on the quantification of breeding value components for GS in plant breeding populations is required.
Genetics
The availability of dense molecular markers has made possible the use of genomic selection (GS) for plant breeding. However, the evaluation of models for GS in real plant populations is very limited. This article evaluates the performance of parametric and semiparametric models for GS using wheat (Triticum aestivum L.) and maize (Zea mays) data in which different traits were measured in several environmental conditions. The findings, based on extensive cross-validations, indicate that models including marker information had higher predictive ability than pedigree-based models. In the wheat data set, and relative to a pedigree model, gains in predictive ability due to inclusion of markers ranged from 7.7 to 35.7%. Correlation between observed and predictive values in the maize data set achieved values up to 0.79. Estimates of marker effects were different across environmental conditions, indicating that genotype × environment interaction is an important component of genetic variability. These results indicate that GS in plant breeding can be an effective strategy for selecting among lines whose phenotypes have yet to be observed.
2010
The availability of thousands of genome-wide molecular markers has made possible the use of genomic selection in plants and animals. However, the evaluation of models for genomic selection in plant breeding populations is very limited. In this study, we provide an overview of several models for genomic selection, whose predictive ability we investigated using two plant data sets. One data set contains the historical phenotypic records of a series of wheat (Triticum aestivum L.) trials and recently generated genomic data. The other data set pertains to international maize (Zea mays L.) trials in which two disease traits (Exserohilum turcicum and Cercospora zeae-maydis) were measured in maize lines evaluated in five international environments. Results showed that models including marker information yield important gains in predictive ability relative to that of a pedigree-based model, even with a modest number of markers. Estimates of marker effects were different across environment...
Theoretical and Applied Genetics, 2012
Genomic selection is a promising breeding strategy for rapid improvement of complex traits. The objective of our study was to investigate the prediction accuracy of genomic breeding values through cross validation. The study was based on experimental data of six segregating populations from a half-diallel mating design with 788 testcross progenies from an elite maize breeding program. The plants were intensively phenotyped in multilocation field trials and fingerprinted with 960 SNP markers. We used random regression best linear unbiased prediction in combination with fivefold cross validation. The prediction accuracy across populations was higher for grain moisture (0.90) than for grain yield (0.58). The accuracy of genomic selection realized for grain yield corresponds to the precision of phenotyping at unreplicated field trials in 3-4 locations. Because up to three generations per year are feasible in maize, selection gain per unit time is high and, consequently, genomic selection holds great promise for maize breeding programs.
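The combination of random regression BLUP and fivefold cross validation named in this abstract can be sketched as follows. This is a minimal illustration on simulated data, not the study's analysis: the sample sizes, the -1/1 marker coding, and the fixed shrinkage value `lam` are all assumptions (in practice the shrinkage is usually estimated from variance components, for example by REML).

```python
import numpy as np

def ridge_marker_effects(X, y, lam):
    """Solve (X'X + lam*I) b = X'y for centered X and y (ridge / RR-BLUP form)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def kfold_cv_accuracy(X, y, lam, k=5, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cors = []
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        # Center with training-set means only, so the test fold stays unseen.
        mu_x, mu_y = X[train].mean(axis=0), y[train].mean()
        b = ridge_marker_effects(X[train] - mu_x, y[train] - mu_y, lam)
        pred = mu_y + (X[test] - mu_x) @ b
        cors.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(cors))

# Simulated data; sizes are hypothetical and unrelated to the study.
rng = np.random.default_rng(1)
n, p = 300, 200
X = rng.integers(0, 2, size=(n, p)).astype(float) * 2 - 1  # markers coded -1/1
true_b = rng.normal(0.0, 0.1, p)                           # marker effects
g = X @ true_b                                             # genetic values
y = g + rng.normal(0.0, g.std(), n)                        # heritability near 0.5

lam = 200.0  # assumed ratio of residual variance to marker-effect variance
acc = kfold_cv_accuracy(X, y, lam)
print(f"mean fivefold predictive correlation: {acc:.2f}")
```

The returned value is the correlation with observed phenotypes; studies often divide it by the square root of heritability to express accuracy relative to true genetic values.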
Frontiers in plant science, 2017
Genomic selection is being used increasingly in plant breeding to accelerate genetic gain per unit time. One of the most important applications of genomic selection in maize breeding is to predict and select the best un-phenotyped lines in bi-parental populations based on genomic estimated breeding values. In the present study, 22 bi-parental tropical maize populations genotyped with low density SNPs were used to evaluate the genomic prediction accuracy (rMG) of the six trait-environment combinations under various levels of training population size (TPS) and marker density (MD), and assess the effect of trait heritability (h²), TPS and MD on rMG estimation. Our results showed that: (1) moderate rMG values were obtained for different trait-environment combinations, when 50% of the total genotypes were used as training population and ~200 SNPs were used for prediction; (2) rMG increased with an increase in h², TPS and MD; both correlation and variance analyses showed that h² i...
G3 (Bethesda, Md.), 2012
Genomic prediction is expected to considerably increase genetic gains by increasing selection intensity and accelerating the breeding cycle. In this study, marker effects estimated in 255 diverse maize (Zea mays L.) hybrids were used to predict grain yield, anthesis date, and anthesis-silking interval within the diversity panel and testcross progenies of 30 F2-derived lines from each of five populations. Although up to 25% of the genetic variance could be explained by cross validation within the diversity panel, the prediction of testcross performance of F2-derived lines using marker effects estimated in the diversity panel was on average zero. Hybrids in the diversity panel could be grouped into eight breeding populations differing in mean performance. When performance was predicted separately for each breeding population on the basis of marker effects estimated in the other populations, predictive ability was low (i.e., 0.12 for grain yield). These results suggest that predict...
Trends in plant science, 2017
Genomic selection (GS) facilitates the rapid selection of superior genotypes and accelerates the breeding cycle. In this review, we discuss the history, principles, and basis of GS and genomic-enabled prediction (GP) as well as the genetics and statistical complexities of GP models, including genomic genotype×environment (G×E) interactions. We also examine the accuracy of GP models and methods for two cereal crops and two legume crops based on random cross-validation. GS applied to maize breeding has shown tangible genetic gains. Based on GP results, we speculate how GS in germplasm enhancement (i.e., prebreeding) programs could accelerate the flow of genes from gene bank accessions to elite lines. Recent advances in hyperspectral image technology could be combined with GS and pedigree-assisted breeding.
Advances in Agronomy, 2011
"Genomic selection," the ability to select for even complex, quantitative traits based on marker data alone, has arisen from the conjunction of new high-throughput marker technologies and new statistical methods needed to analyze the data. This review surveys what is known about these technologies, with sections on population and quantitative genetic background, DNA marker development, statistical methods, reported accuracies of genomic selection (GS) predictions, prediction of nonadditive genetic effects, prediction in the presence of subpopulation structure, and impacts of GS on long-term gain. GS works by estimating the effects of many loci spread across the genome. Marker and observation numbers therefore need to scale with the genetic map length in Morgans and with the effective population size of the population under GS. For typical crops, the requirements range from at least 200 to at most 10,000 markers and observations. With that baseline, GS can greatly accelerate the breeding cycle while also using marker information to maintain genetic diversity and potentially prolong gain beyond what is possible with phenotypic selection. With the costs of marker technologies continuing to decline and the statistical methods becoming more routine, the results reviewed here suggest that GS will play a large role in the plant breeding of the future. Our summary and interpretation should prove useful to breeders as they assess the value of GS in the context of their populations and resources.
The Plant Genome, 2015
Prediction accuracy of genomic selection (GS) has been previously evaluated through simulation and cross-validation; however, validation based on progeny performance in a plant breeding program has not been investigated thoroughly. We evaluated several prediction models in a dynamic barley breeding population comprised of 647 six-row lines using four traits differing in genetic architecture and 1536 single nucleotide polymorphism (SNP) markers. The breeding lines were divided into six sets designated as one parent set and five consecutive progeny sets comprised of representative samples of breeding lines over a 5-yr period. We used these data sets to investigate the effect of model and training population composition on prediction accuracy over time. We found little difference in prediction accuracy among the models confirming prior studies that found the simplest model, random regression best linear unbiased prediction (RR-BLUP), to be accurate across a range of situations. In general, we found that using the parent set was sufficient to predict progeny sets with little to no gain in accuracy from generating larger training populations by combining the parent set with subsequent progeny sets. The prediction accuracy ranged from 0.03 to 0.99 across the four traits and five progeny sets. We explored characteristics of the training and validation populations (marker allele frequency, population structure, and linkage disequilibrium, LD) as well as characteristics of the trait (genetic architecture and heritability, H²). Fixation of markers associated with a trait over time was most clearly associated with reduced prediction accuracy for the mycotoxin trait DON. Higher trait H² in the training population and simpler trait architecture were associated with greater prediction accuracy. Genomic selection is touted as a marker-based breeding approach that complements traditional marker-assisted selection (MAS) and phenotypic selection.
In traditional MAS, favorable alleles or genes for relatively simply inherited traits are mapped and then molecular markers linked to those alleles are used to select individuals to use as parents or to advance from segregating breeding populations (Bernardo, 2008). Marker-assisted selection is more effective than phenotypic selection if the tagged loci account for a large portion of the total genetic variation within the population of selection candidates (Collins et al., 2003; Castro et al., 2003; Xu and Crouch, 2008). The limitation of traditional MAS for highly complex traits is that it captures only a small portion of the total genetic variation because it uses a limited number of selected markers (Lande and Thompson, 1990; Bernardo, 2010). Phenotypic selection is effective on quantitative traits, but is limited to stages in breeding cycles and environments where such traits can be measured effectively, such as for advanced lines in multiple location field trials. Therefore, GS can be strategically implemented in
Plant Breeding, 2013
Genomic selection (GS) is a promising alternative to marker-assisted selection particularly for quantitative traits. In this study, we examined the prediction accuracy of genomic breeding values by using ridge regression best linear unbiased prediction in combination with fivefold cross-validation based on empirical data of a commercial maize breeding programme. The empirical data is composed of 930 testcross progenies derived from 11 segregating families evaluated at six environments for grain yield and grain moisture. Accuracy to predict genomic breeding values was affected by the choice of the shrinkage parameter λ², by unbalanced family size, by size of the training population and to a lower extent by the number of markers. Accuracy of genomic breeding values was high, suggesting that the selection gain can be improved by implementing GS in elite maize breeding programmes.
Genotyping-by-sequencing (GBS) technologies have proven capacity for delivering large numbers of marker genotypes with potentially less ascertainment bias than standard SNP arrays. Therefore, GBS has become an attractive alternative technology for genomic selection. However, the use of GBS data poses important challenges and the accuracy of genomic prediction using GBS is currently under investigation in several crops, including maize, wheat, and cassava. The main objective of this study was to evaluate various methods for incorporating GBS information and compare them with pedigree models for predicting genetic values of lines from two maize populations evaluated for different traits measured in different environments (Experiments 1 and
The Plant Genome Journal, 2011
Genomic selection (GS) uses genome-wide molecular marker data to predict the genetic value of selection candidates in breeding programs. In plant breeding, the ability to produce large numbers of progeny per cross allows GS to be conducted within each family. However, this approach requires phenotypes of lines from each cross before conducting GS. This will prolong the selection cycle and may result in lower gains per year than approaches that estimate marker-effects with multiple families from previous selection cycles. In this study, phenotypic selection (PS), conventional marker-assisted selection (MAS), and GS prediction accuracy were compared for 13 agronomic traits in a population of 374 winter wheat (Triticum aestivum L.) advanced-cycle breeding lines. A cross-validation approach that trained and validated prediction accuracy across years was used to evaluate effects of model selection, training population size, and marker density in the presence of genotype × environment interactions (G×E). The average prediction accuracies using GS were 28% greater than with MAS and were 95% as accurate as PS. For net merit, the average accuracy across six selection indices for GS was 14% greater than for PS. These results provide empirical evidence that multifamily GS could increase genetic gain per unit time and cost in plant breeding.
Theoretical and Applied Genetics
Key message: Historical data from breeding programs can be efficiently used to improve genomic selection accuracy, especially when the training set is optimized to subset individuals most informative of the target testing set.
Abstract: The current strategy for large-scale implementation of genomic selection (GS) at the International Maize and Wheat Improvement Center (CIMMYT) global maize breeding program has been to train models using information from full-sibs in a “test-half-predict-half approach.” Although effective, this approach has limitations, as it requires large full-sib populations and limits the ability to shorten variety testing and breeding cycle times. The primary objective of this study was to identify optimal experimental and training set designs to maximize prediction accuracy of GS in CIMMYT’s maize breeding programs. Training set (TS) design strategies were evaluated to determine the most efficient use of phenotypic data collected on relatives for genomic predicti...
2021
Reductions of genotyping marker density have been extensively evaluated as potential strategies to reduce the genotyping costs of genomic selection (GS). Low-density marker panels are appealing in GS because they entail lower multicollinearity and computational time-consumption and allow more individuals to be genotyped for the same cost. However, statistical models used in GS are usually evaluated with empirical data, using "static" training sets and populations. This may be adequate for making predictions during a breeding program's initial cycles, but not for the long term. Moreover, to the best of our knowledge, no GS models consider the effect of dominance, which is particularly important for breeding outcomes in cross-pollinated crops. Hence, dominance effects are an important and unexplored issue in GS for long-term programs involving allogamous species. To address it, we employed two approaches: analysis of empirical maize datasets and simulations of long-term ...
G3 (Bethesda, Md.), 2015
Genomic Selection (GS) models use genome-wide genetic information to predict genetic values of candidates of selection. Originally, these models were developed without considering genotype × environment interaction (G×E). Several authors have proposed extensions of the single-environment GS model that accommodate G×E using either co-variance functions or environmental covariates. In this study, we model G×E using a marker × environment interaction (M×E) GS model; the approach is conceptually simple and can be implemented using existing GS software. We discuss how the model can be implemented by using an explicit regression of phenotypes on markers or using co-variance structures (a GBLUP-type model). We used the M×E model to analyze three CIMMYT wheat data sets (W1, W2, and W3), where over 1,000 lines were genotyped using genotyping-by-sequencing and evaluated at CIMMYT's research station in Ciudad Obregon, Mexico, under simulated environmental conditions that covered different ...
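The two implementations mentioned in this abstract, an explicit regression on markers and a covariance (GBLUP-type) structure, can be illustrated with a small sketch. It assumes the same lines are genotyped once and evaluated in every environment, uses simulated -1/1 markers, and gives every environment the same interaction variance; the names `s_main` and `s_int` are hypothetical and the truncated abstract does not confirm these modeling details.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lines, p, n_env = 20, 50, 2
X = rng.choice([-1.0, 1.0], size=(n_lines, p))  # simulated marker matrix

# Regression form: one block of marker effects shared by all environments,
# plus one block of environment-specific deviations per environment.
X_main = np.vstack([X] * n_env)                 # (n_env*n_lines, p)
X_int = np.kron(np.eye(n_env), X)               # block-diagonal, (n_env*n_lines, n_env*p)
Z = np.hstack([X_main, X_int])

# Covariance form: the same model written with genomic relationship blocks.
G = X @ X.T
K_main = np.kron(np.ones((n_env, n_env)), G)    # shared across environments
K_int = np.kron(np.eye(n_env), G)               # within-environment only

# With unit variances, the two forms imply the same covariance of genetic values.
forms_agree = np.allclose(Z @ Z.T, K_main + K_int)

# Assumed variance components (hypothetical values): the implied genetic
# correlation of one line across two environments is s_main / (s_main + s_int).
s_main, s_int = 0.6, 0.4
K_total = s_main * K_main + s_int * K_int
cross_env_corr = K_total[0, n_lines] / K_total[0, 0]
print(forms_agree, round(float(cross_env_corr), 2))
```

Under these assumptions the cross-environment covariance comes only from the shared block, so the model can borrow information between environments only when they are positively correlated.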
Theoretical and Applied Genetics, 2011
This is the first large-scale experimental study on genome-based prediction of testcross values in an advanced cycle breeding population of maize. The study comprised testcross progenies of 1,380 doubled haploid lines of maize derived from 36 crosses and phenotyped for grain yield and grain dry matter content in seven locations. The lines were genotyped with 1,152 single nucleotide polymorphism markers. Pedigree data were available for three generations. We used best linear unbiased prediction and stratified cross-validation to evaluate the performance of prediction models differing in the modeling of relatedness between inbred lines and in the calculation of genome-based coefficients of similarity. The choice of similarity coefficient did not affect prediction accuracies. Models including genomic information yielded significantly higher prediction accuracies than the model based on pedigree information alone. Average prediction accuracies based on genomic data were high even for a complex trait like grain yield (0.72-0.74) when the cross-validation scheme allowed for a high degree of relatedness between the estimation and the test set. When predictions were performed across distantly related families, prediction accuracies decreased significantly (0.47-0.48). Prediction accuracies decreased with decreasing sample size but were still high when the population size was halved (0.67-0.69). The results from this study are encouraging with respect to genome-based prediction of the genetic value of untested lines in advanced cycle breeding populations and the implementation of genomic selection in the breeding process.
The Plant Genome
Many important traits in plant breeding are polygenic and therefore recalcitrant to traditional marker-assisted selection. Genomic selection addresses this complexity by including all markers in the prediction model. A key method for the genomic prediction of breeding values is ridge regression (RR), which is equivalent to best linear unbiased prediction (BLUP) when the genetic covariance between lines is proportional to their similarity in genotype space. This additive model can be broadened to include epistatic effects by using other kernels, such as the Gaussian, which represent inner products in a complex feature space. To facilitate the use of RR and nonadditive kernels in plant breeding, a new software package for R called rrBLUP has been developed. At its core is a fast maximum-likelihood algorithm for mixed models with a single variance component besides the residual error, which allows for efficient prediction with unreplicated training data. Use of the rrBLUP software is demonstrated through several examples, including the identification of optimal crosses based on superior progeny value. In cross-validation tests, the prediction accuracy with nonadditive kernels was significantly higher than RR for wheat (Triticum aestivum L.) grain yield but equivalent for several maize (Zea mays L.) traits.
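The equivalence between ridge regression and BLUP stated in this abstract can be checked numerically. The sketch below uses Python with NumPy rather than the rrBLUP R package, with simulated markers and a fixed shrinkage value `lam` (both assumptions). The Gaussian kernel at the end uses a distance scaled by its maximum and a hypothetical bandwidth `theta`; it illustrates the idea and is not claimed to match rrBLUP's exact definition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 50, 120, 30.0
X = rng.choice([-1.0, 0.0, 1.0], size=(n, p))  # simulated marker scores
X -= X.mean(axis=0)                            # center markers
y = rng.normal(size=n)
y -= y.mean()                                  # center phenotypes

# Ridge regression: estimate marker effects, then sum them per line.
b = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
g_rr = X @ b

# BLUP-type kernel form: covariance between lines proportional to K = XX'.
K = X @ X.T
g_blup = K @ np.linalg.solve(K + lam * np.eye(n), y)

same_predictions = np.allclose(g_rr, g_blup)

def gaussian_kernel(X, theta):
    """exp(-(d/theta)^2) with Euclidean distance d scaled to [0, 1] (assumed form)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    d2 = d2 / d2.max()
    return np.exp(-d2 / theta ** 2)

# Swapping the kernel changes the covariance model, not the prediction formula.
K_gauss = gaussian_kernel(X, theta=0.5)
g_gauss = K_gauss @ np.linalg.solve(K_gauss + np.eye(n), y)
print(same_predictions, g_gauss.shape)
```

The kernel form solves an n × n system instead of a p × p one, which is the cheaper route whenever markers outnumber lines.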
Genetics, 2014
The efficiency of marker-assisted prediction of phenotypes has been studied intensively for different types of plant breeding populations. However, one remaining question is how to incorporate and counterbalance information from biparental and multiparental populations into model training for genome-wide prediction. To address this question, we evaluated testcross performance of 1652 doubled-haploid maize (Zea mays L.) lines that were genotyped with 56,110 single nucleotide polymorphism markers and phenotyped for five agronomic traits in four to six European environments. The lines are arranged in two diverse half-sib panels representing two major European heterotic germplasm pools. The data set contains 10 related biparental dent families and 11 related biparental flint families generated from crosses of maize lines important for European maize breeding. With this new data set we analyzed genome-based best linear unbiased prediction in different validation schemes and compositions of estimation and test sets. Further, we theoretically and empirically investigated marker linkage phases across multiparental populations. In general, predictive abilities similar to or higher than those within biparental families could be achieved by combining several half-sib families in the estimation set. For the majority of families, 375 half-sib lines in the estimation set were sufficient to reach the same predictive performance of biomass yield as an estimation set of 50 full-sib lines. In contrast, prediction across heterotic pools was not possible for most cases. Our findings are important for experimental design in genome-based prediction as they provide guidelines for the genetic structure and required sample size of data sets used for model training.
[...] the biparental family from which single nucleotide polymorphism (SNP) effects were derived. Thus, as for QTL mapping, similar arguments in favor of multiparental populations hold in the context of genome-based prediction.
The Plant Genome, 2017
The single most important decision in plant breeding programs is the selection of appropriate crosses. The ideal cross would provide superior predicted progeny performance and enough diversity to maintain genetic gain. The aim of this study was to compare the best crosses predicted using combinations of mid-parent value and variance prediction accounting for linkage disequilibrium (V_LD) or assuming linkage equilibrium (V_LE). After predicting the mean and the variance of each cross, we selected crosses based on mid-parent value, the top 10% of the progeny, and weighted mean and variance within progenies for grain yield, grain protein content, mixing time, and loaf volume in two applied wheat (Triticum aestivum L.) breeding programs: Instituto Nacional de Investigación Agropecuaria (INIA) Uruguay and CIMMYT Mexico. Although the variance of the progeny is important to increase the chances of finding superior individuals from transgressive segregation, we observed that the mid-parent values of the crosses drove the genetic gain but the variance of the progeny had a small impact on genetic gain for grain yield. However, the relative importance of the variance of the progeny was larger for quality traits. Overall, the genomic resources and the statistical models are now available to plant breeders to predict both the performance of breeding lines per se as well as the value of progeny from any potential crosses. The main objective of plant breeding is to increase the yield, productivity, adaptation, and quality of crops while optimizing resource use (Allard 1960). Genetic gain in plant breeding is accomplished through the selection of best genetic combinations between
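How a cross's predicted mean and variance can change its ranking is easy to see in a short sketch. It assumes progeny values are normally distributed around the mid-parent value, so the expected mean of the top fraction equals the mean plus the selection intensity times the progeny standard deviation. The cross names and numbers are hypothetical, and this is not presented as the study's own selection criterion.

```python
from statistics import NormalDist

def selection_intensity(p_selected):
    """Standardized mean of the top fraction p_selected of a normal distribution."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - p_selected)  # truncation point
    return nd.pdf(z) / p_selected

def expected_top_mean(mid_parent, progeny_sd, p_selected=0.10):
    """Expected mean of the best p_selected of progeny, assuming normality."""
    return mid_parent + selection_intensity(p_selected) * progeny_sd

# Hypothetical crosses: (mid-parent value, predicted progeny standard deviation)
crosses = {
    "A x B": (5.0, 0.20),
    "C x D": (4.9, 0.40),
}

best_by_mean = max(crosses, key=lambda c: crosses[c][0])
best_by_top10 = max(crosses, key=lambda c: expected_top_mean(*crosses[c]))
i10 = selection_intensity(0.10)
print(best_by_mean, best_by_top10, round(i10, 3))
```

With these made-up values the cross with the lower mid-parent value ranks first on the top 10% criterion because its progeny variance is larger; when variances differ little between crosses, both criteria give the same ranking.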