The Plant Genome
Many important traits in plant breeding are polygenic and therefore recalcitrant to traditional marker-assisted selection. Genomic selection addresses this complexity by including all markers in the prediction model. A key method for the genomic prediction of breeding values is ridge regression (RR), which is equivalent to best linear unbiased prediction (BLUP) when the genetic covariance between lines is proportional to their similarity in genotype space. This additive model can be broadened to include epistatic effects by using other kernels, such as the Gaussian, which represent inner products in a complex feature space. To facilitate the use of RR and nonadditive kernels in plant breeding, a new software package for R called rrBLUP has been developed. At its core is a fast maximum-likelihood algorithm for mixed models with a single variance component besides the residual error, which allows for efficient prediction with unreplicated training data. Use of the rrBLUP software is demonstrated through several examples, including the identification of optimal crosses based on superior progeny value. In cross-validation tests, the prediction accuracy with nonadditive kernels was significantly higher than RR for wheat (Triticum aestivum L.) grain yield but equivalent for several maize (Zea mays L.) traits.
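The equivalence between ridge regression on markers and a kernel (BLUP-type) prediction with a linear kernel can be checked numerically. The sketch below is not the rrBLUP implementation: it is plain Python, the shrinkage parameter `lam` is fixed by hand rather than estimated by maximum likelihood, the phenotypes are assumed to be centered, and the marker matrix `Z`, the values in `y`, and the Gaussian kernel parametrization exp(-(d/theta)^2) are illustrative assumptions.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_fitted(Z, y, lam):
    """Fitted values via marker effects u = (Z'Z + lam*I)^-1 Z'y."""
    m = len(Z[0])
    lhs = [[sum(row[i] * row[j] for row in Z) + (lam if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    rhs = [sum(row[i] * yi for row, yi in zip(Z, y)) for i in range(m)]
    u = solve(lhs, rhs)
    return [sum(z * ui for z, ui in zip(row, u)) for row in Z]


def kernel_fitted(K, y, lam):
    """Fitted values via the kernel form K (K + lam*I)^-1 y."""
    n = len(K)
    lhs = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    alpha = solve(lhs, y)
    return [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]


def linear_kernel(Z):
    """Additive similarity: inner products of marker scores (Z Z')."""
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in Z] for zi in Z]


def gaussian_kernel(Z, theta):
    """One common Gaussian kernel form: exp(-(d/theta)^2), d = Euclidean distance."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(zi, zj)) / theta ** 2)
             for zj in Z] for zi in Z]


# Hypothetical data: 4 lines x 3 markers coded -1/0/1, centered phenotypes.
Z = [[1, 0, -1], [0, 1, 1], [-1, -1, 0], [1, 1, 1]]
y = [0.5, -0.2, -0.8, 0.5]

rr = ridge_fitted(Z, y, lam=1.0)
blup = kernel_fitted(linear_kernel(Z), y, lam=1.0)
nonadditive = kernel_fitted(gaussian_kernel(Z, theta=2.0), y, lam=1.0)
print(rr)
print(blup)
```

The first two printed vectors agree up to rounding, because Z(Z'Z + lam*I)^-1 Z'y equals ZZ'(ZZ' + lam*I)^-1 y. Swapping the linear kernel for the Gaussian one changes only the similarity matrix, not the prediction formula.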
The Plant Genome Journal, 2011
Genomic selection (GS) uses genome-wide molecular marker data to predict the genetic value of selection candidates in breeding programs. In plant breeding, the ability to produce large numbers of progeny per cross allows GS to be conducted within each family. However, this approach requires phenotypes of lines from each cross before conducting GS. This will prolong the selection cycle and may result in lower gains per year than approaches that estimate marker-effects with multiple families from previous selection cycles. In this study, phenotypic selection (PS), conventional marker-assisted selection (MAS), and GS prediction accuracy were compared for 13 agronomic traits in a population of 374 winter wheat (Triticum aestivum L.) advanced-cycle breeding lines. A cross-validation approach that trained and validated prediction accuracy across years was used to evaluate effects of model selection, training population size, and marker density in the presence of genotype × environment interactions (G×E). The average prediction accuracies using GS were 28% greater than with MAS and were 95% as accurate as PS. For net merit, the average accuracy across six selection indices for GS was 14% greater than for PS. These results provide empirical evidence that multifamily GS could increase genetic gain per unit time and cost in plant breeding.
Journal of Crop Improvement, 2011
Genomic selection (GS) has been implemented in animal and plant species, and is regarded as a useful tool for accelerating genetic gains. Varying levels of genomic prediction accuracy have been obtained in plants, depending on the prediction problem assessed and on several other factors, such as trait heritability, the relationship between the individuals to be predicted and those used to train the models for prediction, number of markers, sample size and genotype × environment interaction (G×E). The main objective of this article is to describe the results of genomic prediction in International Maize and Wheat Improvement Center's (CIMMYT's) maize and wheat breeding programs, from the initial assessment of the predictive ability of different models using pedigree and marker information to the present, when methods for implementing GS in practical global maize and wheat breeding programs are being studied and investigated. Results show that pedigree (population structure) accounts for a sizeable proportion of the prediction accuracy when a global population is the prediction problem to be assessed. However, when the prediction uses unrelated populations to train the prediction equations, prediction accuracy becomes negligible. When genomic prediction includes modeling G×E, an increase in prediction accuracy can be achieved by borrowing information from correlated environments. Several questions on how to incorporate GS into CIMMYT's maize and wheat programs remain unanswered and subject to further investigation, for example, prediction within and between related biparental crosses. Further research on the quantification of breeding value components for GS in plant breeding populations is required.
Genetics
The availability of dense molecular markers has made possible the use of genomic selection (GS) for plant breeding. However, the evaluation of models for GS in real plant populations is very limited. This article evaluates the performance of parametric and semiparametric models for GS using wheat (Triticum aestivum L.) and maize (Zea mays) data in which different traits were measured in several environmental conditions. The findings, based on extensive cross-validations, indicate that models including marker information had higher predictive ability than pedigree-based models. In the wheat data set, and relative to a pedigree model, gains in predictive ability due to inclusion of markers ranged from 7.7 to 35.7%. Correlation between observed and predictive values in the maize data set achieved values up to 0.79. Estimates of marker effects were different across environmental conditions, indicating that genotype × environment interaction is an important component of genetic variability. These results indicate that GS in plant breeding can be an effective strategy for selecting among lines whose phenotypes have yet to be observed.
2010
The availability of thousands of genome-wide molecular markers has made possible the use of genomic selection in plants and animals. However, the evaluation of models for genomic selection in plant breeding populations is very limited. In this study, we provide an overview of several models for genomic selection, whose predictive ability we investigated using two plant data sets. One data set contains the historical phenotypic records of a series of wheat (Triticum aestivum L.) trials and recently generated genomic data. The other data set pertains to international maize (Zea mays L.) trials in which two disease traits (Exserohilum turcicum and Cercospora zeae-maydis) were measured in maize lines evaluated in five international environments. Results showed that models including marker information yield important gains in predictive ability, relative to that of a pedigree-based model, even with a modest number of markers. Estimates of marker effects were different across environment...
G3 (Bethesda, Md.), 2016
This study examines genomic prediction within 8416 Mexican landrace accessions and 2403 Iranian landrace accessions stored in gene banks. The Mexican and Iranian collections were evaluated in separate field trials, including an optimum environment for several traits, and in two separate environments (drought, D and heat, H) for the highly heritable traits, days to heading (DTH) and days to maturity (DTM). Analyses accounting and not accounting for population structure were performed. Genomic prediction models include genotype × environment interaction (G×E). Two alternative prediction strategies were studied: (1) random cross-validation of the data in 20% training (TRN) and 80% testing (TST) (TRN20-TST80) sets, and (2) two types of core sets, "diversity" and "prediction", including 10% and 20%, respectively, of the total collections were formed. Accounting for population structure decreased prediction accuracy by 15%-20% as compared to prediction accuracy obtaine...
Pesquisa Agropecuária Tropical
Mixed models and multivariate analysis are powerful tools for selecting superior genotypes in plant breeding programs. The BLUP (best linear unbiased prediction) method has been used to predict genetic values without environmental effects. Furthermore, the FAI-BLUP (ideotype-design index) procedure is especially valuable for plant breeding because of multiple-trait selection. This study aimed to determine the genetic potential of advanced wheat generations using REML/BLUP in combination with multivariate techniques for the selection of superior genotypes. The experiment consisted of eleven wheat (Triticum aestivum L.) genotypes. The experimental design was randomized blocks, with three replications. Plant height, spike insertion height, number of tillers, number of spikelets, kernel width, hectoliter weight and kernel weight per plant were determined. The genetic parameters were estimated using the REML/BLUP methodology, and the FAI-BLUP index was calculated using predicted genetic ...
Plant Breeding, 2013
Genomic selection (GS) is a promising alternative to marker-assisted selection, particularly for quantitative traits. In this study, we examined the prediction accuracy of genomic breeding values by using ridge regression best linear unbiased prediction in combination with fivefold cross-validation based on empirical data of a commercial maize breeding programme. The empirical data is composed of 930 testcross progenies derived from 11 segregating families evaluated at six environments for grain yield and grain moisture. Accuracy to predict genomic breeding values was affected by the choice of the shrinkage parameter λ², by unbalanced family size, by size of the training population and to a lower extent by the number of markers. Accuracy of genomic breeding values was high, suggesting that the selection gain can be improved by implementing GS in elite maize breeding programmes.
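As a rough illustration of how fivefold cross-validation yields a prediction accuracy for ridge regression, the following sketch uses simulated data in plain Python. It is not the study's pipeline: the marker matrix, marker effects, noise level, random seeds, and the fixed shrinkage value `lam` are all assumptions made for the example, and accuracy is taken here as the Pearson correlation between observed and predicted values pooled over the held-out folds.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_effects(Z, y, lam):
    """Marker effects u = (Z'Z + lam*I)^-1 Z'y for centered phenotypes y."""
    m = len(Z[0])
    lhs = [[sum(row[i] * row[j] for row in Z) + (lam if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    rhs = [sum(row[i] * yi for row, yi in zip(Z, y)) for i in range(m)]
    return solve(lhs, rhs)


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((z - mb) ** 2 for z in b)
    return cov / (va * vb) ** 0.5


def cv_accuracy(Z, y, lam, k=5, seed=1):
    """Pooled k-fold accuracy: correlation of held-out predictions with observations."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    observed, predicted = [], []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        mean_y = sum(y[i] for i in train) / len(train)
        u = ridge_effects([Z[i] for i in train],
                          [y[i] - mean_y for i in train], lam)
        for i in test:
            observed.append(y[i])
            predicted.append(mean_y + sum(z * ui for z, ui in zip(Z[i], u)))
    return pearson(observed, predicted)


# Simulated data (hypothetical): 40 lines, 10 markers coded -1/1.
rng = random.Random(42)
n_lines, n_markers = 40, 10
Z = [[rng.choice((-1, 1)) for _ in range(n_markers)] for _ in range(n_lines)]
effects = [rng.gauss(0, 1) for _ in range(n_markers)]
y = [sum(z * e for z, e in zip(row, effects)) + rng.gauss(0, 0.1) for row in Z]

accuracy = cv_accuracy(Z, y, lam=1.0)
print(round(accuracy, 3))
```

Rerunning `cv_accuracy` over a grid of `lam` values or with fewer training lines is one simple way to see the kind of sensitivity to the shrinkage parameter and training population size that the abstract reports.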
Plant Breeding, 2020
Genomic selection has been adopted in many plant breeding programmes. In this paper, we cover some aspects of information necessary before starting genomic selection. Spring oat and barley breeding data sets from commercial breeding programmes were studied using single, multitrait and trait-assisted models for predicting grain yield. Heritabilities were higher when estimated using multitrait models compared to single-trait models. However, no corresponding increase in prediction accuracy was observed in a cross-validation scenario. On the other hand, forward prediction showed a slight, but not significant, increase in accuracy of genomic estimated breeding values for breeding cohorts when a multitrait model was applied. When a correlated trait was used in a trait-assisted model, on average the accuracies increased by 9%-14% for oat and by 11%-28% for barley compared with a single-trait model. Overall, accuracies in forward validation varied between breeding cohorts and years for grain yield. Forward prediction accuracies for multiple cohorts and multiple years' data are reported for oat for the first time. Keywords: barley, commercial breeding programme, genomic prediction, grain yield, multitrait model, oat. How to cite this article: Haikka H, Knürr T, Manninen O, et al. Genomic prediction of grain yield in commercial Finnish oat (Avena sativa) and barley (Hordeum vulgare) breeding programmes.
The Plant Genome, 2017
The single most important decision in plant breeding programs is the selection of appropriate crosses. The ideal cross would provide superior predicted progeny performance and enough diversity to maintain genetic gain. The aim of this study was to compare the best crosses predicted using combinations of mid-parent value and variance prediction accounting for linkage disequilibrium (V_LD) or assuming linkage equilibrium (V_LE). After predicting the mean and the variance of each cross, we selected crosses based on mid-parent value, the top 10% of the progeny, and weighted mean and variance within progenies for grain yield, grain protein content, mixing time, and loaf volume in two applied wheat (Triticum aestivum L.) breeding programs: Instituto Nacional de Investigación Agropecuaria (INIA) Uruguay and CIMMYT Mexico. Although the variance of the progeny is important to increase the chances of finding superior individuals from transgressive segregation, we observed that the mid-parent values of the crosses drove the genetic gain but the variance of the progeny had a small impact on genetic gain for grain yield. However, the relative importance of the variance of the progeny was larger for quality traits. Overall, the genomic resources and the statistical models are now available to plant breeders to predict both the performance of breeding lines per se as well as the value of progeny from any potential crosses. The main objective of plant breeding is to increase the yield, productivity, adaptation, and quality of crops while optimizing resource use (Allard 1960). Genetic gain in plant breeding is accomplished through the selection of best genetic combinations between
G3 (Bethesda, Md.), 2015
Genomic Selection (GS) models use genome-wide genetic information to predict genetic values of candidates of selection. Originally, these models were developed without considering genotype × environment interaction (G×E). Several authors have proposed extensions of the single-environment GS model that accommodate G×E using either co-variance functions or environmental covariates. In this study, we model G×E using a marker × environment interaction (M×E) GS model; the approach is conceptually simple and can be implemented using existing GS software. We discuss how the model can be implemented by using an explicit regression of phenotypes on markers or using co-variance structures (a GBLUP-type model). We used the M×E model to analyze three CIMMYT wheat data sets (W1, W2, and W3), where over 1,000 lines were genotyped using genotyping-by-sequencing and evaluated at CIMMYT's research station in Ciudad Obregon, Mexico, under simulated environmental conditions that covered different ...
Frontiers in Plant Science, 2022
We investigated increasing genetic gain for grain yield using early generation genomic selection (GS). A training set of 1,334 elite wheat breeding lines tested over three field seasons was used to generate Genomic Estimated Breeding Values (GEBVs) for grain yield under irrigated conditions applying markers and three different prediction methods: (1) Genomic Best Linear Unbiased Predictor (GBLUP), (2) GBLUP with the imputation of missing genotypic data by Ridge Regression BLUP (rrGBLUP_imp), and (3) Reproducing Kernel Hilbert Space (RKHS) a.k.a. Gaussian Kernel (GK). F2 GEBVs were generated for 1,924 individuals from 38 biparental cross populations between 21 parents selected from the training set. Results showed that F2 GEBVs from the different methods were not correlated. Experiment 1 consisted of selecting F2s with the highest average GEBVs and advancing them to form genomically selected bulks and make intercross populations aiming to combine favorable alleles for yield. F4:6 lin...
Genomic selection (GS) uses genomewide molecular markers to predict breeding values and make selections of individuals or breeding lines prior to phenotyping. Here we show that genotyping-by-sequencing (GBS) can be used for de novo genotyping of breeding panels and to develop accurate GS models, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS we discovered 41,371 single nucleotide polymorphisms (SNPs) in a set of 254 advanced breeding lines from CIMMYT's semiarid wheat breeding program. Four different methods were evaluated for imputing missing marker scores in this set of unmapped markers, including random forest regression and a newly developed multivariate-normal expectation-maximization algorithm, which gave more accurate imputation than heterozygous or mean imputation at the marker level, although no significant differences were observed in the accuracy of genomic-estimated breeding values (GEBVs) among imputation methods. Genomic-estimated breeding value prediction accuracies with GBS were 0.28 to 0.45 for grain yield, an improvement of 0.1 to 0.2 over an established marker platform for wheat. Genotyping-by-sequencing combines marker discovery and genotyping of large populations, making it an excellent marker platform for breeding applications even in the absence of a reference genome sequence or previous polymorphism discovery. In addition, the flexibility and low cost of GBS make this an ideal approach for genomics-assisted breeding.
2020
Genomic selection (GS) is a model-based approach in plant breeding that uses genomic estimated breeding values (GEBVs) of breeding lines to predict breeding outcomes effectively and efficiently. In essence, it links phenotypes to genetic markers to accelerate genetic gain in plant breeding (Wang et al., 2018). It was first established in animal breeding because individual animals cannot be replicated and phenotyping is costly (Rutkoski et al., 2017). GS gained attention in plant breeding because it allows more comprehensive selection than conventional breeding tools, which rely mostly on phenotypic selection. It uses genomic prediction models, built from genome-wide markers and phenotypic traits of the training population (TRN), to predict the GEBVs of the testing population (TSN), which has only genotypic data. The GEBVs of the lines in the TSN are then used to make selections for the next breeding cycle, as described in Fig. 23.1 (Lorenz et al., 2011). However, the adoption of GS in plant breeding lags well behind that in animal breeding; its initial implementation in plants was performed in 2007, based on simulated data in maize (Bernardo and Yu, 2007). GS is a sensitive selection process that accounts even for small-effect markers, which have the potential to interfere with significance tests. It has shown significant advantages in plant breeding over traditional marker-assisted selection (MAS), especially for complex quantitative traits that have low heritability and are regulated by many loci with small effects. Because it uses genetic markers spread across the whole genome, it captures more variation and improves selection.
Moreover, GS improves the genetic gain per unit time by shortening breeding cycles, as the breeder's equation shows: $G = i r \sigma_A / Y$, where $G$ denotes gain per year, $i$ denotes selection intensity, $r$ denotes selection accuracy, $\sigma_A$ denotes the square root of the additive genetic variance, and $Y$ denotes the time in years for one cycle of selection. According to the equation, a larger genetic gain than with traditional phenotypic selection (PS) can be achieved by reducing the duration of breeding cycles, which also lowers the cost of phenotyping in the long term. The development of next-generation sequencing (NGS) technology has reduced the cost of genotyping, which in turn has encouraged the extensive application of GS in plant breeding programs to enhance genetic gains and speed up crop selection.
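A small worked example may make the trade-off in the breeder's equation concrete. The numbers below are hypothetical and chosen only for illustration: the same selection intensity (1.755, which corresponds to keeping the top 10% under a normal distribution), a genetic standard deviation term of 1.0, a phenotypic selection scheme with accuracy 0.6 and a 4-year cycle, and a GS scheme with a lower accuracy of 0.5 but a 2-year cycle.

```python
def gain_per_year(i, r, sigma_a, years_per_cycle):
    """Breeder's equation: expected genetic gain per year, G = i * r * sigma_A / Y."""
    return i * r * sigma_a / years_per_cycle


# Hypothetical scenarios; none of these values come from the text.
ps_gain = gain_per_year(i=1.755, r=0.6, sigma_a=1.0, years_per_cycle=4)
gs_gain = gain_per_year(i=1.755, r=0.5, sigma_a=1.0, years_per_cycle=2)

print(f"PS gain per year: {ps_gain:.5f}")
print(f"GS gain per year: {gs_gain:.5f}")
print(f"GS / PS ratio:    {gs_gain / ps_gain:.3f}")
```

Under these assumed values, GS gives a higher gain per year despite its lower accuracy, because halving the cycle length more than offsets the loss in accuracy.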
Theoretical and Applied Genetics, 2011
This is the first large-scale experimental study on genome-based prediction of testcross values in an advanced cycle breeding population of maize. The study comprised testcross progenies of 1,380 doubled haploid lines of maize derived from 36 crosses and phenotyped for grain yield and grain dry matter content in seven locations. The lines were genotyped with 1,152 single nucleotide polymorphism markers. Pedigree data were available for three generations. We used best linear unbiased prediction and stratified cross-validation to evaluate the performance of prediction models differing in the modeling of relatedness between inbred lines and in the calculation of genome-based coefficients of similarity. The choice of similarity coefficient did not affect prediction accuracies. Models including genomic information yielded significantly higher prediction accuracies than the model based on pedigree information alone. Average prediction accuracies based on genomic data were high even for a complex trait like grain yield (0.72-0.74) when the cross-validation scheme allowed for a high degree of relatedness between the estimation and the test set. When predictions were performed across distantly related families, prediction accuracies decreased significantly (0.47-0.48). Prediction accuracies decreased with decreasing sample size but were still high when the population size was halved (0.67-0.69). The results from this study are encouraging with respect to genome-based prediction of the genetic value of untested lines in advanced cycle breeding populations and the implementation of genomic selection in the breeding process.
The Plant Genome, 2015
Prediction accuracy of genomic selection (GS) has been previously evaluated through simulation and cross-validation; however, validation based on progeny performance in a plant breeding program has not been investigated thoroughly. We evaluated several prediction models in a dynamic barley breeding population comprised of 647 six-row lines using four traits differing in genetic architecture and 1536 single nucleotide polymorphism (SNP) markers. The breeding lines were divided into six sets designated as one parent set and five consecutive progeny sets comprised of representative samples of breeding lines over a 5-yr period. We used these data sets to investigate the effect of model and training population composition on prediction accuracy over time. We found little difference in prediction accuracy among the models, confirming prior studies that found the simplest model, random regression best linear unbiased prediction (RR-BLUP), to be accurate across a range of situations. In general, we found that using the parent set was sufficient to predict progeny sets with little to no gain in accuracy from generating larger training populations by combining the parent set with subsequent progeny sets. The prediction accuracy ranged from 0.03 to 0.99 across the four traits and five progeny sets. We explored characteristics of the training and validation populations (marker allele frequency, population structure, and linkage disequilibrium, LD) as well as characteristics of the trait (genetic architecture and heritability, H²). Fixation of markers associated with a trait over time was most clearly associated with reduced prediction accuracy for the mycotoxin trait DON. Higher trait H² in the training population and simpler trait architecture were associated with greater prediction accuracy. Genomic selection is touted as a marker-based breeding approach that complements traditional marker-assisted selection (MAS) and phenotypic selection.
In traditional MAS, favorable alleles or genes for relatively simply inherited traits are mapped and then molecular markers linked to those alleles are used to select individuals to use as parents or to advance from segregating breeding populations (Bernardo, 2008). Marker-assisted selection is more effective than phenotypic selection if the tagged loci account for a large portion of the total genetic variation within the population of selection candidates (Collins et al., 2003; Castro et al., 2003; Xu and Crouch, 2008). The limitation of traditional MAS for highly complex traits is that it captures only a small portion of the total genetic variation because it uses a limited number of selected markers (Lande and Thompson, 1990; Bernardo, 2010). Phenotypic selection is effective on quantitative traits, but is limited to stages in breeding cycles and environments where such traits can be measured effectively, such as for advanced lines in multiple location field trials. Therefore, GS can be strategically implemented in
Theoretical and Applied Genetics, 2019
Key message: The optimization of training populations and the use of diagnostic markers as fixed effects increase the predictive ability of genomic prediction models in a cooperative wheat breeding panel. Abstract: Plant breeding programs often have access to a large amount of historical data that is highly unbalanced, particularly across years. This study examined approaches to utilize these data sets as training populations to integrate genomic selection into existing pipelines. We used cross-validation to evaluate predictive ability in an unbalanced data set of 467 winter wheat (Triticum aestivum L.) genotypes evaluated in the Gulf Atlantic Wheat Nursery from 2008 to 2016. We evaluated the impact of different training population sizes and training population selection methods (Random, Clustering, PEVmean and PEVmean1) on predictive ability. We also evaluated inclusion of markers associated with major genes as fixed effects in prediction models for heading date, plant height, and resistance to powdery mildew (caused by Blumeria graminis f. sp. tritici). Increases in predictive ability as the size of the training population increased were more evident for Random and Clustering training population selection methods than for PEVmean and PEVmean1. The selection methods based on minimization of the prediction error variance (PEV) outperformed the Random and Clustering methods across all the population sizes. Major genes added as fixed effects always improved model predictive ability, with the greatest gains coming from combinations of multiple genes. Maximum predictabilities among all prediction methods were 0.64 for grain yield, 0.56 for test weight, 0.71 for heading date, 0.73 for plant height, and 0.60 for powdery mildew resistance. Our results demonstrate the utility of combining unbalanced phenotypic records with genome-wide SNP marker data for predicting the performance of untested genotypes.
Frontiers in Ecology and Evolution
Seed traits of bread wheat, including seed size, which is considered to be associated with early vigor of the crop and end-use quality, are valuable to farmers and breeders. In this study, a collection of 789 bread wheat landraces, held in trust at the genebank of the International Center for Agricultural Research in the Dry Areas (ICARDA), was scanned for seed morphometric traits using GrainScan. Diversity analysis using the 12k DartSeq SNP markers revealed that these accessions can be grouped into five distinct clusters. To evaluate the performance for early selection from genebank accessions, we examined the accuracy of genomic selection models that account for the genomic relationships among these landraces. Based on cross-validations, prediction accuracies for seed traits ranged from 0.64 for seed perimeter to 0.74 for seed width. The variability of prediction accuracies across random validations averaged 0.14, with a range from 0.12 to 0.18, suggesting stable predictability and reproducible results even with a collection of much greater genetic diversity from genebank accessions. Adding the climatic relationship matrix between accessions based on passport information improved the predictive ability by 8%. Our results on seed traits demonstrated the capacity for estimating important agronomic phenotypes for genebank accessions directly based on genomic information, further supporting the use of genomic technologies for identifying parental germplasm as potential donors of beneficial alleles for introgression.