* Faculty of Agriculture, Hebrew University of Jerusalem, Rehovot 76100, Israel
Institute of Animal Sciences, A.R.O., The Volcani Center, Bet Dagan 50250, Israel
1 Corresponding author: weller{at}agri.huji.ac.il
| ABSTRACT |
|---|
Key Words: marker-assisted selection, quantitative trait loci, animal model, dairy cattle breeding
| INTRODUCTION |
|---|
Various solutions have been proposed to overcome these problems. Bolard and Boichard (2002) demonstrated that information from individuals with records could be incorporated into daughter and granddaughter designs used to estimate QTL effects. Kashi et al. (1990) and Mackinnon and Georges (1998) proposed a 2-stage scheme in which bull calves are first selected based on marker genotype and then the selected calves are progeny tested. Although these schemes require genotyping only a small fraction of the population, they do not properly weight QTL effects relative to polygenic effects. Bull calves with overall genetic superiority based on both QTL and polygenic value will be culled in the first stage if they do not have the desired marker genotypes. In addition, these schemes can only be applied if the total number of QTL analyzed is small and the positive allele is relatively common for all QTL. Otherwise, it will be necessary to genotype a very large number of animals to find sufficient bull calves with the favorable genotype for all QTL.
Fernando and Grossman (1989) proposed modifying the individual animal model (AM) to a "gametic" model that assumes that the 2 QTL alleles of each individual are random effects sampled from a distribution with a known variance. They developed a method to estimate breeding values for all individuals in a population, including QTL effects via linkage to genetic markers, provided that all animals are genotyped, and the heritability and recombination frequency between the QTL and the genetic marker are known. The model of Fernando and Grossman (1989) is suitable for any population structure, and can incorporate nonlinked polygenic effects and other "nuisance" effects such as herd or block. Each individual with unknown ancestors is assumed to have 2 unique alleles for the QTL, which are sampled from an infinite population of alleles. If both the parent and progeny are genotyped for a linked genetic marker, then the probability of receiving a specific parental allele for a QTL linked to the genetic marker will be a function of parents and progeny marker genotypes and of recombination. Two ongoing MAS programs in dairy cattle have been reported so far, in German and French Holsteins (Bennewitz et al., 2004; Guillaume et al., 2008). In both programs, only bulls and bull dams are genotyped, and the bulls' daughter yield deviations are analyzed by the algorithm of Fernando and Grossman (1989). Thus, these algorithms only include equations for bulls and bull dams. Meuwissen and Goddard (1999) proposed a method to include data from animals that were not genotyped, for the specific population structure of a nucleus breeding program in which all bulls are derived from the nucleus population and all individuals in the nucleus population, but none in the general population, are genotyped.
Goddard and Hayes (2007) considered theoretical aspects of genomic selection. The key features of this method are that markers covering the whole genome are used so that potentially all the genetic variance is explained by the markers and the markers are assumed to be in linkage disequilibrium with the QTL. Genome scans based on thousands of single nucleotide polymorphisms (SNP) covering the entire genome have been completed or are in progress for several dairy cattle populations (MacLeod et al., 2006; Schnabel et al., 2008). Implementation of MAS by this method also requires combining marker information with pedigree and phenotypic data. Goddard and Hayes (2007) consider 3 alternatives, the second of which is to infer marker genotypes for all animals and use these to calculate genomic EBV.
Israel and Weller (1998) proposed a model that assumes complete linkage between the QTL and a single marker and only 2 QTL alleles are segregating in the population. The QTL effect is then included in the complete AM analysis as a fixed effect. For individuals that are not genotyped, probabilities of receiving either allele are included as regression constants. Israel and Weller (2002) extended this method to a situation of a QTL bracketed by 2 genetic markers, based on the regression analysis method of Whittaker et al. (1996). Although the model was able to derive unbiased estimates of QTL effects on simulated data, the QTL effect was underestimated on real data, relative to alternative estimation methods (Weller et al., 2003). Meuwissen and Goddard (1999) also derived unbiased QTL estimates by their method on simulated data, but assumed that 20% of the individuals were genotyped.
Baruch and Weller (2008) investigated the effect described by Weller et al. (2003) by simulation, and found that bias increased as the number of generations included in the analysis increased, as the fraction of animals genotyped decreased, and as QTL allelic frequencies became more extreme. They concluded that the main reason for this bias was confounding between the QTL effect and the polygenic effect as estimated via the relationship matrix. They proposed a modified cow model that does not account for relationships among animals, and using this model were able to derive unbiased estimates for the QTL effect on simulated data even when only a small fraction of the population was genotyped. However, this model could not be used for ranking animals for selection, because it does not include the relationship matrix, and only cows with records were included. Baruch and Weller (2008) therefore proposed a 5-stage algorithm: the QTL substitution effect is estimated by the modified cow model, the phenotypic records are then adjusted by subtraction of the appropriate QTL effect for each animal, the adjusted records are then analyzed by a standard AM, the QTL effect of each animal is then added to the AM evaluation, and these evaluations are then used to rank animals for selection. They demonstrated, on simulated data, that this method was able to derive unbiased genetic evaluations and that genetic progress was increased relative to a standard AM.
The objectives of the current study were to examine the behavior of the proposed algorithm on a more complex genetic structure including 2 segregating QTL of differing magnitudes, to determine the quality of the QTL estimations under simulation conditions that more closely approximate reality, and to derive estimates of QTL effects by the proposed method on 2 QTL segregating in the Israeli Holstein population.
| MATERIALS AND METHODS |
|---|
Production records were simulated as follows:
$$Y_{ijk} = a_i + p_i + h_j + q_{1i} + q_{2i} + e_{ijk}$$
where $Y_{ijk}$ = lactation record for cow $i$ in herd-year $j$, of parity $k$; $a_i$ = random additive genetic effect of cow $i$; $p_i$ = random permanent environmental effect of cow $i$; $h_j$ = fixed effect of herd-year $j$; $q_{1i}$ = fixed effect of QTL q1 genotype for cow $i$; $q_{2i}$ = fixed effect of QTL q2 genotype for cow $i$; and $e_{ijk}$ = random residual.
All effects, other than q1 and q2 and polygenic effects for nonfounder animals, were simulated by selection from a normal distribution with a mean of zero. Variances were 0.2 for the polygenic effect among the founders, 0.125 for the permanent environmental effect, 0.25 for the herd-year effect, and 0.5 for the residual. Cows born each year were randomly assigned to 1 of 5 herd-years. We assumed that only 2 alleles were segregating in the population for each QTL, that QTL effects were codominant for both loci, and that the 2 QTL were unlinked and therefore distributed independently. Thus, the variance due to the QTL = $2p(1 - p)b^2$, where $p$ = the frequency of one of the QTL alleles, and $b$ is the QTL substitution effect. For q1, $b$ = 0.5, and for q2, $b$ = 0.32. Initial allelic frequencies were $p$ = 0.5 for both QTL. Therefore, the initial variance due to q1 = 0.125, and the variance due to q2 = 0.0512. In this case, the total initial phenotypic variance, excluding the herd-year effect, was approximately 1, and heritability was approximately 0.375.
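The variance components quoted above can be verified with a few lines of arithmetic. The following is only an illustrative sketch; the function and variable names are hypothetical and are not taken from the study.

```python
def qtl_variance(p, b):
    """Variance due to a biallelic, codominant QTL: 2p(1 - p)b^2."""
    return 2.0 * p * (1.0 - p) * b ** 2


# Parameter values stated in the text.
var_polygenic = 0.2   # polygenic variance among founders
var_perm_env = 0.125  # permanent environmental variance
var_residual = 0.5    # residual variance

var_q1 = qtl_variance(p=0.5, b=0.5)   # 0.125
var_q2 = qtl_variance(p=0.5, b=0.32)  # 0.0512

# Total additive variance = polygenic variance + both QTL variances.
var_additive = var_polygenic + var_q1 + var_q2
# Phenotypic variance excluding the herd-year effect.
var_phenotypic = var_additive + var_perm_env + var_residual
heritability = var_additive / var_phenotypic

print(f"additive variance:   {var_additive:.4f}")    # 0.3762
print(f"phenotypic variance: {var_phenotypic:.4f}")  # 1.0012
print(f"heritability:        {heritability:.3f}")    # 0.376
```

The rounded values agree with the approximate phenotypic variance of 1 and heritability of 0.375 given in the text.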
For the founders, a polygenic effect was generated by sampling from a normal distribution with a variance of 0.2. For all other animals, polygenic effects were generated as the sum of half the sire and half the dam polygenic effects plus a Mendelian sampling effect generated by sampling from a normal distribution with a variance of 0.1. Thus, the variance of the polygenic effect was $\mathbf{A} \otimes 0.2$, where $\mathbf{A}$ is the relationship matrix and $\otimes$ denotes the Kronecker product. For founders, QTL genotype was determined by sampling twice (for each allele) from a uniform distribution. If a value <0.5 was obtained, then the individual was assumed to receive a positive allele, and otherwise the individual received the negative allele. Thus, the initial positive allelic frequencies were 0.5 for both QTL. For all other individuals, QTL genotype was determined by randomly selecting 1 allele from each parent with a frequency of 0.5.
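The sampling rules described above for founders and their descendants can be sketched as follows. This is a minimal illustration under the stated variances, not the simulation program used in the study; the dictionary layout, function names, and random seed are assumptions made for the example.

```python
import random

FOUNDER_SD = 0.2 ** 0.5    # polygenic variance among founders = 0.2
MENDELIAN_SD = 0.1 ** 0.5  # Mendelian sampling variance = 0.1


def sample_allele(rng):
    """Return 1 (positive allele) if a uniform draw is < 0.5, else 0."""
    return 1 if rng.random() < 0.5 else 0


def make_founder(rng):
    """Founder: polygenic value from N(0, 0.2), 2 sampled alleles per QTL."""
    return {
        "polygenic": rng.gauss(0.0, FOUNDER_SD),
        "q1": (sample_allele(rng), sample_allele(rng)),
        "q2": (sample_allele(rng), sample_allele(rng)),
    }


def make_offspring(sire, dam, rng):
    """Nonfounder: parent average plus Mendelian sampling; 1 allele per parent."""
    polygenic = (0.5 * sire["polygenic"] + 0.5 * dam["polygenic"]
                 + rng.gauss(0.0, MENDELIAN_SD))
    return {
        "polygenic": polygenic,
        # First allele comes from the sire, second from the dam.
        "q1": (rng.choice(sire["q1"]), rng.choice(dam["q1"])),
        "q2": (rng.choice(sire["q2"]), rng.choice(dam["q2"])),
    }


rng = random.Random(2008)  # arbitrary seed, for reproducibility only
sire, dam = make_founder(rng), make_founder(rng)
calf = make_offspring(sire, dam, rng)
print(calf)
```

Because the 2 QTL are unlinked, the alleles at q1 and q2 are drawn independently in this sketch.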
Analysis Models and Selection Schemes
Scheme 1 was based on a standard AM, as follows:
![]() |
where $g_i$ is the fixed group effect, $m_l$ is the fixed parity effect, and the other terms are as described previously. Two groups were defined for founders: 1 for males and 1 for females. The algorithm used to compute elements of the inverse of the relationship matrix did not account for inbreeding, which should in any event be minimal, because only about 6 generations were included in the analyses.
Until yr 6, sires were selected at random from among the founder bulls. In yr 6, the founder cows that were not culled completed their fifth-parity lactations, and AM genetic evaluations were computed for the first time. Because of the fixed parity effect, it was not possible to run the AM unless some cows with fifth-parity records were included. Based on these evaluations, the 5 best bulls were randomly mated to the 80 best cows to produce potential bull calves. The remaining cows were randomly mated to the 20 best bulls including the 5 best that were not older than 3 yr. In both cases, bulls were selected from those bulls selected previously as mating bulls and all 1-yr-old bull calves, including the sons of the nonelite cows. Thus, this scheme differs from a traditional progeny test scheme in which only bulls that have been progeny tested on a sample of test daughters are mated to the general cow population. Bull calves not selected for breeding at the age of 1 yr were culled. Genetic evaluations of the 1-yr-old calves were based on pedigree. In simulations that utilized QTL information, this information was also used to rank bull calves, as described below. Matings between bulls and cows were randomly determined within the elite and regular cow groups. Thus, except for the 5 best bulls, all other bulls selected for breeding would produce approximately the same number of offspring each year. After yr 6, genetic evaluations were computed yearly until the end of the simulation. Bulls selected for mating were not culled, and could be used for mating as long as their evaluations remained among the top 20 and they were not older than 3 yr for the general population, or among the top 5 with unlimited age for elite cows.
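The yearly sire selection and random mating described above might be sketched as follows. The `Animal` class, the function names, and the default parameters are hypothetical. The sketch also assumes one reading of the age rule, namely that the sires for the general population are the 20 best bulls among those no older than 3 yr, whereas the 5 sires of elite cows have no age limit.

```python
import random
from dataclasses import dataclass


@dataclass
class Animal:
    ident: str
    age: int           # age in years
    evaluation: float  # genetic evaluation used for ranking


def select_sires(bulls, n_elite=5, n_general=20, max_age_general=3):
    """Return (elite_sires, general_sires), each ranked by evaluation."""
    ranked = sorted(bulls, key=lambda b: b.evaluation, reverse=True)
    elite = ranked[:n_elite]  # no age limit for sires of elite cows
    # Assumption: general sires are the best bulls no older than the limit.
    general = [b for b in ranked if b.age <= max_age_general][:n_general]
    return elite, general


def assign_matings(cows, elite_sires, general_sires, n_elite_cows=80, rng=random):
    """Randomly mate the best cows to elite sires and the rest to general sires."""
    ranked = sorted(cows, key=lambda c: c.evaluation, reverse=True)
    matings = {}
    for rank, cow in enumerate(ranked):
        sires = elite_sires if rank < n_elite_cows else general_sires
        matings[cow.ident] = rng.choice(sires).ident
    return matings
```

Random choice within each cow group means that, apart from the 5 elite sires, all selected bulls are expected to sire roughly equal numbers of offspring, in line with the description above.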
Scheme 2 was a MAS scheme. Only bulls used before yr 6 and male calves selected as potential AI sires were genotyped. For individuals that were genotyped, we assumed that all genotypes were determined without error; that is, complete correspondence between the genetic assay and QTL genotype. Probabilities of genotypes for all other animals were computed based on the algorithm of Kerr and Kinghorn (1996). The selection scheme was based on the following algorithm:
The cow model used to estimate the QTL effects was as follows:
![]() |
where $c_i$ = random effect of cow $i$; $q_{1i}$ and $q_{2i}$ are the inferred genotype probabilities for the 2 QTL, based on each cow's genotyped ancestors; and the other terms are as described previously. Probabilities $q_{1i}$ and $q_{2i}$ were scored on a scale of 0 to 1, where 0 = homozygote for the "negative" QTL allele, and 1 = homozygote for the "positive" QTL allele. However, because only sires were genotyped, no female could have an inferred genotype of either 0 or 1. This model differs from the model of Israel and Weller (1998) in that only cows with production records are included, the cow effect includes both the polygenic and the permanent environmental effect, and covariances among cow effects are assumed to be zero; that is, the relationship matrix is not included. Although the cow effect includes the polygenic and permanent environmental effects, it does not include the QTL effects. Because the magnitude of these effects is not known, the cow model was first run under the assumption that the QTL effects were zero. That is, the variance due to the cow effect = 0.5 (the total additive genetic variance of 0.375 plus the permanent environmental variance of 0.125). Once estimates of the QTL effects were obtained, the sum of the variances due to the 2 QTL was subtracted from the variance of the cow effect as follows:
$$\sigma_c^2 = 0.5 - 2\sum_{i=1}^{2} p_i(1 - p_i)b_i^2$$
where $\sigma_c^2$ = the variance of the cow effect, $p_i$ = the estimated frequency of the less frequent allele for QTL $i$, and $b_i$ = the estimated substitution effect for QTL $i$. In the first iteration of the algorithm, the cow model was run 3 times, with estimates of $\sigma_c^2$ and $b_i$ updated at each iteration. In all following iterations, the cow model was run once using the previously estimated values for the QTL effects. Similarly for analysis of the population by the AM, the additive genetic effect should include only the polygenic effect, because QTL effects were subtracted. Therefore, the variance for the additive genetic effect was computed as $0.375 - 2\sum_i [p_i(1 - p_i)b_i^2]$, with the estimated substitution effects updated using the estimates from the most recent cow model evaluation. As noted previously, 0.375 is the initial total additive genetic variance, including the 2 segregating QTL.
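The variance adjustments described above amount to subtracting the estimated QTL variance from the initial cow and additive genetic variances. The sketch below assumes that the variance of each QTL is $2p(1 - p)b^2$, as defined for the simulation; the function names are hypothetical.

```python
def qtl_variance_sum(freqs, effects):
    """Sum over QTL of 2p(1 - p)b^2, from estimated frequencies and effects."""
    return sum(2.0 * p * (1.0 - p) * b ** 2 for p, b in zip(freqs, effects))


def adjusted_variances(freqs, effects, var_cow_initial=0.5, var_additive_total=0.375):
    """Return (cow variance, polygenic variance) after removing QTL variance."""
    v_qtl = qtl_variance_sum(freqs, effects)
    return var_cow_initial - v_qtl, var_additive_total - v_qtl


# Using the simulated (true) values in place of estimates, for illustration.
var_cow, var_poly = adjusted_variances(freqs=[0.5, 0.5], effects=[0.5, 0.32])
print(f"cow-effect variance: {var_cow:.4f}")   # 0.3238
print(f"polygenic variance:  {var_poly:.4f}")  # 0.1988
```

With the true simulation parameters, the adjusted polygenic variance is close to the founder polygenic variance of 0.2, as expected. In an actual analysis, the frequencies and substitution effects would be the current estimates from the cow model.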
Potential bull calves were not included in the animal model analysis. Adjusted breeding values (ABV) for bull calves were computed as follows:
![]() |
where PBV = polygenic breeding value; that is, the sire and dam EBV without the QTL effects.
Each simulation was continued to yr 30, and at this point, each pedigree included more than 3,000 bulls with progeny and about 37,000 cows with production records. Ten simulations were generated for each of the 2 schemes. The schemes were compared based on the mean genetic values of the populations at each year, and the mean breeding values. Mean population QTL values at each year were computed as $b_i(2p_{ij} - 1)$, where $p_{ij}$ = the estimated frequency of the positive allele for QTL $i$ at year $j$. Mean estimates of the QTL effects by the MAS scheme were also computed.
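As a small worked example of the mean population QTL value, the sketch below evaluates $b_i(2p_{ij} - 1)$ for q1 at several positive-allele frequencies. The frequencies are arbitrary illustration values, not results from the study, and the function name is hypothetical.

```python
def mean_qtl_value(b, p):
    """Mean population QTL value b(2p - 1) for positive-allele frequency p."""
    return b * (2.0 * p - 1.0)


b_q1 = 0.5  # substitution effect of q1 in the simulation
for p in (0.5, 0.6, 0.75, 1.0):
    print(f"p = {p:.2f}: mean QTL value = {mean_qtl_value(b_q1, p):.2f}")
# p = 0.50 gives 0.00, and p = 1.00 (fixation of the positive allele) gives 0.50
```

The value is zero at the initial frequency of 0.5 and reaches $b_i$ when the positive allele is fixed.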
| RESULTS AND DISCUSSION |
|---|
The proposed method is easy to apply, and all required software is already available. It is only necessary to genotype males, as is currently the case for SNP-based genome scans. The method is flexible with respect to the model used for general genetic evaluation (single-trait AM, multitrait AM, and so on). Unlike the model of Fernando and Grossman (1989), any number of genetic markers can be easily incorporated into the algorithm, although there might be a problem of over-parameterization if hundreds of markers are included, especially if markers are linked. Similarly, unlike the proposed algorithms of Mackinnon and Georges (1998), this algorithm does not require selecting young bulls with the positive genotype for all markers, and QTL and polygenic effects are weighted in accordance with selection index theory. In addition, estimates of QTL effects are updated each year, instead of relying on effects computed once from a single genome scan. Thus, the penalty for incorrect or inaccurate determination of segregating QTL will be minimal. Extension of the methods proposed in the current study could be applied to rank sires accurately including both marker and pedigree information for the large number of segregating QTL that will be detected by whole-genome scans (Goddard and Hayes, 2007), although, as mentioned previously, possible effects of over-parameterization would have to be considered. In the current study, only single-trait selection was simulated. Further studies will consider application of the proposed method to selection for a multitrait selection index, and selection based on linked markers.
| CONCLUSIONS |
|---|
| ACKNOWLEDGEMENTS |
|---|
Received for publication February 27, 2008. Accepted for publication July 2, 2008.
| REFERENCES |
|---|