J. Dairy Sci. 89:1792-1803
© American Dairy Science Association, 2006.

Estimation of Genetic Correlations Among Countries in International Dairy Sire Evaluations with Structural Models

H. Leclerc*,†,1, S. Minéry‡, I. Delaunay†,§, T. Druet§, W. F. Fikse* and V. Ducrocq§

* Interbull Centre, Department of Animal Breeding & Genetics, SLU, Box 7023, Uppsala 75007, Sweden
† Union Nationale des Coopératives agricoles d’Elevage et d’Insémination Animale, Paris 75595, France
‡ Institut de l’Élevage, Département de Génétique, INRA-SGQA, Jouy-en-Josas 78352, France
§ Station de Génétique Quantitative et Appliquée, INRA, Jouy-en-Josas 78352, France

1 Corresponding author: helene.leclerc@jouy.inra.fr


    ABSTRACT
The increase in the number of participating countries and the lack of genetic ties between some countries have led to statistical and computational difficulties in estimating the genetic (co)variance matrix needed for international sire evaluation of milk yield and other traits. Structural models have been proposed to reduce the number of parameters to estimate by exploiting patterns in the genetic correlation matrix. Genetic correlations between countries are described as a simple function of unspecified country characteristics that can be mapped in a space of limited dimensions. Two link functions, equal to the exponential of minus the Euclidian distance between the coordinates of two countries and the exponential of minus the square of this Euclidian distance, were used for the study on international simulated and field data. On simulated data, it was shown that structural models might allow an easier estimation of genetic correlations close to the border of the parameter space. This is not always possible with an unstructured model. On milk yield data, genetic correlations obtained from 22 countries for structural models based on 2 and 7 dimensions, respectively, were analyzed. Only a structural model with a large number of axes gave reasonable estimates of genetic correlations compared with correlations obtained for an unstructured model: 76.7% of correlations deviated by less than 0.030. Such a model reduces the number of parameters from 231 genetic correlations to 126 coordinates. On foot angle data, large deviations were observed between genetic correlations estimated with an unstructured model and correlations estimated with a structural model, regardless of the number of axes taken into account.

Key Words: genetic correlation matrix • structural model • international evaluation


    INTRODUCTION
The increasing international trade of genetic material from dairy cattle prompted the development of across-country bull comparisons, which led to the establishment of the International Bull Evaluation Service (Interbull, Uppsala, Sweden). Interbull provides international predicted breeding values for each bull on the scale of each participating country using a multiple-trait across country evaluation (MACE) approach (Schaeffer, 1994) with national genetic evaluation results as input data. With this approach, performance in each country is considered a different trait, allowing for different genetic parameters in different countries and genetic correlations among countries less than unity. The estimation of genetic correlations has become a challenge with the increase in the number of participating countries and the lack of genetic links between some of them. For instance, in May 2002, among the 351 genetic correlations across 27 populations (hereafter referred to as countries) for milk yield in the Holstein breed, 124 could not be directly estimated (Interbull, 2005a) and were assigned according to similarities between production systems. These correlations ranged from 0.86 to 0.89 between two northern hemisphere countries, and from 0.75 to 0.78 between a northern and a southern hemisphere country (Interbull, 2005b).

In international dairy sire evaluation, traits are currently defined according to country borders, even though the underlying trait (e.g., milk yield) is quite similar for all countries. Thus, the expression of this trait in different countries tends to be highly correlated. To avoid computational difficulties, structural models are often proposed as an alternative to the classical approach (Rekaya et al., 2001; Delaunay et al., 2002). Here, the basic idea behind structural models is to describe the full genetic covariance matrix as a function of fewer parameters. Rekaya et al. (2001) suggested the use of external information to characterize production systems in different regions or countries to describe genetic covariances. However, the use of external information on climate conditions, management practices, and genetic composition of the cow population to measure similarities across regions or countries is ambiguous due to lack of uniformity in recording such information across countries. In the structural model proposed by Delaunay et al. (2002) as a part of the Production Traits European Joint Evaluation project (Canavesi et al., 2002), genetic correlations between countries are described as a simple function of unspecified country characteristics that can be mapped in a space of limited dimensions. The link function used by Delaunay et al. (2002) to define the correlation between two countries was the exponential of minus the Euclidian distance between the coordinates of two countries. However, there was some concern about the fact that the use of distances imposed important constraints, because not all correlation matrices can be described with such an approach (Delaunay et al., 2002; Minéry et al., 2003; M. E. Goddard, Univ. Melbourne, Australia, personal communication).

The objective of this study was to present and test the structural model of Delaunay et al. (2002) and a variant of it in the context of international dairy sire evaluations on simulated data and field data for different levels of correlations between countries.


    MATERIALS AND METHODS
Data
Simulated Data.
Two hypothetical international dairy cattle populations were created. The first was simulated assuming equal genetic correlations among 4 countries, whereas the second one assumed varying genetic correlations among 8 countries (representing a more realistic situation). Designs of populations were made to obtain reasonable computation time and approximately the same number of cows in both populations: 51,200 and 48,640 in the first and second populations, respectively. The number of nonoverlapping generations was reduced from 5 in the first population to 4 in the second one. The characteristics of these populations are described in Table 1. The genetic correlations used will be presented later. Populations were of equal size per country and per generation. Parents of the next generation were selected across countries based on the results of a MACE at each generation, for which the true genetic parameters were used. To ensure a strong connectedness among countries, each young or proven bull was used for mating in each country. At each generation, 40 and 24 young bulls were progeny-tested based on 24 and 30 female offspring in each country, respectively (i.e., 96 and 240 daughters per young bull). Of these young bulls, the 16 and 8 highest ranking bulls were used for all countries as proven bulls in the next generation and the 8 and 3 highest ranking bulls among the proven bulls were selected as bull sires, respectively. Proven bulls that passed the progeny test had 100 additional female offspring in each country in the next generation. Within each generation, 40 and 24 elite dams were selected across countries as bull dams and each one was mated to one bull sire to produce a progeny-tested bull for the next generation. Each cow was mated to either a proven or a young bull to produce one female offspring for the next generation. Mating of selected animals was at random. The cow performance was simulated according to the model:


Table 1. Number of countries, generations, animals per generation, and total animals in 2 simulated populations
 

$$y_{ijk} = b_k + a_{ij} + e_{ijk} \qquad [1]$$

where $y_{ijk}$ is the simulated performance of cow i in herd k of country j; $b_k$ is the herd effect (25 herds per country and per generation) with $b_k \sim N(0, 1)$; $a_{ij}$ is the true additive genetic value of cow i in country j, which includes the cow’s Mendelian sampling term, with $a \sim N(0, G \otimes A)$, where G is the across-country genetic (co)variance matrix and A is the additive genetic relationship matrix between animals; and $e_{ijk}$ is the residual with $e \sim (0, I\sigma^2_{e_j})$, where $\sigma^2_{e_j}$ is the residual variance for country j. For all countries, genetic and residual variances were fixed such that $\sigma^2_a = 0.25$ and $\sigma^2_e = 0.75$, resulting in a heritability of 0.25.
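As a rough illustration of how records of this form could be generated, the sketch below draws herd, genetic, and residual effects for unrelated base cows in a single country. It is a deliberately simplified assumption: it ignores the relationship matrix A, the across-country covariances in G, and selection, and the names `heritability` and `simulate_base_records` are hypothetical, not taken from the study.

```python
import random

SIGMA2_A = 0.25  # additive genetic variance (value given in the text)
SIGMA2_E = 0.75  # residual variance (value given in the text)
SIGMA2_B = 1.0   # herd variance, since b_k ~ N(0, 1) in the text


def heritability(sigma2_a, sigma2_e):
    """Heritability as the ratio of genetic to (genetic + residual) variance."""
    return sigma2_a / (sigma2_a + sigma2_e)


def simulate_base_records(n_cows, n_herds=25, seed=1):
    """Simulate y = herd effect + genetic value + residual for unrelated cows.

    Simplification (assumption): cows are unrelated and belong to one country,
    so no relationship matrix or across-country covariance is used.
    """
    rng = random.Random(seed)
    herd_effects = [rng.gauss(0.0, SIGMA2_B ** 0.5) for _ in range(n_herds)]
    records = []
    for cow in range(n_cows):
        herd = cow % n_herds  # spread cows evenly over herds
        a = rng.gauss(0.0, SIGMA2_A ** 0.5)
        e = rng.gauss(0.0, SIGMA2_E ** 0.5)
        records.append((cow, herd, herd_effects[herd] + a + e))
    return records


records = simulate_base_records(100)
h2 = heritability(SIGMA2_A, SIGMA2_E)
```

With the variances quoted in the text, `h2` equals 0.25; the herd variance is not part of that ratio.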

Field Data.
Field data were used to assess the generality of the results obtained on simulated data, for which the population structure was not realistic. Data available were the deregressed national breeding values of the bulls used for the Holstein international genetic evaluation. Milk yield analyses were based on data of August 2003, and type trait analyses (foot angle) were based on data of November 2003. The deregression procedure was done by Interbull from the national breeding values sent by the participating countries. It removed the double counting of effects that subsequently were included in the prediction of international breeding values (Jairath et al., 1998). Each deregressed breeding value was weighted in the analyses by its effective daughter contribution (EDC), which considers contemporary group size, correlations between repeated records, and the reliability of the daughter’s dam evaluation (Fikse and Banos, 2001). These EDC were sent by each country to Interbull.

Data were edited to include only national evaluations for bulls born after 1984. All the observations were included in the estimation of genetic correlations, in contrast with the current Interbull practice based on subsets of well-connected bulls.

Data from 22 Interbull member countries (Australia, Belgium, Canada, Czech Republic, Denmark, Estonia, Finland, France, Germany, Hungary, Ireland, Israel, Italy, New Zealand, Poland, Spain, South Africa, Switzerland, Switzerland Red, The Netherlands, United Kingdom, and United States) were used for the milk production study. These data sets are characterized in Table 2. The selected countries represented a wide range of production systems and environments. Links between countries were variable, with a number of bulls with daughters in 2 countries (hereafter referred to as common bulls) ranging from 0 (e.g., Estonia–Finland) to 772 (Canada–United States). For foot angle, data from 8 countries were selected (Australia, Canada, France, Germany, Italy, The Netherlands, United Kingdom, and United States). These data sets are characterized in Table 3.


Table 2. Number of bulls per country1 for milk production traits with a national evaluation (diagonal) and common bulls that had progeny in each country pair (above the diagonal)
 

Table 3. Number of bulls per country1 for foot angle trait with a national evaluation (diagonal) and common bulls that had progeny in each country pair (above the diagonal)
 
International Evaluation
Model for Simulated Data.
Variance components were estimated applying the same model as for simulation [1], and cow performance records were used as input data.

Model for Field Observations.
For the variance components estimation, the sire model used in the international genetic evaluations was applied (Schaeffer, 1994). In this MACE, traits in different countries are considered different traits that are genetically correlated. The linear model used was


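$$\mathbf{y}_i = \mu_i \mathbf{1} + \mathbf{Z}_i \mathbf{Q} \mathbf{g}_i + \mathbf{Z}_i \mathbf{s}_i + \mathbf{e}_i \qquad [2]$$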

where yi is the vector of deregressed breeding values of bulls for country i, µi is the mean for country i, gi is the vector of genetic groups effects treated as fixed for the estimation of genetic correlations between countries, si is the vector of random sire transmitting abilities for country i, ei is the vector of random residuals, Zi is the sire incidence matrix, and Q is the matrix assigning sires to genetic groups.

For t countries, the variance-covariance matrices of the random effects are


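$$\mathrm{Var}\begin{bmatrix} \mathbf{e}_1 \\ \vdots \\ \mathbf{e}_t \end{bmatrix} = \begin{bmatrix} \mathbf{D}_1 \sigma^2_{e_1} & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{D}_t \sigma^2_{e_t} \end{bmatrix}$$

$$\mathrm{Var}\begin{bmatrix} \mathbf{s}_1 \\ \vdots \\ \mathbf{s}_t \end{bmatrix} = \begin{bmatrix} \mathbf{A} \sigma^2_{s_1} & \cdots & \mathbf{A} \sigma_{s_{1t}} \\ \vdots & \ddots & \vdots \\ \mathbf{A} \sigma_{s_{1t}} & \cdots & \mathbf{A} \sigma^2_{s_t} \end{bmatrix}$$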

where $\mathbf{D}_i$ is a diagonal matrix with EDC weighting factors (Fikse and Banos, 2001) as elements, $\sigma^2_{e_i}$ is the residual variance for country i, $\mathbf{A}$ is the additive relationship matrix based on sire, maternal grandsire, and maternal grand-dam of the bull, with maternal grand-dam treated as missing and assigned to phantom parent groups, $\sigma^2_{s_i}$ is the sire variance for country i, and $\sigma_{s_{ij}}$ is the sire covariance between countries i and j.

Genetic groups for missing ancestors were formed, based on selection path, year of birth, and origin (defined following country borders). Small groups were merged together first by year of birth and then by origin to achieve a minimum group size of 500 bulls with unknown parents for the estimation of genetic correlations across countries.

Models for the Genetic Covariances.
Three different models were used for the genetic variance-covariance matrix: a classical model (CM), the structural model of Delaunay et al. (2002) referred to as SM(dXY), and a structural model derived from that of Delaunay et al., referred to as SM(dXY2).

In the classical model, the variance-covariance matrix was assumed unstructured. In the structural model of Delaunay et al. (2002), the covariances between countries were defined as a function of a set of unobserved variables (characteristics) for each country that condition the genetic correlations between countries. The country characteristics are represented in a space of k dimensions (k < number of countries), in which the coordinates of the countries are the unobserved characteristics. In this space, the genetic correlation between 2 countries, X and Y, (rGXY) was defined as


$$r_{G_{XY}} = \exp(-d_{XY}) \qquad [3]$$

where $d_{XY}$ is the Euclidian distance between countries X and Y, computed as $d_{XY} = \sqrt{\sum_{i=1}^{k} (P_{Xi} - P_{Yi})^2}$, with $P_{Xi}$ and $P_{Yi}$ being the coordinates of countries X and Y, respectively, for axis i. According to this definition, the covariance between 2 countries X and Y is $\sigma_{XY} = \sigma_X \cdot \sigma_Y \cdot \exp(-d_{XY})$, with $\sigma_X$ and $\sigma_Y$ being the genetic standard deviations in countries X and Y, respectively.
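The mapping from country coordinates to genetic correlations and covariances can be sketched in a few lines. The coordinates and standard deviations below are made-up illustrative values, and the function names are hypothetical; only the link function exp(–dXY) and the covariance expression come from the text.

```python
import math


def euclidean_distance(p_x, p_y):
    """Euclidean distance between two coordinate vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_x, p_y)))


def genetic_correlation(p_x, p_y):
    """Correlation implied by link function [3]: exp(-distance)."""
    return math.exp(-euclidean_distance(p_x, p_y))


def genetic_covariance(p_x, p_y, sd_x, sd_y):
    """Covariance: product of genetic standard deviations and the correlation."""
    return sd_x * sd_y * genetic_correlation(p_x, p_y)


# Hypothetical coordinates in a 2-dimensional space (illustrative values only).
coords = {
    "X": (0.00, 0.00),
    "Y": (0.10, 0.00),
    "Z": (0.05, 0.12),
}

r_xy = genetic_correlation(coords["X"], coords["Y"])  # exp(-0.10)
r_xz = genetic_correlation(coords["X"], coords["Z"])  # exp(-0.13)
cov_xy = genetic_covariance(coords["X"], coords["Y"], sd_x=0.5, sd_y=0.5)
```

Because a distance is never negative, every correlation produced this way lies in (0, 1], and a country always has correlation 1 with itself.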

To illustrate, consider 4 countries (A, B, C, and D). Their characteristics can be used to conceptually define the axes of a 3-dimensional space (Figure 1). They are referred to as axis countries. Country A defines the center of the space. Adding country B determines the first axis. The inclusion of countries C and D positions the second and third axes, respectively. For example, the correlation between countries B and C ($r_{G_{BC}}$) computed from their coordinates is $r_{G_{BC}} = \exp(-d_{BC})$, with $d_{BC} = \sqrt{\sum_{i=1}^{3} (P_{Bi} - P_{Ci})^2}$. In this space, a fifth country, E, could be added without contributing to the definition of the space. Country E is referred to as an added country. For country E, only 3 coordinates need to be estimated to determine the 4 genetic correlations with the axis countries. Similarly, 3 additional coordinates need to be estimated for an added country F to determine the 5 additional genetic correlations (4 with the axis countries and one with country E). With this structural model, even if countries E and F do not have any direct link between them, the genetic correlation between them can be computed from the Euclidian distance $d_{EF}$ obtained from their coordinates in the 3-dimensional space by $r_{G_{EF}} = \exp(-d_{EF})$.


Figure 1. Geometrical representation of a structural model for 6 countries in a 3-dimensional space. The genetic correlation between 2 countries (e.g., E and F) is assumed to be $r_{G_{EF}} = \exp(-d_{EF})$ or $r_{G_{EF}} = \exp(-d_{EF}^2)$.

 
In a second structural model derived from that of Delaunay et al. (2002), the genetic correlation between countries X and Y was defined as


$$r_{G_{XY}} = \exp(-d_{XY}^2) \qquad [4]$$

The use of the square of the Euclidian distance (dXY2) allowed more flexibility in the model. It solved many of the cases where "triangular inconsistencies" occurred (Delaunay et al., 2002; Minéry et al., 2003; M. E. Goddard, Univ. Melbourne, Australia, personal communication). As an example, take 3 countries, A, B, and C. If rGAB = 0.90, rGAC = 0.70, and rGBC = 0.80, it is not possible to find country coordinates such that rGXY = exp(–dXY). Indeed, on the "distance scale", it is impossible to have, at the same time, dAB = 0.105, dAC = 0.357, dBC = 0.223, because dAB + dBC < dAC. With rGXY = exp(–dXY2), dAB = 0.325, dAC = 0.597, dBC = 0.472, and the reparameterization is possible (dAB + dBC > dAC). Another obvious restriction is that correlations must be between 0 and 1 in both cases.
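The three-country example can be checked numerically. The sketch below inverts each link function to get the implied distances and tests the triangle inequality; the function names are hypothetical, and the correlations 0.90, 0.70, and 0.80 are those quoted in the text.

```python
import math


def implied_distances(r_ab, r_ac, r_bc, squared=False):
    """Distances implied by correlations under exp(-d) or exp(-d**2)."""
    def dist(r):
        d = -math.log(r)  # inverse of exp(-d)
        return math.sqrt(d) if squared else d  # inverse of exp(-d**2)
    return dist(r_ab), dist(r_ac), dist(r_bc)


def satisfies_triangle(d_ab, d_ac, d_bc):
    """True if the three distances can form a (possibly flat) triangle."""
    return (
        d_ab + d_bc >= d_ac
        and d_ab + d_ac >= d_bc
        and d_ac + d_bc >= d_ab
    )


linear = implied_distances(0.90, 0.70, 0.80)
quadratic = implied_distances(0.90, 0.70, 0.80, squared=True)

linear_ok = satisfies_triangle(*linear)        # False: 0.105 + 0.223 < 0.357
quadratic_ok = satisfies_triangle(*quadratic)  # True: 0.325 + 0.472 > 0.597
```

Passing the triangle inequality for every triplet is necessary for coordinates to exist, so a failed check rules the correlation matrix out for that link function.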

The use of these structural models makes it possible to reduce the number of parameters to estimate for the genetic covariance matrix from $m(m-1)/2$ genetic correlations to $k(2m-k-1)/2$ coordinates, where m is the number of countries and k the number of axes.
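The parameter counts can be checked against the figures quoted in the abstract (231 genetic correlations versus 126 coordinates for 22 countries and 7 axes). The sketch assumes, following the geometric description above, that the country at the center has no free coordinate, that the i-th axis country has i free coordinates, and that each added country has k; the function names are hypothetical.

```python
def n_correlations(m):
    """Number of distinct correlations among m countries (unstructured model)."""
    return m * (m - 1) // 2


def n_coordinates(m, k):
    """Number of free coordinates for m countries in a k-dimensional space.

    Assumes k + 1 axis countries (origin plus one per axis) and m - k - 1
    added countries, so k must be smaller than m.
    """
    if not 0 < k < m:
        raise ValueError("k must satisfy 0 < k < m")
    axis_countries = k * (k + 1) // 2  # 0 + 1 + 2 + ... + k
    added_countries = k * (m - k - 1)  # k coordinates per added country
    return axis_countries + added_countries


unstructured = n_correlations(22)  # 231
structural = n_coordinates(22, 7)  # 28 + 98 = 126
```

When k = m – 1 the two counts coincide (e.g., 6 for 4 countries and 3 axes), which is consistent with the structural model then carrying as many parameters as the unstructured one.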

In the rest of this paper, CMm will represent a classical model used to estimate the genetic correlations among m countries. The terms SM(dXY)km and SM(dXY2)km will represent structural models for which the correlations among m countries are estimated based on the country coordinates in a space of dimension k, with the genetic correlations defined as in [3] and [4] respectively.

Algorithm.
An average information-REML (AI-REML) algorithm was used for parameter estimation (Johnson and Thompson, 1995). The main advantages of AI-REML are that it converges faster than the expectation maximization-REML algorithm used by Interbull and it provides asymptotic standard errors of the estimates, obtained as the inverse of the AI matrix. For the first population of simulated data, the ASREML software (Gilmour et al., 2002) was used for the estimation of parameters of both the classical model and the structural model SM(dXY). For the second population of simulated data and field data analyses, the AI-REML algorithm implemented by Druet et al. (2003a, b), which allows the user to define parametric structures for the random effects, was used for the estimation of the parameters. For the structural models, the genetic variance-covariance matrix was a nonlinear function of parameters (i.e., the coordinates), so the AI-REML algorithm used a simplified AI matrix, ignoring nonzero terms of the second derivative of the genetic (co)variance matrix (Gilmour et al., 1995).

In the ASREML software and in the AI-REML software used to analyze the field data, the update of the parameters was based on a line search procedure in which the step size was repeatedly divided by 2 until the likelihood increased (Dennis and Schnabel, 1983). For the classical model with field data, the update of the genetic covariance matrix was a combined AI-EM update if the AI-REML update alone led to a nonpositive definite genetic covariance matrix (Jensen et al., 1996).
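The step-halving idea can be sketched generically as follows. This is not the ASREML or Druet et al. implementation: the function name, the `max_halvings` safeguard, and the toy objective are assumptions for illustration, and the update direction (which would come from the AI matrix and the first derivatives) is simply passed in.

```python
def step_halving_update(theta, direction, loglik, max_halvings=20):
    """Move along `direction`, halving the step until the likelihood increases.

    Returns the accepted parameter vector and the step size used; if no step
    improves the likelihood, the parameters are returned unchanged with step 0.
    """
    current = loglik(theta)
    step = 1.0
    for _ in range(max_halvings):
        candidate = [t + step * d for t, d in zip(theta, direction)]
        if loglik(candidate) > current:
            return candidate, step
        step /= 2.0  # divide the step size by 2 and try again
    return theta, 0.0


def toy_loglik(theta):
    """Toy concave objective with its maximum at theta = [1.0]."""
    return -((theta[0] - 1.0) ** 2)


# A direction that overshoots at full step: 0 -> 4 lowers the toy likelihood.
new_theta, used_step = step_halving_update([0.0], [4.0], toy_loglik)
```

In this toy case the full step and the half step are both rejected, and the quarter step, which lands on the maximum, is accepted.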

Some parameters could be forced to remain constant during the iteration process by setting to zero the first derivatives of the likelihood with respect to these parameters. By fixing coordinates for the axis countries, the coordinates for other countries estimated in different runs were relative to the exact same space.

Model Comparison.
For the simulated data, the structural and the classical models were compared with the true genetic correlations and on the basis of minus twice the logarithm of the likelihood (–2logL). For the field data, the structural and the classical models were compared on the basis of the estimated genetic correlations, of minus twice the logarithm of the likelihood (–2logL), and of 2 information criteria that take the number of parameters to estimate into account: the Akaike’s information criterion (AIC; Akaike, 1974) and the Schwarz’s Bayesian information criterion (BIC; Schwarz, 1978):


$$\mathrm{AIC} = -2\log L + 2q$$

$$\mathrm{BIC} = -2\log L + q \log(n - p)$$

where q is the number of parameters, n is the number of observations, and p is the rank of the fixed effects matrix, computed as p = (number of genetic groups + 1) × number of countries. Although not necessarily the most accurate ones, the results obtained with the classical model were used as reference.
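A small sketch of the two criteria follows, assuming the forms commonly used with REML likelihoods: AIC = –2logL + 2q and BIC = –2logL + q·log(n – p). The function names and all numeric inputs are made up for illustration and are not results from the study.

```python
import math


def fixed_effects_rank(n_genetic_groups, n_countries):
    """Rank p as defined in the text: (genetic groups + 1) x countries."""
    return (n_genetic_groups + 1) * n_countries


def aic(minus_two_logl, q):
    """Akaike's information criterion (assumed form: -2logL + 2q)."""
    return minus_two_logl + 2 * q


def bic(minus_two_logl, q, n, p):
    """Schwarz's criterion (assumed REML form: -2logL + q * log(n - p))."""
    return minus_two_logl + q * math.log(n - p)


# Hypothetical inputs: smaller values of either criterion are preferred.
p = fixed_effects_rank(n_genetic_groups=9, n_countries=4)  # 40
aic_value = aic(1000.0, q=6)
bic_value = bic(1000.0, q=6, n=5040, p=p)
```

Because log(n – p) exceeds 2 whenever n – p is larger than about 7.4, BIC penalizes each extra parameter more heavily than AIC for data sets of realistic size.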

Analyses
Simulated Data.
For the first population, genetic correlations between countries estimated for the structural model SM(dXY) taking between 1 and 3 axes into account and a classical model were compared with the true values of genetic correlations. A first data set was simulated assuming a genetic correlation between all countries of 0.90, whereas in a second set, a genetic correlation between countries of 0.99 was simulated. This extreme correlation value makes it possible to check the ability of the structural models to deal with the estimation of genetic parameters very close to the border of the parameter space. For the second population, the data were simulated in such a way that genetic correlations between countries ranged from 0.66 to 0.97 (Table 4). Genetic correlations between countries estimated for the structural models SM(dXY) and SM(dXY2) taking between 1 and 7 axes into account and a classical model were compared with the true values of genetic correlations. No replications of either scenario were made.


Table 4. Genetic correlations among 8 countries (Co 1 to Co 8) considered for the second simulated population
 
Field Data: Determination of Axis Countries.
Two traits with various degrees of resemblance between countries were studied: milk yield, as an example of traits that are very similar across countries (i.e., with an average genetic correlation estimated at 0.88 in the international evaluation of August 2003; Interbull, 2005a), and foot angle, to represent a more variable trait; that is, with an average genetic correlation between 2 countries estimated at 0.65 (range 0.06 to 0.91) in the international evaluation of November 2003 (Interbull, 2005c). First, the different models were compared based on a subset of well-connected countries (i.e., 8 or 9 countries having a large number of common bulls). This preliminary step made it possible to determine how many axes were needed to obtain accurate estimates of genetic correlations while reducing the number of parameters. Only countries with strong genetic links, measured by the large number of common bulls (Tables 2 and 3) between each other and with the remaining countries, were chosen as axis countries. Such links facilitate the estimation of genetic correlations between countries with weak ties, utilizing indirect links provided by the axis countries. Moreover, the axis countries should be selected to represent a wide range of production environments. In practice, at least one well-connected country of each hemisphere was included as an axis country (United States for the northern hemisphere and Australia or New Zealand for the southern hemisphere). For milk yield, the first 2 axes were always defined by Germany–United States–New Zealand based on the results of Minéry et al. (2003). For milk yield, the axis countries for dimension 3 and higher were chosen to maximize the volume V of the space defined by their coordinates, where V is computed from the vectors joining O to K, L, and M (O being the center of the space and K, L, and M defining the first, second, and third axis, respectively), as suggested by Minéry (2003). The increase of the number of axes was expected to give more flexibility to the model.

Field Data: Complete Data.
For milk yield, 2 specific structural models with a fixed number of axes were selected based on the results from the 9 well-connected countries: one for SM(dXY) and one for SM(dXY2). Genetic correlations estimated with the structural models were compared with estimates obtained for a classical model when all 22 countries described in Table 2 were included. Due to memory constraints, it was not possible to estimate correlations between more than 10 countries per run. Therefore, subsets including between 4 and 10 countries per run were created to estimate all correlations with CM. At least 1 or 2 countries providing many links, such as France, Germany, The Netherlands, and United States, were used in each subset. Only 211 of 231 genetic correlations could be computed with CM from the different subsets. For the 211 estimated country pairs, the number of estimates per country pair ranged from 1 (e.g., Czech Republic–Finland, Israel–Spain) to 28 (New Zealand–United States) with, on average, 3.4 estimates per country pair. For country pairs with several estimates, we used the average genetic correlation. Most of the missing genetic correlations (20 country pairs out of 231), which were not considered in this analysis, involved countries with weak links; that is, Estonia, Finland, Israel, South Africa, and Switzerland Red.
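The pooling of classical-model estimates over overlapping subsets can be sketched as a simple average per country pair. The country codes and correlation values below are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_pair_estimates(runs):
    """Average the correlation estimates of each country pair across runs.

    `runs` is a list of dicts mapping (country_x, country_y) to an estimate;
    pairs are treated as unordered.
    """
    pooled = defaultdict(list)
    for run in runs:
        for (x, y), estimate in run.items():
            pooled[tuple(sorted((x, y)))].append(estimate)
    return {pair: mean(values) for pair, values in pooled.items()}


# Hypothetical estimates from two runs on overlapping subsets of countries.
runs = [
    {("USA", "NZL"): 0.76, ("USA", "FRA"): 0.90},
    {("NZL", "USA"): 0.78, ("FRA", "NZL"): 0.74},
]
averaged = average_pair_estimates(runs)
n_estimates = {pair: sum(tuple(sorted(k)) == pair for run in runs for k in run)
               for pair in averaged}
```

Pairs that never appear together in any run are simply absent from the result, which mirrors the 20 country pairs left without a classical-model estimate.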

To estimate the coordinates for the structural models, different subsets of countries were considered in addition to the axis countries. Here, the coordinates for axis countries were fixed. Therefore, the coordinates of other countries were estimated in the exact same space and were not influenced by small variations of coordinates for axis countries that were observed when the space was not fixed (Minéry, 2003). With the coordinates of all countries in the space defined by the axis countries, it was possible to compute the distance between all pairs of countries, and thus their genetic correlations.


    RESULTS
Simulated Data
As expected for the first population (Table 5), when all simulated genetic correlations between countries were 0.90, CM4 and SM(dXY)34, which presented the same number of parameters, gave equal likelihood and average absolute deviations from the true values. When the true genetic correlations were 0.90, SM(dXY)14 gave correlations outside the parameter space. This structural model has only one dimension, and it was not possible to obtain a spatial representation of such genetic correlations on a straight line, because all countries should be equidistant from each other. When genetic correlations of 0.99 were simulated, the unstructured model (CM4) failed to converge. The AI-REML algorithm converged to a nonpositive definite genetic covariance matrix and the results were discarded. All structural models tested gave consistent results with an average absolute deviation from the true value equal to or lower than 0.005.


Table 5. Number of parameters to estimate the genetic covariance matrix, the log likelihood, and average absolute deviations of genetic correlations (rG) estimated based on simulated data for a classical model (CM4) and 3 structural models SM(dXY)k4 for 4 countries (k = 1 to 3 axes) from the true genetic correlations
 
For the second population (Table 6), the iteration process of the unstructured model took 5 times longer than it took for SM(dXY2)48 to obtain similar correlations, and the likelihood was lower than the one obtained for some structural models. Average absolute deviations obtained for the structural models from the true correlations were around 0.020, as for the unstructured model. Surprisingly, SM(dXY2)58 presented the highest likelihood obtained for this population, indicating that CM may not have reached convergence and that SM(dXY2)68 and SM(dXY2)78 may have converged to a local maximum.


Table 6. Number of parameters to estimate the genetic covariance matrix, minus twice the log likelihood (–2logL), Akaike’s information criterion (AIC), and average absolute deviations and distribution of deviations of genetic correlations (rG) estimated based on simulated data for a classical model (CM8) and structural models SM(dXY)k8 and SM(dXY2)k8 for 8 countries (k = 1 to 7 axes) from the true genetic correlations
 
Milk Yield Field Data
Determination of Axis Countries.
The 9 well-connected countries selected were Australia, Canada, France, Germany, Italy, New Zealand, The Netherlands, United Kingdom, and United States. Strong links allowed an accurate and precise estimation of the genetic correlations using a CM. The lowest number of common bulls was 152 between France and New Zealand (Table 2). These countries were expected to define a stable space into which the remaining countries could be mapped.

For SM(dXY), only the models including at most 4 axes gave consistent results; that is, where the increase of the space dimension led to an increase of the likelihood (Table 7). Model SM(dXY)29 appeared to be the most interesting compromise between the accuracy of estimated genetic correlations and the reduction of the number of parameters compared with the CM9 results (Table 8). The lower likelihood observed for SM(dXY)29 was compensated by the reduction of parameters and led to the lowest BIC. Interestingly, the correlations estimated with this model did not deviate substantially more from the CM9 estimates than SM(dXY) with 3 or 4 dimensions (Table 8). All deviations of correlations larger than 0.030 were for pairs of countries from the southern and northern hemispheres.


Table 7. Number of parameters to estimate the genetic covariance matrix for milk yield trait, minus twice the log likelihood (–2logL) obtained for the structural models SM(dXY)k9 and SM(dXY2)k9 for 9 countries (Australia, Canada, France, Germany, Italy, New Zealand, The Netherlands, United Kingdom, and United States) with models taking from 2 to 8 axes (k) into account, in comparison with the classical model (CM9).
 

Table 8. Number of parameters to estimate the genetic covariance matrix for milk yield trait, minus twice the log likelihood (–2logL), Akaike’s and Schwarz’s Bayesian information criteria, and average, maximum, and distribution of deviations of genetic correlations for three SM(dXY)k9 and two SM(dXY2)k9 with models taking from 2 to 4 axes (k) and from 6 to 7 axes, respectively, into account for 9 countries (Australia, Canada, France, Germany, Italy, New Zealand, The Netherlands, United Kingdom, and United States) in comparison with the classical model (CM9)
 
For SM(dXY2), the results with the 9 well-connected countries (Table 7) were only reasonable, in terms of likelihood, for models having at least 6 axes. For the other situations, the likelihoods obtained for SM(dXY2) were lower than for SM(dXY) with the same dimension; for instance, –2logL was higher for SM(dXY2)49 than for SM(dXY)49 (+175.7). The likelihood and genetic correlations obtained for SM(dXY2)89 were the same as for CM9. In contrast with what Minéry (2003) obtained with SM(dXY), it seems that the triangular restriction is less severe for SM(dXY2) than for SM(dXY). The most interesting model appeared to be SM(dXY2)79. Genetic correlations estimated by SM(dXY2)79 were not very different from those estimated with CM9 (Table 8); only 2 deviated by more than 0.030 in absolute value (0.042 for Germany–New Zealand and 0.045 for Canada–New Zealand). Although the reduction of the number of parameters to estimate is negligible in that case, it should be remembered that the final goal is to use these 8 axis countries to position the 14 other countries in the space they span. Therefore, the overall reduction of the number of parameters is substantial.

Complete Data.
Genetic correlations among all countries estimated with SM(dXY2)7m were on average closer to the CM correlations than those estimated with SM(dXY)2m (Figure 2A). With SM(dXY2)7m, 76.7% of the correlations deviated by less than 0.030 from the CM estimates, whereas this proportion was only 47.8% for SM(dXY)2m. Both models presented similar extreme deviations: –0.290 for SM(dXY)2m and –0.241 for SM(dXY2)7m. These occurred for pairs of countries with weak links to each other: Poland–Switzerland (30 common bulls) and Hungary–Israel (17 common bulls), respectively.


Figure 2. Percentage of deviations for milk yield trait between genetic correlations computed based on coordinates estimated for structural model SM(dXY)2m (i.e., model taking 2 axes into account; gray bars) and SM(dXY2)7m (i.e., model taking 7 axes into account; black bars) compared with the corresponding correlations estimated with classical model (CM) for 22 countries (panel A). A distinction was made between correlations estimated among axis countries or between axis and nonaxis countries (panel B) and correlations among nonaxis countries computed from coordinates estimated in different runs (panel C).

 
When a distinction was made between correlations estimated among axis countries or between axis and nonaxis countries and correlations computed from coordinates estimated in different runs, the pattern of deviations differed from that described above (Figure 2B and 2C). For correlations estimated among axis countries or between axis and nonaxis countries, 93.1% deviated by less than 0.030 from the CM correlations for SM(dXY2)7m, compared with only 66.7% for SM(dXY)2m. However, correlations among nonaxis countries computed from the coordinates estimated in different runs showed large deviations for both structural models. The average absolute deviations were 0.050 and 0.055 for SM(dXY2)7m and SM(dXY)2m, respectively, and 60.0% and 66.9% of correlations deviated by more than 0.030 from CM estimates.
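
The deviation statistics reported above (average absolute deviation, extreme deviation, and share of correlations within 0.030 of the CM estimate) can be computed as in the following minimal sketch. The function name `deviation_summary` and the two 4-country matrices are hypothetical illustrations; they are not values from this study.

```python
def deviation_summary(sm, cm, threshold=0.030):
    """Summarise SM-minus-CM deviations over the upper triangle of two
    correlation matrices given as nested lists (hypothetical helper)."""
    n = len(cm)
    devs = [sm[i][j] - cm[i][j] for i in range(n) for j in range(i + 1, n)]
    abs_devs = [abs(d) for d in devs]
    within = sum(1 for a in abs_devs if a < threshold)
    return {
        "n_pairs": len(devs),
        "mean_abs": sum(abs_devs) / len(abs_devs),
        "extreme": max(devs, key=abs),            # signed deviation of largest size
        "pct_within": 100.0 * within / len(devs),
    }

# Toy matrices for 4 countries (made-up numbers, for illustration only).
cm = [[1.00, 0.90, 0.85, 0.80],
      [0.90, 1.00, 0.88, 0.75],
      [0.85, 0.88, 1.00, 0.70],
      [0.80, 0.75, 0.70, 1.00]]
sm = [[1.00, 0.91, 0.84, 0.70],
      [0.91, 1.00, 0.88, 0.77],
      [0.84, 0.88, 1.00, 0.75],
      [0.70, 0.77, 0.75, 1.00]]

summary = deviation_summary(sm, cm)
print(summary)
```

Restricting the pairs `(i, j)` to a subset (for example, pairs of nonaxis countries only) would give the breakdown by panel used in Figure 2.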

Foot Angle Field Data.
The 8 well-connected countries selected were Australia, Canada, France, Germany, Italy, The Netherlands, United Kingdom, and United States. As for milk yield, both structural models were compared using a model including a limited number of axes (3) and a model including a large number of axes (7). Model SM(dXY2)78 had the same number of parameters to estimate as CM8, but it did not give values of –2logL, AIC, and BIC close to those of CM8 (+52.4). It was also noticed that the likelihood of SM(dXY2)78 was even lower than that of SM(dXY)38, indicating numerical problems in finding the global maximum. For both structural models, it was found that different sets of starting values could lead to very different likelihood values, corresponding to different local maxima. For instance, with 4 sets of starting values, we obtained 4 different values of –2logL at convergence with SM(dXY)45 (ranging from +29.6 to +74.9 relative to –2logL for the classical model CM5).
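
The dependence on starting values can be illustrated with a toy example. The sketch below is not the estimation procedure used in this study: `objective` is a made-up one-dimensional function with two local minima standing in for –2logL, and `local_search` is a plain fixed-step gradient descent chosen only for simplicity. It shows how running a local optimizer from several starting values and comparing the converged values reveals distinct local optima.

```python
def objective(x):
    # Toy stand-in for -2logL with two local minima (not a real likelihood).
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    # Analytical derivative of the toy objective.
    return 4.0 * x * (x * x - 1.0) + 0.3

def local_search(x0, step=0.01, n_iter=5000):
    """Fixed-step gradient descent from starting value x0."""
    x = x0
    for _ in range(n_iter):
        x -= step * gradient(x)
    return x, objective(x)

# Four hypothetical sets of starting values.
starts = [-2.0, -0.5, 0.5, 2.0]
runs = sorted((local_search(x0) for x0 in starts), key=lambda r: r[1])
best_x, best_value = runs[0]

for x, value in runs:
    # Report each converged value relative to the best one found.
    print(f"x = {x:.3f}  objective = {value:.3f}  excess = {value - best_value:.3f}")
```

In this toy case, two of the starting values converge to the lower minimum and two to the higher one, so the converged values differ although every run has satisfied its convergence criterion.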


    DISCUSSION
An unstructured model (CM) was used as the reference to assess the quality of genetic correlation estimates obtained with two structural models. However, CM estimates are not the true values, nor are they necessarily the most accurate ones. Unstructured models also have limitations. The number of parameters to estimate is large, so the precision of each estimated parameter is lower on average than with a parsimonious model such as the two structural models used here. Moreover, because of memory constraints, subsets of countries must be created to estimate CM genetic correlations when the number of countries is large. This is why estimation of genetic correlations using an unstructured model is a time-consuming step in international genetic evaluation. The structural models proposed here seem attractive because they reduce the number of runs when dealing with a large number of countries. For instance, in the application to 22 countries, the 231 genetic correlations estimated for a classical model can be derived from the 126 coordinates estimated with a structural model defined with 7 axes [SM(dXY2)7m] and from only 41 coordinates with a structural model defined with 2 axes [SM(dXY)2m].
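
The parameter counts quoted above can be checked with a short calculation. The sketch below assumes a particular counting convention that is not spelled out in this section: with k axes there are k + 1 axis countries, the first fixed at the origin and the i-th having i – 1 free coordinates, and every nonaxis country has k coordinates. This assumption reproduces the figures given in the text; the function names are hypothetical.

```python
def n_correlations(n_countries):
    """Number of distinct genetic correlations in an unstructured model."""
    return n_countries * (n_countries - 1) // 2

def n_coordinates(n_countries, k):
    """Number of coordinates in a structural model with k axes,
    under the counting convention assumed in the lead-in."""
    n_axis = k + 1
    if n_countries < n_axis:
        raise ValueError("need at least k + 1 countries to define k axes")
    axis_params = k * (k + 1) // 2            # 0 + 1 + ... + k
    nonaxis_params = (n_countries - n_axis) * k
    return axis_params + nonaxis_params

print(n_correlations(22))     # 231
print(n_coordinates(22, 7))   # 126
print(n_coordinates(22, 2))   # 41
```

Under the same convention, a model with k = n – 1 axes for n countries has exactly as many coordinates as there are correlations, which agrees with SM(dXY2)89 and SM(dXY2)78 having the same number of parameters as CM9 and CM8, respectively.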

The agreement between SM and CM correlation estimates was rather disappointing. The use of SM(dXY2)7m or SM(dXY)2m led to deviations with respect to CM of more than 0.030 for about 23% and slightly more than 50% of the correlations, respectively. A more detailed examination showed that between 70 and 80% of these large deviations (larger than 0.030) were for correlations computed among nonaxis countries. The accuracy of the calculated genetic correlations among nonaxis countries was much lower than that of the correlations estimated among axis countries or between axis and nonaxis countries, and was lower than expected based on a previous study (Minéry et al., 2003).

According to the analysis of simulated data, the structural model allows the estimation of genetic correlations that are very close to the border of the parameter space, which is not always possible with an unstructured model. The structural model intrinsically involves a restriction on correlations that is not included in the unstructured model. This study only compares the results obtained from structural models with those of the methods used by Interbull (AI-REML or expectation maximization-REML algorithm). Very high genetic correlations (e.g., larger than 0.95) are common in international evaluations, especially for traits that are very similar across countries such as milk, fat, and protein yield. Unfortunately, the structural models used in this study did not perform well for moderately correlated traits such as foot angle, whose definition varies considerably across countries (e.g., in Switzerland, another trait, heel depth, is used as a measure for foot angle). Low correlations lead to country coordinates that are further apart, exacerbating problems of triangular inconsistency.
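
The restriction mentioned above follows from the triangle inequality on Euclidian distances, dXZ ≤ dXY + dYZ. The sketch below works out the smallest correlation between countries X and Z that each link function permits, given the correlations of X and Z with a third country Y. The two link functions are those defined for SM(dXY) and SM(dXY2); the function names and the example values (0.9 and 0.6) are illustrative assumptions.

```python
import math

def lower_bound_exp(r_xy, r_yz):
    """SM(dXY): r = exp(-d). Since d_XZ <= d_XY + d_YZ,
    r_XZ >= exp(-(d_XY + d_YZ)) = r_XY * r_YZ."""
    return r_xy * r_yz

def lower_bound_exp_sq(r_xy, r_yz):
    """SM(dXY2): r = exp(-d^2), so d = sqrt(-ln r) and
    r_XZ >= exp(-(d_XY + d_YZ)^2). Requires 0 < r <= 1."""
    d_xy = math.sqrt(-math.log(r_xy))
    d_yz = math.sqrt(-math.log(r_yz))
    return math.exp(-(d_xy + d_yz) ** 2)

for r in (0.9, 0.6):
    # Smallest r_XZ allowed when both other correlations equal r.
    print(r, lower_bound_exp(r, r), lower_bound_exp_sq(r, r))
```

With both correlations equal to r, the bound is r² for SM(dXY) and r⁴ for SM(dXY2), so the squared-distance link always leaves a wider admissible range. This is consistent with the observation that the triangular restriction is less severe for SM(dXY2).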

The accuracy of genetic correlation estimates largely depended on the dimension of the parameter space defined by the axis countries. For milk yield field data, genetic correlations between countries were estimated quite accurately with a structural model SM(dXY) in examples involving a low number of axis countries. However, when a larger space was considered, including more axis countries to add more flexibility to the model, serious convergence problems appeared and correlations estimated with SM(dXY) deviated substantially from correlations estimated with the classical model. At least some of these problems are related to the existence of local maxima, due to strong geometrical constraints or flat likelihood profiles. Such problems also exist with SM(dXY2), even though its geometrical constraints are weaker. These problems indicate that the triangular inconsistency should not be overlooked. The constraints affect the maximization procedure, with different maxima reached depending on starting values. In contrast, for SM(dXY2), the estimates were mostly accurate only with models involving a large number of axis countries. In this case, the reduction of the number of parameters to estimate is not as large as hoped (Minéry et al., 2003).

The choice of axis countries is another important issue for our structural models. The axis countries should represent most of the production systems that exist in the participating countries. The choice of axis countries was based on maximization of the volume of space defined by the coordinates of the axis countries. However, in view of the disappointing results obtained for some countries (e.g., Czech Republic, Estonia, Poland), it seems that some of the 22 countries were not correctly represented in the space defined by the 3 [for SM(dXY)2m] or 8 [for SM(dXY2)7m] well-connected countries that were chosen as axis countries. The motivation for working with a preliminary subset of 9 well-connected countries was to provide, through these countries, indirect links to the other countries, but it seems that this is not sufficient. Therefore, a compromise should be found between the number of links between countries and their representativeness.

One of the expected advantages of our structural models was that only the coordinates of a country in the space defined by the axis countries are needed to compute its correlations with all the other countries. This is an attractive property when a new country wants to join the international evaluation or when one of the other countries changes something in its genetic evaluation, because structural models avoid a time-consuming reestimation. From this perspective, structural models with a large number of axis countries (e.g., 8 countries), although preferred for their higher accuracy, are not appealing from a practical point of view: the probability that one of these countries modifies its evaluation is high, in which case the space itself is modified and all coordinates need to be reestimated.

The use of Euclidian distance to define our structural models imposes obvious restrictions because only positive correlations can be estimated. Alternative link functions exist to encompass the correlation range from –1 to 1 [e.g., Formula 4 or rGXY = 2exp(–dXY) – 1].

The geometrical restrictions imposed by the use of the Euclidian distance are strong. It can be shown that a similar structural model could be implemented to completely remove these constraints by enlarging the space to complex numbers. Unfortunately, in that case, the genetic correlation matrix is no longer guaranteed to be positive definite. However, as the number of countries participating in international evaluation is likely to keep growing, it seems obvious that unstructured correlation matrices are far from optimal. There are other structural models, such as models based on principal components or factor analysis, for which promising results have been obtained (Leclerc et al., 2005).


    CONCLUSIONS
The structural model SM(dXY2) seems, as expected, to be less affected by geometrical restrictions than SM(dXY), but such restrictions still exist. The model SM(dXY2) was able to explain genetic correlations between countries better than SM(dXY) but required a larger number of axes to obtain estimates of correlations close to those obtained for a classical model. The benefits in terms of reduction of the number of parameters were thus lower than expected, and numerical problems in finding a global maximum were encountered.

It is concluded that the structural models envisioned here are mainly of interest for dealing with cases where correlation estimates are near the border of the parameter space, or for obtaining reasonable genetic correlations for countries with limited links with most of the others.

Received for publication June 28, 2005. Accepted for publication November 23, 2005.


    REFERENCES


Akaike, H. 1974. A new look at the statistical model identification. IEEE Trans. Automatic Control 19:716–723.

Canavesi, F., D. Boichard, V. Ducrocq, N. Gengler, G. De Jong, and Z. Liu. 2002. An alternative procedure for international evaluations: PROduction Traits European Joint Evaluation (PROTEJE). Proc. 7th World Congr. Genet. Appl. Livest. Prod., Montpellier, France. Communication 01–59.

Delaunay, I., V. Ducrocq, and D. Boichard. 2002. A structural model for the matrix of genetic correlations between countries in international evaluation. Proc. 7th World Congr. Genet. Appl. Livest. Prod., Montpellier, France. Communication 01–14.

Dennis, J. E., and R. B. Schnabel. 1983. Numerical methods for unconstrained optimization and nonlinear equations. Prentice-Hall, Englewood Cliffs, NJ.

Druet, T., F. Jaffrézic, D. Boichard, and V. Ducrocq. 2003a. Modeling lactation curves and estimation of genetic parameters for first lactation test-day records of French Holstein cows. J. Dairy Sci. 86:2480–2490.

Druet, T., F. Jaffrézic, and V. Ducrocq. 2003b. Estimation of genetic parameters of test day records for milk yield for the three first lactations of French Holstein cows. Proc. 54th Annu. Mtg. Eur. Assoc. Anim. Prod., Roma, Italy. Communication G5–08.

Fikse, W. F., and G. Banos. 2001. Weighting factors of sire daughter information in international genetic evaluations. J. Dairy Sci. 84:1759–1767.

Gilmour, A. R., R. Thompson, and B. R. Cullis. 1995. Average Information REML: An efficient algorithm for variance parameters estimation in linear mixed models. Biometrics 51:1440–1450.

Gilmour, A. R., R. Thompson, B. R. Cullis, and S. J. Welham. 2002. ASREML estimates variance matrices from multivariate data using the animal model. Proc. 7th World Congr. Genet. Appl. Livest. Prod., Montpellier, France. Communication 28–05.

Interbull. 2005a. Genetic evaluations–Production–August 2003–Appendix I – Holstein Milk. http://www-interbull.slu.se/eval/framesida-prod.htm Accessed Jan. 18, 2005.

Interbull. 2005b. Genetic evaluations–Production–May 2002. http://www-interbull.slu.se/eval/framesida-prod.htm Accessed Jan. 18, 2005.

Interbull. 2005c. Genetic evaluations–Conformation–November 2003–Appendix IV – Foot Angle. http://www-interbull.slu.se/conform/framesida-conf.htm Accessed Jan. 18, 2005.

Jairath, L., J. C. M. Dekkers, L. R. Schaeffer, Z. Liu, E. B. Burnside, and B. Kolstad. 1998. Genetic evaluation for herd life in Canada. J. Dairy Sci. 81:550–562.

Jensen, J., E. A. Mäntysaari, P. Madsen, and R. Thompson. 1996. Residual maximum likelihood estimation of (co)variance components in multivariate mixed linear models using average information. J. Ind. Soc. Agric. Stat. 49:215–236.

Johnson, D. L., and R. Thompson. 1995. Restricted maximum likelihood estimation of variance components for univariate animal models using sparse matrix techniques and an average information. J. Dairy Sci. 78:449–456.

Leclerc, H., W. F. Fikse, and V. Ducrocq. 2005. Principal components and approximate factorial approaches for estimating genetic correlations among countries in International dairy sire evaluation. J. Dairy Sci. 88:3306–3315.

Minéry, S. 2003. Application of a structural model to estimate genetic correlations between countries. M.Sc. Thesis, Inst. Nat. Agron. Paris-Grignon, Paris, France.

Minéry, S., W. F. Fikse, and V. Ducrocq. 2003. Application of a structural model to estimate genetic correlations between countries. Interbull Bull. 31:175–179.

Rekaya, R., K. A. Weigel, and D. Gianola. 2001. Application of a structural model for genetic covariances in international dairy sire evaluations. J. Dairy Sci. 84:1525–1530.

Schaeffer, L. R. 1994. Multiple-country comparison of dairy sires. J. Dairy Sci. 77:2671–2678.

Schwarz, G. 1978. Estimating the dimension of a model. Ann. Stat. 6:461–464.


