


* Department of Large Animal Sciences, Faculty of Life Sciences, University of Copenhagen, Grønnegårdsvej 8, 1870 C, Denmark
Interbull Centre, Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Box 7023, 75007 Uppsala, Sweden
Canadian Dairy Network, Guelph, Ontario, Canada N1G 4T2
Animal Improvement Programs Laboratory (AIPL), Agricultural Research Service, USDA, Beltsville, MD 20705-2350
1 Corresponding author: thm{at}life.ku.dk
| ABSTRACT |
|---|
Key Words: genetic correlation, international genetic evaluation, udder health, prior information
| INTRODUCTION |
|---|
Genetic correlations (rG) between the missing trait and available indicator traits are required to obtain predictions for missing traits, but such rG are usually not readily available. However, a missing trait may be available in a country other than the one of interest. If the missing trait is systematically recorded and evaluated in a foreign country, then rG may be predicted from multiple regression models using explanatory variables that are available both in the country where the trait is recorded and in the country of interest, where the trait is missing (Mark et al., 2006b). In some extreme cases with very weak genetic ties among available traits (e.g., Mark et al., 2005a), the problem of obtaining suitable rG for missing traits may not be much different from that of obtaining suitable rG among available traits.
The Interbull Centre applies a procedure to postprocess estimated rG, which could be applied to obtain genetic correlations for missing traits as well. The rules associated with this procedure are largely based on expert intuition. However, predicting rG with structural models similar to those of Rekaya et al. (2001) and Mark et al. (2006b) seems more desirable because it allows simultaneous consideration of several explanatory effects and is less subjective.
Examples of missing traits are milk yield in China, fertility in Australia, and clinical mastitis (CM) in the United States. The latter will be the focus of this study, but the principles can be applied in other situations as well. Clinical mastitis is only recorded and used in genetic evaluations in the Nordic countries. Clinical mastitis information from these countries as well as milk somatic cell (SC; used herein to indicate both SCS and SCC) from the United States and other countries can be used as indirect measures (i.e., indicator traits) of CM in the United States.
International genetic evaluations provide an opportunity to incorporate CM information from Nordic countries into selection decisions in countries without direct CM information (Mark et al., 2002). However, current evaluations do not facilitate optimal use of the CM information in countries without CM records. This is because the CM information is converted to SC breeding values in such countries. More of the CM information could be captured by directly relating CM in the Nordic countries, as well as SC in each country, with CM in the target country.
The aim of this study was to predict rG for a missing trait, investigate the predictive performance of a method to predict international breeding values for missing traits, and determine the sensitivity of the method to the assumed rG. This study focuses on applying the method to predict breeding values for CM in a country that has genetic evaluations for at least one correlated trait (i.e., SC).
| MATERIALS AND METHODS |
|---|
Variables Potentially Explaining Variation Among Genetic Correlations
Variables used to derive multiple regression equations for rG in this study were obtained from 3 sources, and they could be grouped into 1) climatic variables, 2) production system indicators, and 3) national genetic evaluation descriptors.
The climatic variables were available from the Danish Meteorological Institute and were measured as the average monthly value during 1931 to 1960 in the capital city (Cappelen and Jensen, 2001). These averages were based on several daily measures. The daily minimum and maximum values were each averaged for every month, and the range was calculated as the difference between the highest average maximum and the lowest average minimum monthly value. The variables considered here were country averages of temperature (°C), range in temperature (from coldest to warmest month), country averages of rainfall (mm), range in rainfall, country averages of humidity (%), range in humidity, and country averages of wind speed (Beaufort scale). Squared terms of these variables were also considered.
Production system indicators were available from the yearly enquiries of the International Committee for Animal Recording (ICAR, 2006). The most recent statistics were taken from each country. Holstein data were used when available; otherwise, statistics for all dairy breeds were used. The indicators considered were average milk yield (kg) and contents (%) of fat and protein from national milk recordings. Squared terms of these variables as well as interactions among climatic variables and production system indicators were considered.
National genetic evaluation descriptors were taken from the forms that were available on Interbull's home page (Interbull, 2004). The descriptors that were considered in this study were heritability, number of parities included, whether test-day records were considered, and whether the given trait was analyzed simultaneously with biologically different traits.
Finally, the number of common bulls (CB) and a variable explaining the effect of trait were considered. Trait was defined as a binary variable: trait = 1 if both involved traits were SC or if both involved traits were CM, and trait = 0 if one of the involved traits was SC whereas the other was CM.
Prediction of Genetic Correlations for Missing Traits
The estimated rG were used as dependent variables in multiple linear regression to obtain regression coefficients that could be used to predict rG involving missing traits. Explanatory variables in this regression were derived from the climatic variables, production system indicators, and national genetic evaluation descriptors described above.
The explanatory variables, except CB, were expressed as either ratios or binary variables. For continuous variables, a ratio was calculated so that the largest of the 2 country averages was in the denominator. Hence, 0 < ratio ≤ 1, and a high ratio always indicated that the variable in question was similar in the 2 countries. Likewise, a binary class variable was set equal to 1 if both traits belonged to the same class (e.g., both traits considered the same number of parities); otherwise, it was set equal to 0. The CB was used as is.
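The ratio and binary coding described above can be sketched in a few lines. This is only an illustration: the function names and the example values are hypothetical, and the sketch assumes strictly positive country averages (a variable such as temperature in °C would need separate handling, which the text does not describe).

```python
def similarity_ratio(a: float, b: float) -> float:
    """Ratio of 2 country averages with the larger value in the denominator.

    Assumes both averages are strictly positive, so that 0 < ratio <= 1.
    """
    if a <= 0 or b <= 0:
        raise ValueError("country averages must be strictly positive")
    return min(a, b) / max(a, b)


def same_class(a, b) -> int:
    """Binary class variable: 1 if both traits fall in the same class, else 0."""
    return 1 if a == b else 0


# Hypothetical inputs, not taken from the study.
milk_ratio = similarity_ratio(8500.0, 10200.0)  # average milk yield (kg)
parity_match = same_class(3, 1)                 # number of parities included
```

A ratio near 1 then indicates similar conditions in the 2 countries, regardless of which country has the larger value.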
All variables were constructed so that the linear regression coefficient was expected to be positive. Variables with a negative linear regression coefficient were dropped to ensure that the derived prediction formula would generalize well and be biologically meaningful when applied to missing traits. Effects with unexpected negative regression coefficients could be correlated with hidden confounders, which may take other values in the environment where the missing trait is expressed. The best model for rG was selected based on Mallows' Cp.
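The sign rule can be illustrated with ordinary least squares. The sketch below is an assumption about how such a rule might be coded, not the procedure used in the study: it refits after removing the variable with the most negative slope until all remaining slopes are nonnegative, and it does not implement the Mallows' Cp comparison. The function name and the toy data are hypothetical.

```python
import numpy as np


def fit_nonnegative_slopes(X, y, names):
    """OLS with intercept; repeatedly drop the variable with the most negative slope."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = list(range(X.shape[1]))
    while True:
        # Design matrix: intercept column plus the variables still in the model.
        design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in keep])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        slopes = coef[1:]
        if len(keep) == 0 or slopes.min() >= 0:
            return float(coef[0]), {names[j]: float(b) for j, b in zip(keep, slopes)}
        keep.pop(int(np.argmin(slopes)))


# Toy data: y = 0.6 + 0.3*x1 - 0.2*x2, so x2 is dropped by the sign rule.
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
y = [0.6, 0.9, 0.4, 0.7]
intercept, slopes = fit_nonnegative_slopes(X, y, ["x1", "x2"])
```

In this toy case x1 and x2 are uncorrelated, so the slope of x1 is unchanged after x2 is removed; with correlated variables the remaining coefficients would shift on refitting.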
Bending of Combined Genetic Correlation Matrix
The combined rG matrix for both available and missing traits was not necessarily positive definite. Therefore, the combined matrix was bent (Jorjani et al., 2004) before the prediction of breeding values for available and missing traits. In this weighted bending procedure, the diagonal elements of the rG matrix were not allowed to change, whereas the allowed changes for the genetic correlations were inversely proportional to the CB. The CB was arbitrarily set to 1,000 for correlations involving missing traits to allow only relatively small changes for the traits of main interest in this study. Changes in rG due to bending did not exceed 0.06.
Prediction of International Breeding Values for Available Traits
International breeding values for available traits were computed with a multiple-trait-multiple-country model (MT-MACE), which treats each country–trait combination (i) as a different but correlated trait (Schaeffer, 2001; Mark and Sullivan, 2006):
$$y_i = \mu_i \mathbf{1} + Z_i Q g_i + Z_i s_i + e_i \qquad [1]$$
where yi = vector of within-country univariately or multivariately deregressed national evaluations adjusted for residual correlations; µi = fixed effect of country-trait mean; gi = vector of random genetic group effects; si = vector of random sire effects; ei = vector of random residuals; Zi = matrix assigning observations to sire effects; and Q = matrix assigning sires in s to group effects in g. The (co)variance of the random variables was as follows:
[Equation image missing in source: (co)variance structure of the random effects]
where A = the additive genetic relationship matrix relating bulls with their sires and maternal grandsires; I = an identity matrix; G0 = the genetic (co)variance matrix between traits; and Ri = the (co)variance among elements of ei; it is a diagonal matrix with diagonal elements equal to $\sigma^2_{e(i)}/\mathrm{EDC}_{MT(i,k)}$ for bull k. The EDCMT are effective independent weighting factors (Sullivan and Wilton, 2001; Mark and Sullivan, 2006) and $\sigma^2_{e(i)}$ are the residual variances. The residual variances are assumed equal to $(4\sigma^2_{sire(i)}/h^2_i) - \sigma^2_{sire(i)}$, where $\sigma^2_{sire(i)}$ is the sire variance and $h^2_i$ is the heritability assumed in the national evaluations for each trait, respectively.
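As a small worked illustration of the residual variance expression and the diagonal of Ri, the following sketch computes both for made-up inputs. The function names and the numbers are hypothetical; only the two formulas come from the text.

```python
def residual_variance(sire_var: float, h2: float) -> float:
    """Residual variance implied by a sire variance and a heritability:
    (4 * sire_var / h2) - sire_var."""
    if not 0 < h2 <= 1:
        raise ValueError("heritability must lie in (0, 1]")
    return 4.0 * sire_var / h2 - sire_var


def residual_diagonal(sire_var: float, h2: float, edc):
    """Diagonal elements of R_i: residual variance divided by each bull's EDC."""
    var_e = residual_variance(sire_var, h2)
    return [var_e / w for w in edc]


# Hypothetical values: sire variance 1.0, heritability 0.25, EDC of 15 and 30.
var_e = residual_variance(1.0, 0.25)                  # 4/0.25 - 1 = 15.0
r_diag = residual_diagonal(1.0, 0.25, [15.0, 30.0])   # [1.0, 0.5]
```

Bulls with larger effective daughter contributions therefore receive smaller residual variances, that is, more weight in the evaluation.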
Prediction of International Breeding Values for Missing Traits
The vectors of MT-MACE solutions (si) for each available country-trait (i) were subsequently combined into direct breeding values (si+) for a missing trait (i+) using (Henderson, 1977):
$$s_{i+} = G_{0ii+}' \, G_{0ii}^{-1} \, s_i \qquad [2]$$
where G0ii+ = n × m matrix containing the expected genetic covariances between the m missing and n available traits, and G0ii = n × n (co)variance matrix among the available traits. This formula is a generalization of the equation derived by Klei (1995) for a situation in which a bull has daughter information in only 1 country (i = 1) for a total of 2 countries:
$$s_2 = r_g \, \frac{\sigma_{g2}}{\sigma_{g1}} \, s_1$$
where rg = the genetic correlation between the available and missing trait, and $\sigma_{g1}$ and $\sigma_{g2}$ = the genetic standard deviation for the available and missing trait, respectively. All elements in $G_{0ii}^{-1}$ and si are available when solving the MT-MACE equations, but G0ii+ needs to be specified. Note that the prediction formula is independent of reliabilities among breeding values for available traits so that the prediction does not need to be performed centrally.
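The prediction in equation [2] amounts to one linear solve per bull. The sketch below is a minimal NumPy illustration under the definitions given in the text; the function name, argument names, and all numbers are hypothetical. It also checks numerically that the 1-available-trait case reduces to the 2-country formula with rg and the two genetic standard deviations.

```python
import numpy as np


def predict_missing(G_avail, G_cross, s_avail):
    """Combine solutions for n available traits into m missing traits.

    G_avail: n x n genetic (co)variance matrix among available traits.
    G_cross: n x m genetic covariances between available and missing traits.
    s_avail: length-n vector (one bull) or n x k array (k bulls) of solutions.
    """
    G_avail = np.asarray(G_avail, dtype=float)
    G_cross = np.asarray(G_cross, dtype=float)
    s_avail = np.asarray(s_avail, dtype=float)
    # Solve G_avail x = s_avail rather than forming the inverse explicitly.
    return G_cross.T @ np.linalg.solve(G_avail, s_avail)


# Two available traits, one missing trait (made-up covariances and solutions).
s_plus = predict_missing([[1.0, 0.5], [0.5, 1.0]], [[0.6], [0.7]], [1.0, 2.0])

# One available trait: should equal r_g * (sd_missing / sd_available) * s_1.
r_g, sd1, sd2, s1 = 0.6, 2.0, 3.0, 1.5
s_two_country = predict_missing([[sd1**2]], [[r_g * sd1 * sd2]], [s1])
```

Because only G0ii, G0ii+, and the available solutions enter the calculation, the same few lines could be run by any party holding those quantities.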
Analyses and Comparisons
First, an MT-MACE analysis was conducted for all available traits. This analysis included SC from 8 countries and CM from 3 of these countries and was identical to the one presented by Mark and Sullivan (2006). The resulting breeding values from this analysis were labeled reference breeding values. Next, 3 analyses were performed to investigate the predictive performance of equation [2]. In each of these, all CM records from Denmark, Finland, or Sweden, respectively, were set to missing, whereas exactly the same (co)variance structure as in the 11-trait reference evaluation was maintained. These analyses were repeated, but using predicted rG based on prior information only. In each of these analyses, new prediction formulas for rG were created by omitting estimated rG involving the assumed missing trait. The reference breeding values were compared with the following 4 sets of breeding values from the analyses with a CM trait set missing:
Finally, 2 analyses involving 11 available traits and CM in the United States as a missing trait were conducted.
The potential loss of genetic progress ($\Delta G_{loss}$) by using an alternative selection strategy was as follows:
[Equation image missing in source] [3]
where BV = the reference CM breeding values; $\sigma_{sire}$ = the sire standard deviation in the reference evaluation; i = the ranking based on the reference breeding values; and j = the ranking based on breeding values for either direct, within-country SC or best correlated trait. All traits were standardized so that high breeding values were preferable.
Reliabilities were approximated using the information source method of Harris and Johnson (1998). Reliabilities were studied for 3 groups of bulls: 1) young bulls (i.e., bulls that were born in 1997 or later and had daughters in only one country), which were studied for both the domestic country (d) and the foreign countries (f) where the bulls had no daughters; 2) export bulls (i.e., bulls with daughters in at least 2 countries and most daughters in the given country); and 3) import bulls (i.e., bulls with daughters in the given country, but most daughters in a country other than the given country). Thus, a single bull could be labeled an export bull in only one country while being labeled an import bull in 1 to 7 other countries.
| RESULTS AND DISCUSSION |
|---|
[Equation image missing in source] [4]
The estimates of β0 (P < 0.0001), β1 (P < 0.0001), β2 (P = 0.0007), and β3 (P = 0.0024) were 0.599, 0.275, 1.22 × 10⁻⁵, and 0.0331, respectively. The estimated regression coefficients were sensitive to the omission of a clinical mastitis trait from the estimation (Table 2). However, the changes in regression coefficients partly counteracted each other, so the predicted rG were robust. That is, the difference between rG predicted from equation [4] and rG predicted by a similar equation in which the concerned trait was not used to estimate the regression coefficients was always 0.02 or less, except for the within-country rG between SC and CM in Denmark (difference = 0.07).
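To show how a prediction of this kind could be evaluated, the sketch below plugs the reported estimates into a linear predictor. The form of the predictor is an assumption made for illustration: it supposes that equation [4] is linear in the trait indicator, CB, and a parity indicator, and that β1, β2, and β3 belong to those variables in that order (the small size of β2 suggests it multiplies CB, but the pairing of β1 and β3 is a guess). The cap at 1 and the input values are likewise hypothetical.

```python
# Reported estimates; the variable each coefficient multiplies is an ASSUMPTION.
BETA0 = 0.599         # intercept
BETA_TRAIT = 0.275    # assumed: trait indicator (1 = SC-SC or CM-CM pair)
BETA_CB = 1.22e-5     # assumed: number of common bulls (CB)
BETA_PARITY = 0.0331  # assumed: indicator for equal number of parities


def predict_rg(same_trait: int, common_bulls: float, same_parity: int) -> float:
    """Predicted genetic correlation under the assumed linear form."""
    rg = (BETA0
          + BETA_TRAIT * same_trait
          + BETA_CB * common_bulls
          + BETA_PARITY * same_parity)
    # Capping at 1 is an added safeguard, not something stated in the text.
    return min(rg, 1.0)


# Hypothetical pairs of traits.
rg_same_trait = predict_rg(same_trait=1, common_bulls=500, same_parity=1)
rg_sc_vs_cm = predict_rg(same_trait=0, common_bulls=0, same_parity=0)
```

Under these assumptions, a pair of different traits with no common bulls and different parity definitions is predicted at the intercept, 0.599.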
Figure 1 illustrates that the model for rG regresses observations toward the average estimated rG within each class of trait. This means that the predicted rG involving missing traits, which are predicted based only on prior information, will vary less than estimated rG among available traits. This seems desirable because no available trait is favored more than is supported by data in generating breeding values for the missing trait.
Estimated rG that are based on relatively few CB could be severely underestimated (Sigurdsson et al., 1996; Mark et al., 2005a). Therefore, estimated rG that are based on few CB and are lower than prior expectations are regressed upward in routine international genetic evaluations performed by Interbull. A single large residual is therefore not necessarily an undesirable feature of prediction equation [4].
The average difference per trait between estimated and predicted rG using prediction equation [4] ranged between –0.038 (CM in Finland) and 0.010 (SC in the United States), suggesting that there was no systematic bias in rG for any trait (Table 3). The average bias of predicted rG was also close to zero (i.e., –0.03) for within-country rG between SC and CM.
In a previous study, Mark et al. (2006a) used the estimated rG for SC between 2 countries to determine the rG for CM between the same 2 countries. In the current study, another strategy was followed for predicting rG involving missing traits because the rG for CM should not necessarily follow the rG for SC measured in the same 2 countries. This could, for instance, be the case when one country only considers first-parity information for SC, but 3 parities for CM, which was the case for Denmark here. In addition, some definitions of SC may better describe CM than others; for example, when test-day information is used in certain ways.
The present approach tried to simultaneously utilize all the different similarities between traits (i.e., in terms of their definitions, their genetic evaluation model characteristics, and the environmental conditions in which they are expressed) that give rise to high estimated rG to predict rG for missing traits. The variables included in prediction equation [4] were therefore based on identified reasons for variation in estimated rG.
The model that was preferred in this study was rather simple because observations did not allow more detailed modeling for effects such as rG type. Only 2 types of rG were considered here: 1) across-country rG between SC and SC as well as across-country rG between CM and CM measured in different countries, and 2) within- and across-country rG between SC and CM. Each of these groups was initially split into 2 groups separating rG for SC from rG for CM and separating rG between SC and CM within and across countries. However, across-country rG between SC and SC was not significantly (P = 0.18) different from across-country rG between CM and CM, which can be explained by the fact that essentially only one reliable across-country estimate of rG between CM and CM was available (i.e., the rG between CM in Denmark and Sweden, because rG involving Finland was based on low CB). Also, the effect of a binary class variable to distinguish between pairs of traits measured in the same or different countries was not significant (P = 0.17), which is likely because the effect of CB already explains some variation due to this.
The approach taken here to predict rG has the advantage that it may be used when there are no indicator traits measured in a certain environment, provided that the environment in question does not deviate greatly from the environments in which the correlated traits were measured. The prediction formula estimated in the current study should probably not be used for environments that differ noticeably from the environments considered here. Torsell (2007) used the prediction formula of Mark and Sullivan (2006) to predict rG for milk yield in Argentina. Although that study found no difference between the average predicted rG and estimated rG, the correlation between predicted rG and estimated rG was almost zero. This illustrates that care should be taken with the extrapolation properties of equations used to predict rG for countries with deviating production circumstances.
The countries considered in this study did not vary much in terms of climate and production system indicators. This could explain why these variables were not important to include in the final model for rG. In addition, the capital city may not represent average production circumstances well. If knowledge of the distribution of cows within countries were available, climate conditions in dense cattle areas could be given more weight.
Today, international genetic evaluations for udder health also include data from warm countries such as Australia, South Africa, and Spain as well as from countries with year-round grazing such as Ireland and New Zealand. The best model to predict rG would probably include additional explanatory effects if data from these countries were considered. Including more available traits in developing the best model for rG would be beneficial as it could increase the robustness of the prediction formula to new environments and because the number of observations (i.e., estimated rG) increases nearly quadratically as a function of the number of available traits considered.
Usefulness of Breeding Values for Missing Traits
Breeding values obtained with equation [2] for assumed missing traits were closer to reference CM breeding values compared with SC breeding values for the same country and with CM breeding values for a different country (Table 5). This was especially the case for export bulls. The use of predicted rG reduced the correlation between reference breeding values and breeding values for the assumed missing trait, except for CM in Denmark.
Within-country SC is not necessarily the best alternative to direct breeding values for missing traits (Table 5). For example, the best-correlated trait was CM in Sweden when the trait of interest was CM in Denmark. Similarly, SC in Germany-Austria and CM in Denmark had the highest correlation with reference breeding values for CM in Finland and Sweden, respectively.
The choice of selection strategy for the missing trait had a noticeable effect on which bulls had the best breeding values and on the potential genetic progress that could be achieved (Table 6). The potential loss of genetic progress from selecting the bulls with the 100 highest breeding values was lower for direct CM breeding values than for breeding values for any trait other than the given one. This was the case when either estimated or predicted rG were used in the international evaluation, although the superiority of selecting for the direct trait was less clear with predicted rG.
The relatively low correlations between reference and alternative breeding values for bulls with most daughters in the given country (Table 5) also show that there was no substitute for considering data for the trait of interest in the international genetic evaluation, even though breeding values for missing traits were useful. This was especially the case when the domestic bulls were assumed competitive with the best foreign bulls. There were mostly foreign bulls in the top 100 ranking for CM in Finland and Sweden. Therefore, the potential loss of genetic progress (Table 6) was smaller for Finland and Sweden than for Denmark.
Average reliabilities increased when parity was forced to be equal in prediction equation [4] for rG (Table 7). However, reliabilities were approximated assuming that genetic parameters were known without uncertainty. Accounting for uncertainty of genetic parameters would result in lower reliabilities (Mark et al., 2005b).
| CONCLUSIONS |
|---|
Received for publication March 30, 2007. Accepted for publication June 5, 2007.
| REFERENCES |
|---|