1 Department of Animal Science, and
2 Department of Statistics, University of Nebraska, Lincoln 68583-0908
3 Roman L. Hruska U.S. Meat Animal Research Center, Agricultural Research Service, USDA, Lincoln, NE 68583-0908
Corresponding author: Rami M. Sawalha; e-mail: Rami.Sawalha{at}sac.ac.uk.
| ABSTRACT |
|---|
Key Words: milk yield, test day, autoregressive covariance
Abbreviation key: AIC = Akaike information criterion, AR(1) = first-order autoregressive, CS = compound symmetry, TD = test day.
| INTRODUCTION |
|---|
Test-day records may be viewed as repeated measures of a single trait within a lactation. A main issue with such a model is to account for the covariance structure of the repeated records. The simple repeatability animal model assumes independent residual effects and constant environmental and genetic correlations among different TD records, an extreme assumption that may not be realistic (Henderson, 1984). Correlated records are expected to be less informative than independent ones. Ignoring such correlations when they are present may result in biased estimates of other variance component parameters. Alternatively, monthly TD records can be viewed as different but correlated traits. No predetermined structure or pattern is assumed with this approach for variances or covariances of different TD records.
The structure to account for (co)variation among random correlated effects of repeated TD records should have fewer parameters than the multiple trait approach to avoid over-parameterization. Most importantly, the proposed structure should effectively model the relationships among the effects in the model. The first-order autoregressive process, AR(1), may satisfy these conditions. The AR(1) covariance structure has only one more parameter than the simple repeatability model, which has the compound symmetry (CS) covariance structure, and allows for nonconstant covariances (Wade and Quaas, 1993).
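The contrast between the CS and AR(1) structures can be illustrated with a minimal sketch. The function names and the correlation value of 0.5 are assumptions chosen for illustration and are not taken from the study.

```python
def cs_corr(t, r):
    """Compound symmetry: every pair of distinct records has the same correlation r."""
    return [[1.0 if i == j else r for j in range(t)] for i in range(t)]


def ar1_corr(t, rho):
    """AR(1): correlation is rho ** |i - j|, so it decays with the number of intervals."""
    return [[rho ** abs(i - j) for j in range(t)] for i in range(t)]


# Illustrative (hypothetical) correlation for 4 test-day records
cs = cs_corr(4, 0.5)
ar = ar1_corr(4, 0.5)

for row_cs, row_ar in zip(cs, ar):
    print(row_cs, row_ar)
```

Both structures are described by a single correlation parameter in addition to the variance, but only the AR(1) matrix lets the correlation change with the distance between records.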
Harville (1979) proposed the use of an autoregressive process to model covariance structures for random effects of repeated measures in animal breeding. Similarly, Quaas (1984) suggested an AR(1) process to model the residual covariance structure when animals have repeated measures. Kachman and Everett (1989) used the AR(1) structure to model environmental covariances among TD records within a lactation. A correlation coefficient of 0.6 was used to model nonconstant covariances among TD residual effects. Similarly, Vasconcelos et al. (2004) used the AR(1) structure to predict TD records of uncompleted lactations.
Carvalheira et al. (1998) compared expected genetic gain and estimates of genetic parameters with models assuming or not assuming an AR(1) covariance structure among permanent environmental effects within a lactation. They reported that the AR(1) covariance structure for permanent environmental effects within lactation was effective for partitioning total variance and removing noise that would otherwise be confounded with genetic effects. In a subsequent study, Carvalheira et al. (2002a) used a similar model and estimated variance components for milk yield of Holstein, Bruna, and Modicana breeds. The estimates of variance components were different for the different breeds.
Recently, random regression models have been widely studied and evaluated for genetic evaluation at the national level in many countries. Random regression models have the advantage of flexibility to account for the environmental and genetic components of the shape of the lactation curve. However, random regression models require the estimation of a large number of parameters and may not be adequate at early or late stages of the lactation. Kettunen et al. (1998) reported negative values for the genetic correlation between early and late TD records within lactation. Meuwissen and Pool (2001) reported similar accuracy of predicting missing records with autoregression and random regression models. However, random regression models required 4 times more dispersion parameters to be estimated compared with the autoregression models.
An AR(1) covariance structure for the residual effects of TD records has not been evaluated before in animal models. Additionally, estimates of genetic parameters with an AR(1) model need to be compared with those from the currently used 305-d model and from simple repeatability models for TD records.
The objective of this study was to compare 2 differently defined autoregressive covariance structures among TD environmental effects. Estimates of variance components for milk, fat and protein yields, and SCS from models with the AR(1) covariance structures were compared with those from the simple repeatability model and with those from a standard 305-d lactation model.
| MATERIALS AND METHODS |
|---|
Test-day intervals were set to 30 d from 6 to 305 d in milk. Records were assigned to TD based on the DIM when they were recorded rather than their ordinal sequence. Only lactations with twice-daily milking were included in the final data set. Lactation records were required to have at least 5 TD records. To eliminate outliers, milk, fat and protein yields, and SCS greater than 3 standard deviations from their unadjusted means were deleted.
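The editing rules above can be sketched as follows. This is a minimal illustration only: the exact interval boundaries (days 6 to 35 as TD 1, and so on up to days 276 to 305 as TD 10), the use of the sample standard deviation, and the function names are assumptions, not details confirmed by the study.

```python
from statistics import mean, stdev


def td_interval(dim, first_day=6, last_day=305, width=30):
    """Map days in milk (DIM) to a test-day interval number (1 to 10).

    Returns None for records outside the 6 to 305 d window.
    The boundaries are an assumed reading of the 30-d interval rule.
    """
    if dim < first_day or dim > last_day:
        return None
    return (dim - first_day) // width + 1


def drop_outliers(values, k=3.0):
    """Keep values within k standard deviations of the unadjusted mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= k * s]


print(td_interval(6), td_interval(36), td_interval(305))
```

Assigning records by DIM rather than by ordinal sequence means that a cow with a missed test keeps a gap at that interval instead of shifting later records forward.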
Each sire was required to have at least 10 daughters with a first-lactation record. All herds were required to have at least 15 cows. Lactation records were categorized as treated or not with bST. A minimum of 4 TD records treated with bST was required for any lactation record to be classified as a bST-influenced record. Records with partial bST treatments or those that did not agree with these guidelines were deleted.
The final data set included 12,071 first-lactation records of Holstein cows calving from 1996 through 2001. About 9% of the cows were classified as bST treated. More details about the data with numbers of available TD and 305-d records for each trait are in Table 1.
The equation for the linear mixed model in matrix notation for the first 3 models was:

$$y = X\beta + Z_1 a + Z_2\,pe + e$$

where $y$ is the vector of TD observations of a trait, $\beta$ is the vector of fixed effects, $a$ is the vector of random animal additive genetic effects, $pe$ is the vector of random cow permanent environmental effects, $e$ is the vector of random TD residual effects, and $X$, $Z_1$, and $Z_2$ are incidence matrices relating TD observations to fixed, random animal additive genetic, and random cow permanent environmental effects, respectively.
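The first moment for all 3 TD models was assumed to be $E[y] = X\beta$. The second moments about the means were assumed to be:

$$\text{CS model: } \mathrm{Var}(a) = A_N\sigma^2_a, \quad \mathrm{Var}(pe) = I_n\sigma^2_{pe}, \quad \mathrm{Var}(e) = I_n \otimes I_t\,\sigma^2_e$$

$$\text{ARpe model: } \mathrm{Var}(a) = A_N\sigma^2_a, \quad \mathrm{Var}(pe) = I_n \otimes AR_{pe_t}\,\sigma^2_{pe}, \quad \mathrm{Var}(e) = I_n \otimes I_t\,\sigma^2_e$$

$$\text{ARe model: } \mathrm{Var}(a) = A_N\sigma^2_a, \quad \mathrm{Var}(pe) = I_n\sigma^2_{pe}, \quad \mathrm{Var}(e) = I_n \otimes AR_{e_t}\,\sigma^2_e$$

where $A_N$ is the numerator relationship matrix of order $N$ (the number of animals), $I_n$ is an identity matrix of order $n$ (the number of cows with records), $I_t$ is an identity matrix with variable order $t$ (with $t$ the number of TD records of a cow, possibly as many as 10), $AR_{pe_t}$ and $AR_{e_t}$ are first-order autoregressive correlation matrices of order $t$ among TD permanent and residual environmental effects, respectively, $\sigma^2_a$, $\sigma^2_{pe}$, and $\sigma^2_e$ are variances of additive genetic, permanent environmental, and residual effects, respectively, and $\otimes$ is the direct product operator.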
The 3 TD models also included common fixed effects of herd test date (HTD), bST treatment, and age at calving in 2-mo intervals from 22 to 38 mo. Effect of DIM as (DIM/30.5) within TD interval was included as a quadratic polynomial to adjust for the shape of the lactation curve. Fixed effects for the 305-d model were: herd-year-season (HYS), bST treatment, and age at calving.
Another TD model (US model) with an unstructured covariance was applied to the data to evaluate the environmental covariances and variances among different monthly TD records with variable numbers of intervals. Similar fixed effects and first-moment model assumptions of the other TD models were also assumed for the unstructured covariance model. However, the unstructured covariance model assumes no particular pattern of covariances or variances for overall environmental effects of TD records. The US model can lead to computational difficulties because it includes too many parameters to be estimated.
Data were analyzed with a single-trait animal model using the ASREML program, release 1.0 (Gilmour et al., 2002). This statistical package uses an average information algorithm and sparse matrix methods. Convergence was assumed when both the log-likelihood and estimated parameters did not change for at least 3 consecutive restarts.
The CS model can be considered nested within, or a reduced form of, the ARpe and ARe models. The CS, ARpe, and ARe models share the same fixed effects. The ARpe and ARe models had the same variance components as the CS model except for one additional variance component. Therefore, the log-likelihood of the CS model may be quantitatively compared with those of the ARpe and ARe models with the likelihood ratio test with 1 degree of freedom. The log-likelihood ratio statistic is calculated as twice the difference between the log-likelihood values of the complete and reduced models, which can be compared with the critical $\chi^2$ value for the desired probability level and the appropriate degrees of freedom, based on the difference in the number of estimated variance component parameters between the models under consideration (McCulloch and Searle, 2000).
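The likelihood ratio test described above can be sketched in a few lines. The log-likelihood values below are hypothetical, and the tail probability uses the identity that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(x / 2)).

```python
import math


def lrt_statistic(loglik_full, loglik_reduced):
    """Twice the difference between the log-likelihoods of the complete and reduced models."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical log-likelihoods for a reduced (CS) and a complete (AR) model
loglik_cs = -11079.0
loglik_ar = -10000.0

stat = lrt_statistic(loglik_ar, loglik_cs)
p_value = chi2_sf_1df(stat)
print(stat, p_value)
```

With 1 degree of freedom, a statistic above 10.83 corresponds to P < 0.001, which `chi2_sf_1df(10.83)` reproduces to within rounding.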
Estimates of 305-d genetic variances were also obtained using TD estimates with CS, ARpe, and ARe models as $305^2 \times \sigma^2_a$ for yield traits, where $\sigma^2_a$ is the TD additive genetic variance. For yield traits, estimates of 305-d phenotypic variance were derived using estimates from the simple repeatability model with CS covariance structure as $305^2 \times \sigma^2_a + 305^2 \times \sigma^2_{pe} + 30.5^2 \times 10 \times \sigma^2_e$, where
$\sigma^2_{pe}$ and $\sigma^2_e$
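are the permanent environmental and residual variances, respectively. Using estimates from the ARpe and ARe models, estimates of 305-d phenotypic variance for yield traits were derived using the formulas:

$$305^2 \times \sigma^2_a + 30.5^2 \times \left[10 + 2\sum_{k=1}^{9}(10-k)\rho^k\right] \times \sigma^2_{pe} + 30.5^2 \times 10 \times \sigma^2_e$$

and

$$305^2 \times \sigma^2_a + 305^2 \times \sigma^2_{pe} + 30.5^2 \times \left[10 + 2\sum_{k=1}^{9}(10-k)\rho^k\right] \times \sigma^2_e,$$

respectively, where $\rho$ is the AR(1) correlation coefficient. For SCS, similar approaches were used to obtain 305-d estimates of variance components using estimates with TD models. However, the procedure differed because 305-d SCS records are derived by averaging the available TD records rather than accumulating them as with yield traits. For example, the estimate of the phenotypic variance of the 305-d SCS as calculated from TD estimates of variance components with the ARe model is:

$$\sigma^2_a + \sigma^2_{pe} + \frac{1}{100}\left[10 + 2\sum_{k=1}^{9}(10-k)\rho^k\right] \times \sigma^2_e$$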
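The derivation of 305-d variances from TD estimates can be sketched numerically. The sketch assumes 10 equally weighted TD records of 30.5 d each and treats the variance of an accumulated AR(1) effect as the sum of all elements of its correlation matrix; the function names and the variance values are hypothetical.

```python
def ar1_sum_factor(t, rho):
    """Sum of all elements of a t x t AR(1) correlation matrix."""
    return t + 2.0 * sum((t - k) * rho ** k for k in range(1, t))


def var305_yield(va, vpe, ve, model, rho=0.0, t=10, days=30.5):
    """305-d genetic and phenotypic variances for a yield trait from TD components.

    model is "CS", "ARpe", or "ARe"; rho is the AR(1) correlation of that model.
    Effects common to all TD records contribute t * t times their variance,
    independent effects contribute t times, and AR(1) effects fall in between.
    """
    f_pe = ar1_sum_factor(t, rho) if model == "ARpe" else float(t * t)
    f_e = ar1_sum_factor(t, rho) if model == "ARe" else float(t)
    genetic = (t * days) ** 2 * va
    phenotypic = days ** 2 * (t * t * va + f_pe * vpe + f_e * ve)
    return genetic, phenotypic


# Hypothetical TD variance components
g_cs, p_cs = var305_yield(1.0, 1.0, 1.0, "CS")
g_are, p_are = var305_yield(1.0, 1.0, 1.0, "ARe", rho=0.3)
print(p_cs, p_are)
```

The factor equals 10 when rho is 0 (independent residuals) and 100 when rho is 1 (an effect common to all records), which recovers the two multipliers of the CS formula as limiting cases.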
| RESULTS AND DISCUSSION |
|---|
Estimates of covariances did not decline to zero regardless of the number of intervals between all combinations of TD records. Estimates of covariances seem to reach a plateau mostly after the third or fourth interval between TD records, which may suggest the presence of a common environmental variance for all TD records. Any covariance structure, to be useful, should be able to account for this effect. The patterns of change of estimates of unstructured covariances were the same for all traits: milk, fat and protein yields, and SCS.
The pattern of change of estimates of covariances with the US model agrees well with the defined characteristics of the first-order autoregressive covariance structure. The AR(1) covariance structure allows for a decrease in covariance with increasing interval and assumes the correlation to be strictly a function of number of intervals (Littell et al., 2002). Depending on the value of the correlation coefficient and the number of intervals between measures, the AR(1) covariance structure may allow for correlations to reach an approximate plateau.
Gadini (1997) used 2-trait analyses of TD records for milk, fat and protein yields, and SCS to estimate genetic and phenotypic correlations and variances among TD records but did not report estimates of environmental covariances. The environmental covariances derived from genetic and phenotypic estimates showed a pattern similar to the one in this study except for a few unusual estimates that did not follow the general pattern. The calculated estimates of environmental correlations were also very close to the estimates in this study and were in the range of 0.28 to 0.67 for milk yield, 0.24 to 0.53 for fat yield, 0.23 to 0.62 for protein yield, and 0.21 to 0.64 for SCS. Meyer et al. (1989) reported a similar pattern for environmental correlations among TD records for milk, fat and protein yields. Norman et al. (1999) reported comparable findings with smaller correlations with increased number of intervals between TD records. They also reported the smallest correlations were for pairs of TD records at early and late stages of lactation.
Except for SCS, the largest environmental variances were generally for first or last TD records. However, the largest estimates of variances were not more than 30, 40, 47, and 11% larger than the smallest estimates of variances for milk, fat, protein, and SCS records, respectively. Likewise, Ali and Schaeffer (1987) and Meyer et al. (1989), using fixed regression models, and White et al. (1999) using a random regression model reported the largest variances for milk yields at the beginning and at the end of lactation. Swalve (1995c) found similar patterns of estimates of environmental variances over the course of lactation for milk, fat and protein yields.
The estimates of covariances with the CS model did not change with the number of intervals between TD records, which resulted in smaller estimates of covariances between adjacent records and larger estimates of covariances between more widely separated TD records for milk, fat, and protein (Figures 1B and 2B).
Estimates of overall environmental correlations ($r_e$) among TD records with the CS model were obtained as the ratio of the estimate of permanent environmental variance to that of total environmental variance, $r_e = \sigma^2_{pe} / (\sigma^2_{pe} + \sigma^2_e)$, where $\sigma^2_{pe}$ and $\sigma^2_e$ are the permanent environmental and residual variances, respectively. Estimates of overall environmental correlations were 0.47, 0.30, 0.40, and 0.45 for milk, fat, protein, and SCS, respectively. These estimates are assumed to be the same among all TD records, regardless of the number of intervals among the records. Estimates of environmental variances with the CS model were generally similar to the mean of environmental variance estimates obtained with the US model. The CS model does not appear to be a close approximation for the covariances among most TD records.
The estimates of environmental covariances with the ARpe model decreased as the number of intervals between records increased but at a decreasing rate (Figures 1B and 2B). The environmental covariances between TD records did not reach a plateau, regardless of the number of intervals between any of the 10 TD records. This result may be explained by the large estimates of the autoregressive correlation coefficient ($\rho$) with the ARpe model (0.82 to 0.90).
Estimates of overall environmental correlations ($r_e$) for different TD intervals with the ARpe model can be obtained as a function of the residual and permanent environmental variances and the AR(1) correlation coefficient as $r_e = (\sigma^2_{pe} \times \rho^n_{pe}) / (\sigma^2_{pe} + \sigma^2_e)$, where $\sigma^2_{pe}$ is the permanent environmental variance, $\rho^n_{pe}$ is the autoregressive correlation coefficient for number of intervals ($n$) between the TD records, and $\sigma^2_e$ is the residual variance. Estimates of overall environmental correlations were in the ranges of 0.23 to 0.63 for milk, 0.10 to 0.41 for fat, 0.17 to 0.57 for protein, and 0.23 to 0.55 for SCS. Estimates of overall environmental variances with the ARpe model were in very close agreement with estimates obtained with the US model. Several studies have reported the use of a similar definition of the AR(1) covariance structure as the ARpe model to account for the nonconstant environmental covariances among TD records (Carvalheira et al., 1998; 2002a,b; Vasconcelos et al., 2004).
The ARe model with the AR(1) covariance structure among the residuals resulted in covariance matrices most analogous to the general covariance structures as estimated with the US model for all traits. As shown in Figures 1B and 2B, estimates of environmental covariances with the ARe model were very close to the median points for unstructured covariances for most TD intervals. Estimates of covariances among adjacent records decreased markedly at the beginning and continued to decrease, but at a much slower rate, as the number of intervals between the TD records increased, which resulted in a plateau after the third or fourth TD intervals. The estimate of the total environmental covariance between any 2 records cannot drop below the estimate of permanent environmental variance. Norman et al. (1999) found that correlations among TD records for milk yield traits and SCS were well modeled with the autoregressive correlation structure. They compared the AR(1) with other structures such as identity (I), intercept (J), and heterogeneous variances at the beginning, middle, or late stages of lactation among other structures.
Estimates of overall environmental correlations ($r_e$) with the ARe model can be obtained as $r_e = (\sigma^2_{pe} + \sigma^2_e \times \rho^n_e) / (\sigma^2_{pe} + \sigma^2_e)$, where $\sigma^2_{pe}$ is the permanent environmental variance, $\sigma^2_e$ is the residual variance, and $\rho^n_e$ is the autoregressive correlation coefficient for number of intervals ($n$) between TD records. Estimates of overall environmental correlations with the ARe model were in the range of 0.39 to 0.62 for milk, 0.24 to 0.42 for fat, 0.32 to 0.56 for protein, and 0.39 to 0.56 for SCS. The ranges of the estimates of overall environmental correlation with the ARe model were similar to the range of estimates obtained with the unstructured covariance model.
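The behavior of the overall environmental correlation under the three structured models can be compared with a short sketch. The variance components and correlation coefficients below are hypothetical values chosen only to show the shapes of the curves; the function name is likewise an assumption.

```python
def env_corr(n, vpe, ve, model, rho=0.0):
    """Overall environmental correlation between TD records n intervals apart.

    CS:   permanent environmental variance over total environmental variance.
    ARpe: the permanent environmental part decays as rho ** n.
    ARe:  the residual part decays as rho ** n on top of a constant permanent part.
    """
    total = vpe + ve
    if model == "CS":
        return vpe / total
    if model == "ARpe":
        return vpe * rho ** n / total
    if model == "ARe":
        return (vpe + ve * rho ** n) / total
    raise ValueError("model must be 'CS', 'ARpe', or 'ARe'")


# Hypothetical variance components and AR(1) coefficients
vpe, ve = 0.45, 0.55
for n in range(1, 10):
    print(n,
          round(env_corr(n, vpe, ve, "CS"), 3),
          round(env_corr(n, vpe, ve, "ARpe", rho=0.85), 3),
          round(env_corr(n, vpe, ve, "ARe", rho=0.30), 3))
```

With a small residual coefficient, the ARe correlation approaches the CS value within a few intervals and cannot fall below it, whereas the ARpe correlation keeps declining toward zero.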
Based on the above comparison of the environmental covariance structures with the different TD models, a preliminary conclusion may be drawn that the AR(1) covariance structure for the TD permanent environmental and residual effects may be a suitable approximation for modeling of environmental covariance matrices for TD milk, fat and protein yields, and SCS. More quantitative tests will be carried out to compare the performance of the suggested models.
Likelihood Ratio Tests
A more objective way to compare the models is to calculate the log-likelihoods of the models and conduct tests of significance among them whenever possible. The log-likelihood values are shown in Tables 2, 3, 4, and 5 for milk, fat and protein yields, and SCS, respectively. Based on the likelihood ratio tests, ARpe and ARe models fit the data significantly better than the simple repeatability model with the CS covariance structure (CS model) for all 4 traits. All of the calculated likelihood ratio statistics were much larger than the critical $\chi^2$ values. The smallest calculated statistic was 2158 between the ARe model and the CS model for fat yield. This value is about 199 times larger than the critical $\chi^2$ value at P = 0.001 with 1 degree of freedom, which is 10.83. The Akaike information criterion (AIC) gave similar results for the ARpe and ARe models compared with the CS model for all 4 traits. The better fit of the ARpe and ARe models to the data than the CS model may indicate that the estimates of the variance components with ARpe and ARe models are more accurate, given the data.
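The AIC comparison can be sketched with the common definition AIC = -2 log L + 2p, where p is the number of estimated variance parameters. Whether the study used exactly this form is an assumption, and the log-likelihood values and parameter counts below are hypothetical (3 parameters for the CS model and 4 for an AR(1) model, following the one-extra-parameter description in the text).

```python
def aic(loglik, n_params):
    """Akaike information criterion; smaller values indicate a better trade-off."""
    return -2.0 * loglik + 2.0 * n_params


# Hypothetical log-likelihoods; only their difference matters for the comparison
aic_cs = aic(-11079.0, 3)
aic_ar = aic(-10000.0, 4)
print(aic_cs - aic_ar)

# Ratio of the smallest reported statistic to the critical value quoted in the text
ratio = 2158 / 10.83
print(round(ratio))
```

Because the AR(1) models add only one parameter, the AIC penalty differs by 2 between models, so any likelihood ratio statistic larger than 2 favors the AR(1) model under this criterion.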
$\chi^2$ value thus has zero degrees of freedom.
Estimation of Variance Components
Estimates of variance components with all 4 models are presented in Table 2 for milk yield, in Table 3 for fat yield, in Table 4 for protein yield, and in Table 5 for SCS. Estimates of heritability were in the range of 0.08 to 0.11 for yield traits, and heritability was 0.06 for SCS with different TD models. Estimates of heritability with the 305-d model were in the range of 0.14 to 0.36 for yield traits, and heritability was 0.11 for SCS. The standard errors of all estimates of variance components and heritability were small, which indicates that the size of the data set was adequate. Furthermore, none of the estimates of parameters was fixed at boundaries or out of the parameter space.
Estimates of heritability in the literature are variable with both TD and 305-d lactation models. Estimates vary for different populations or datasets, countries, regions, periods or year, methods, and models of analysis and other factors. Estimates with a cubic spline random regression model were as small as 0.09, 0.06, 0.09, and 0.02 for TD milk, fat, protein, and SCS, respectively (DeGroot, 2003). Similarly, Tsuruta (1998) reported small estimates of heritability in the range of 0.10 to 0.26, 0.05 to 0.12, 0.09 to 0.21, and 0.03 to 0.09 for TD milk, fat, protein, and SCS, respectively. Other researchers have reported greater estimates of heritability for TD yield traits (Auran, 1976; Danell, 1982; Pander et al., 1992). Reents et al. (1994) reported estimates of heritability for SCS of 0.06 to 0.10 for TD records in the first lactation. Heritability estimates for milk yield using 305-d lactation records have been reported to be as large as 0.49 (Pander et al., 1992) and 0.39 (Swalve, 1995a) and as small as 0.24 (Visscher and Goddard, 1995).
Estimates of autoregressive correlation coefficients ($\rho$) for the residuals with the ARe model were small and in the range of 0.23 to 0.38. The estimates of the correlation coefficient for residual effects are small because the ARe model also includes permanent environmental effects that are assumed to be common for all TD records for each cow. Estimates of the autoregressive correlation coefficient with the ARpe model for the permanent environmental effects, which are expected to be highly repeatable among different TD records, were large (0.83 to 0.90). Carvalheira et al. (1998) reported estimates of the AR(1) correlation coefficients among permanent environmental effects in the range of 0.57 to 0.83 for milk yield in Holstein and Lucerna cattle. Their moderate estimates may be because their model included an additional permanent environmental effect along with the autocorrelated ones and independent residual effects.
Estimates of repeatability ($r$) with the different TD models may be obtained by the general formula $r = (\sigma^2_a + \rho^n_{pe}\sigma^2_{pe} + \rho^n_e\sigma^2_e) / (\sigma^2_a + \sigma^2_{pe} + \sigma^2_e)$, where $\rho_{pe}$ is the autoregressive correlation coefficient among TD permanent environmental effects and is 1 for all models except for the ARpe model, $\rho_e$ is the autoregressive correlation coefficient among TD residual effects and is 0 for all models except for the ARe model, $n$ is the number of intervals between TD records, $\sigma^2_a$ is the TD additive genetic variance, and $\sigma^2_{pe}$ and $\sigma^2_e$ are the TD permanent environmental and residual variances, respectively. Estimates of repeatability with the CS model were in the range of 0.36 to 0.52 for different traits. Estimates of repeatability with the ARpe model were 0.29 to 0.66 for milk yield, 0.19 to 0.48 for fat yield, 0.23 to 0.60 for protein yield, and 0.28 to 0.58 for SCS. Estimates of repeatability with the ARe model had larger lower limits and similar upper limits compared with estimates with the ARpe model. The smaller lower limits of repeatability with the ARpe model are due to the large estimates of the correlation coefficient with the ARpe model. The large correlation coefficient allows the permanent environmental covariance between TD records with the ARpe model to decrease considerably with the increase in number of intervals between records without reaching a plateau.
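The general repeatability formula can be sketched as a single function in which the CS, ARpe, and ARe models differ only in the two correlation coefficients. The variance components and coefficients used in the example are hypothetical.

```python
def repeatability(n, va, vpe, ve, rho_pe=1.0, rho_e=0.0):
    """Repeatability between TD records n intervals apart (n >= 1).

    Defaults correspond to the CS model (rho_pe = 1, rho_e = 0).
    For the ARpe model pass rho_pe < 1; for the ARe model pass rho_e > 0.
    """
    return (va + rho_pe ** n * vpe + rho_e ** n * ve) / (va + vpe + ve)


# Hypothetical TD variance components
va, vpe, ve = 0.1, 0.4, 0.5
for n in (1, 5, 9):
    print(n,
          round(repeatability(n, va, vpe, ve), 3),
          round(repeatability(n, va, vpe, ve, rho_pe=0.9), 3),
          round(repeatability(n, va, vpe, ve, rho_e=0.3), 3))
```

Under these assumed values the CS repeatability is constant, the ARe repeatability starts higher and settles at the CS value, and the ARpe repeatability keeps falling as records become more separated.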
The ARe model with the AR(1) covariance structure among the residual effects resulted in smaller estimates of heritability for all yield traits than with the CS model. Estimates of additive genetic and permanent environmental variances for yield traits were always smaller with the ARe model than with the CS model. Similarly, estimates of total environmental and phenotypic variances were slightly smaller with the ARe model than with the CS model for all traits.
Estimates of residual variances were always less with the CS model than with the ARe model. The increases in estimates of residual variance with the ARe model compared with estimates with the CS model were about 16% for milk yield, 7% for fat yield, 14% for protein yield, and 10% for SCS. The residual variance can become smaller if a nonzero correlation among repeated measures on a given subject is ignored.
Estimates of heritability for 305-d lactations were also calculated using linear functions of estimates of variance components obtained with the TD models. The estimates of 305-d heritability were larger when the estimates of TD variance components were obtained with either the CS model or ARe than when they were obtained with the ARpe model (Tables 2 through 5). This result may be due to the larger estimate of permanent environmental variance and the associated 305-d estimate of phenotypic variance with the ARpe model compared with the CS and ARe models. The 305-d heritability estimates from the CS model were slightly larger than those estimated by the ARe model. Except for fat yield and SCS, the estimates of 305-d heritability with the CS model or ARe model were larger than those obtained directly with the 305-d model.
The amounts of computational time required to obtain genetic evaluations with the different models were compared relative to the ARe model (Table 6). The ARe model requires a similar amount of computational time as the CS model (from 9.2% less to 12.9% more time). In all cases, the ARpe model required more computational time than did the ARe model (9.7 to 27.1% more time). All TD models required considerably more computational time than the 305-d model.
| CONCLUSIONS |
|---|
Received for publication December 14, 2004. Accepted for publication March 31, 2005.
| REFERENCES |
|---|