JDS
J. Dairy Sci. 86:2480-2490
© American Dairy Science Association, 2003.

Modeling Lactation Curves and Estimation of Genetic Parameters for First Lactation Test-Day Records of French Holstein Cows

T. Druet, F. Jaffrézic, D. Boichard, and V. Ducrocq

Station de Génétique Quantitative et Appliquée, INRA, Jouy-en-Josas 78352, France


    ABSTRACT
Several functions were used to model the fixed part of the lactation curve, and genetic parameters of milk test-day records were estimated using French Holstein data. Parametric curves (Legendre polynomials, Ali-Schaeffer curve, Wilmink curve), fixed classes curves (5-d classes), and regression splines were tested. The latter were appealing because they adjusted the data well, were relatively insensitive to outliers, were flexible, and resulted in smooth curves without requiring the estimation of a large number of parameters.

Genetic parameters were estimated with an Average Information REML algorithm where the average information matrix and the first derivatives of the likelihood functions were pooled over 10 samples. This approach made it possible to handle larger data sets. The residual variance was modeled as a quadratic function of days in milk.

Quartic Legendre polynomials were used to estimate (co)variances of random effects. The estimates were within the range of most other studies. The greatest genetic variance was in the middle of the lactation while residual and permanent environmental variances mostly decreased during the lactation. The resulting heritability ranged from 0.15 to 0.40. The genetic correlation between the extreme parts of the lactation was 0.35 but genetic correlations were higher than 0.90 for a large part of the lactation. The use of the pooling approach resulted in smaller standard errors for the genetic parameters when compared to those obtained with a single sample.

Key Words: genetic parameters • lactation curve • test-day model

Abbreviation key: AIC = Akaike’s Information Criterion, BIC = Schwarz’ Bayesian Information Criterion, DCC = days carried calf, DO = days open, MSSE = mean sums of squares of residuals, TD = test-day


    INTRODUCTION
In recent years, numerous studies have investigated the topic of genetic evaluation of dairy cattle using test-day (TD) models. Advantages of the TD model over an approach using 305-d lactation yields are now widely acknowledged. France, like many other countries, plans to implement a TD model for routine genetic evaluation. However, some issues must still be investigated because there is a large variation in models among countries.

Indeed, several approaches have been used for modeling the fixed part of the lactation curve (Jensen, 2001) such as fixed classes curves using classes of DIM (Pool et al., 2000), or parametric curves such as the Ali-Schaeffer curve (Ali and Schaeffer, 1987), the Wilmink curve (Wilmink, 1987) or orthogonal polynomials (Olori et al., 1999). For the modeling of random effects, this variability has ranged from parametric curves to the use of eigenvectors to reduce the rank of the (co)variance matrices. Recently, White et al. (1999) proposed the use of natural cubic splines.

The first step in implementing a routine evaluation with TD models is to estimate variance components. A large heterogeneity of the estimated genetic parameters (Misztal et al., 2000) indicated the need to use large data sets. Unfortunately, the TD models are computationally very expensive and therefore data sets of reduced size typically have been used. Several studies relied on Gibbs sampling but the size of the data sets was still limited and chains took very long to converge (Jamrozik et al., 1998).

In addition to the large data sets required, some authors have found it important to model the heterogeneity of the residual variance across the lactation (Brotherstone et al., 2000; Jaffrezic et al., 2000). Most methods work with different classes of residual variance across the lactation but these require a large number of parameters or do not result in a smooth residual variance curve.

The objective of this study was to develop tools for the genetic evaluation of dairy cattle with TD models in France. The specific tasks were: 1) to compare spline functions to more traditional curves for the fixed part of the lactation curve, 2) to develop a strategy of variance component estimation for random regression applicable to very large data sets, 3) to develop a method for taking into account the continuous heterogeneity of the residual variance over the lactation period that does not require a large number of parameters, and 4) to estimate variance components for milk yield of first parity cows.


    MATERIALS AND METHODS
Data
The data were selected from 1,699,925 first lactation records of Holstein cows from the four French administrative regions of Bretagne. The following edits were applied to the data: calving date was required to be in the period from August 1994 to July 2000, and age at first calving had to be from 20 through 38 mo. Records with fewer than 40 d open, from animals with unknown parents, or with unknown calving age were also discarded. Only lactations with at least four known TD records were considered, and the lactation stage of TD records had to be between 5 and 305 d. Herds with an average of at least 72 TD records per year were kept. Finally, a random selection on herd number was applied to create 10 data sets, each with approximately 10,000 cows with records. Those data sets are characterized in Table 1.
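The record-level edits listed above can be read as a simple filter. The following is a minimal Python sketch only: the dictionary layout and the field names (`age_at_calving_mo`, `days_open`, `sire_known`, `dam_known`, `test_day_dims`) are hypothetical and are not taken from the original data files. It also assumes that "unknown parents" means either parent missing, and it leaves out the calving-date and herd-size edits.

```python
def keep_lactation(rec):
    """Return True if a first-lactation record passes the edits sketched here."""
    age = rec.get("age_at_calving_mo")
    if age is None or not (20 <= age <= 38):
        return False  # unknown or out-of-range age at first calving
    if not (rec["sire_known"] and rec["dam_known"]):
        return False  # assumption: both parents must be known
    days_open = rec.get("days_open")
    if days_open is not None and days_open < 40:
        return False  # fewer than 40 d open
    # Keep only test days between 5 and 305 DIM, then require at least four.
    dims = [d for d in rec["test_day_dims"] if 5 <= d <= 305]
    return len(dims) >= 4


# Hypothetical example record
example = {
    "age_at_calving_mo": 26,
    "days_open": 85,
    "sire_known": True,
    "dam_known": True,
    "test_day_dims": [12, 45, 78, 110, 320],
}
kept = keep_lactation(example)
```

In this example the test day at 320 DIM is ignored, and the lactation is kept because four test days remain within 5 to 305 DIM.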


Table 1. Numbers of test-day records, animals with records, animals in pedigree files and herd-test-date effects for the 10 analyzed samples.
 
Lactation length was limited to 305 d for several reasons: 1) 305 d is the present official lactation length in France and for ICAR, 2) TD records after 305 d might bring only limited additional information and might not reflect the ability to produce during the 305-d lactation period, 3) any TD records after 305 d might be more difficult to model and problems in their modeling might affect the evaluation for the entire lactation, 4) taking those TD records into account might penalize cows with good fertility and shorter lactations. However, we plan to investigate the use of longer lactation length in the future.

Model
Lactation curves.
Five different functions were compared to model the fixed part of the lactation curve: a fifth order (quartic) Legendre polynomial, the Wilmink curve (Wilmink, 1987), the Ali-Schaeffer curve (Ali and Schaeffer, 1987), a fixed classes curve with 5-d classes for DIM (60 classes) and regression splines as defined by White et al. (1999). The regression splines are equivalent to the natural cubic spline without roughness penalty. In this study, six knots were chosen at 5, 20, 50, 130, 230 and 305 DIM.
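To make the regression spline concrete, the sketch below evaluates a natural cubic spline basis at a given DIM for the six knots listed above. It uses the textbook truncated power form of the basis, which is an assumption: White et al. (1999) may parameterize the same family of curves differently. The function name `natural_spline_basis` is hypothetical.

```python
def natural_spline_basis(x, knots):
    """Natural cubic spline basis (truncated power form) evaluated at x.

    Returns len(knots) covariates: 1, x, and len(knots) - 2 cubic terms.
    The resulting curve is linear beyond the boundary knots.
    """
    last = knots[-1]

    def pos_cube(u):
        # (u)_+^3: cube of the positive part
        return u ** 3 if u > 0 else 0.0

    def d(k):
        return (pos_cube(x - knots[k]) - pos_cube(x - last)) / (last - knots[k])

    d_penultimate = d(len(knots) - 2)
    basis = [1.0, float(x)]
    for k in range(len(knots) - 2):
        basis.append(d(k) - d_penultimate)
    return basis


DIM_KNOTS = [5, 20, 50, 130, 230, 305]  # knots given in the text
row = natural_spline_basis(155, DIM_KNOTS)  # one row of the design matrix
```

With six knots the basis has six covariates per curve, which matches the small number of parameters that makes regression splines attractive here.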

To compare these functions, the same fixed effect model was assumed:


$$ y_{ijk} = HTD_i + f_1(age_j) + f_1(month_k) + e_{ijk} \qquad [1] $$

where y_ijk is the milk record, HTD_i is the herd by test-date effect, i is the HTD level, f_1 is the tested function, age_j represents the age at calving in class j (nine classes: 20–22, 23–24, 25–26, 27–28, 29–30, 31–32, 33–34, 35–36, and 37–38 mo), month_k represents the month at calving k (12 classes) and e_ijk is the residual term. In total, there were 21 different curves. Curves for the age effect were added to the curves for the month at calving effect because both were assumed independent.

Effect of gestation.
After comparison of the lactation curves, three functions were tested to model the effect of the gestation or days carried calf (DCC). DCC = DIM - DO where DO is the number of days from calving to successful insemination. When there is no successful insemination DO = DIM and DCC = 0. If DCC was fewer than 100 d, the effect of the gestation on milk production was assumed to be nil. The DCC effect ranged from 100 to 265 d because the lactation length was limited to 305 d and minimum DO was 40 d. The effect of DCC was not allowed to differ with DIM because large DCC values could only be observed in late lactation stages while small DCC values were assumed to have small effects not varying much with DIM. The three tested functions were Legendre polynomials (order five), a fixed classes curve with 5-d classes and regression splines with four, five and six knots at DCC 100/150/200/265, 100/140/180/220/265 and 100/133/166/199/232/265, respectively. The new model was:
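The definition of DCC above translates directly into a small helper. This is a sketch under stated assumptions: the function names are hypothetical, an unknown DO is represented by `None`, and test days recorded before conception are given DCC = 0 (the text only states the case without successful insemination).

```python
def days_carried_calf(dim, days_open=None):
    """DCC = DIM - DO; 0 when there is no successful insemination.

    Assumption: test days before conception (DIM <= DO) also get DCC = 0.
    """
    if days_open is None or dim <= days_open:
        return 0
    return dim - days_open


def dcc_for_gestation_effect(dim, days_open=None, threshold=100):
    """DCC used in the gestation function; nil effect below 100 d."""
    dcc = days_carried_calf(dim, days_open)
    return dcc if dcc >= threshold else 0


max_dcc = days_carried_calf(305, 40)  # latest test day, minimum days open
```

The largest possible value, obtained at 305 DIM with the minimum of 40 d open, is the upper bound of 265 d quoted in the text.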


$$ y_{ijk} = HTD_i + f_1(age_j) + f_1(month_k) + f_2(DCC) + e_{ijk} \qquad [2] $$

where f_1 is the function chosen for modeling the lactation curve (regression splines; see Results) and f_2 is the tested function for modeling the gestation effect.

Model for the estimation of the genetic parameters.
In addition to the fixed effects tested earlier, random effects were added to the model:


$$ y_{ijklnt} = HTD_i + f_1(age_j) + f_1(month_k) + f_2(DCC) + \sum_m a_{lm}\,\varphi(m,t) + \sum_m p_{lm}\,\varphi(m,t) + \sum_m hy_{nm}\,\varphi(m,t) + e_{ijklnt} \qquad [3] $$

where f_2 is a regression spline as described earlier with DCC ranging from 0 to 265 d of gestation and with five knots chosen at 0, 100, 150, 200 and 265 DCC; a_lm and p_lm are the random additive genetic and permanent environmental regression coefficients of animal l for the mth term of the Legendre polynomial of order 5, respectively. Parameter φ(m,t) is the value of the mth term of the Legendre polynomial at time t (DIM standardized from -1 to 1) as in Kirkpatrick et al. (1990), and hy_nm is the random regression coefficient of the herd by year of calving n for the mth term of the Legendre polynomial of order four. Legendre polynomials were chosen because in preliminary studies, it was found that the variance component structures (eigenvectors and eigenvalues) obtained by this method and by an unstructured model (with 10 by 10 unstructured covariance matrices, corresponding to a multi-trait model) were very similar.
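The Legendre covariates φ(m,t) can be sketched as follows. The code assumes the normalized Legendre polynomials commonly used with Kirkpatrick et al. (1990), that is, the classical polynomial P_m scaled by sqrt((2m+1)/2), and a linear standardization of DIM from [5, 305] to [-1, 1]. The function names are hypothetical.

```python
import math


def standardize_dim(dim, dim_min=5, dim_max=305):
    """Map DIM linearly from [dim_min, dim_max] to [-1, 1]."""
    return -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)


def legendre_covariates(dim, n_terms=5):
    """Normalized Legendre covariates (terms 0 .. n_terms - 1) at a given DIM."""
    x = standardize_dim(dim)
    p = [1.0, x]  # P_0 and P_1
    for n in range(1, n_terms - 1):
        # Bonnet recurrence: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return [math.sqrt((2 * m + 1) / 2.0) * p[m] for m in range(n_terms)]


phi_mid = legendre_covariates(155)  # mid-lactation, standardized DIM = 0
phi_end = legendre_covariates(305)  # end of lactation, standardized DIM = 1
```

Each animal then contributes five genetic and five permanent environmental coefficients, and its deviation at a given DIM is the dot product of those coefficients with the covariate vector.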

The phenotypic covariance matrix V of the observations is given by:


$$ V = Z_a (A \otimes G) Z_a' + Z_p (I \otimes P) Z_p' + Z_h (I \otimes H) Z_h' + R \qquad [4] $$

where G, P, and H are the random regression covariance matrices for the genetic, permanent environmental and herd by year of calving effects, respectively, A is the additive genetic relationship matrix, Z_a, Z_p and Z_h are the corresponding incidence matrices containing the Legendre covariates, and R is the diagonal matrix of the residual variance that depends on DIM: R = diag{σ²_e(q)}, where q denotes DIM and

$$ \sigma_e^2(q) = \exp\left(a + b\,x_q + c\,x_q^2\right) \qquad [5] $$

where x_q is DIM q standardized from -1 to 1, which is proportional to the second term of the Legendre polynomial. Parameters a, b, and c describe the variation of the residual variance over DIM, and the exponential function ensures that the residual variance is always positive. Other sources of heterogeneity of variance were not included in the model for genetic parameter evaluation. However, they are planned to be modeled in the national genetic evaluation system and are under study.
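A residual variance function of this kind can be sketched in a few lines. The code assumes the exponential-quadratic form described in the text, with DIM standardized linearly from [5, 305] to [-1, 1]; the parameter values in the example are purely illustrative and are not estimates from this study.

```python
import math


def residual_variance(dim, a, b, c, dim_min=5, dim_max=305):
    """Residual variance exp(a + b*x + c*x^2), x = DIM standardized to [-1, 1]."""
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    return math.exp(a + b * x + c * x * x)


# Hypothetical parameter values, for illustration only
a, b, c = 1.0, -0.3, 0.2
curve = [residual_variance(d, a, b, c) for d in range(5, 306)]
```

Whatever the values of a, b and c, the exponential keeps every element of the curve strictly positive, so no constraint is needed during estimation.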

Method
Lactation curves comparison.
Lactation curves were compared using five criteria. First, the mean sum of squares of the residuals (MSSE) was computed for the first data sample:


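$$ MSSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \qquad [6] $$

where n is the number of records, and y_i and ŷ_i are the observed and fitted values of record i. This criterion indicates the overall fit of the curve. Second, the mean residual was computed for each DIM: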


$$ \bar{e}_d = \frac{1}{n_d}\sum_{i:\,DIM_i = d}\left(y_i - \hat{y}_i\right) \qquad [7] $$

where n_d is the number of records at DIM = d. By plotting this second criterion over DIM, it is possible to check whether the fit is adequate or if there is a bias at any lactation stage.
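The two residual-based criteria can be computed as in the sketch below. The function names and the tiny data set are hypothetical; fitted values are assumed to come from whichever fixed-effect model is being evaluated.

```python
from collections import defaultdict


def msse(observed, fitted):
    """Mean sum of squares of the residuals over all records."""
    n = len(observed)
    return sum((y - f) ** 2 for y, f in zip(observed, fitted)) / n


def mean_residual_by_dim(dims, observed, fitted):
    """Mean residual for each DIM, returned as {dim: mean residual}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d, y, f in zip(dims, observed, fitted):
        sums[d] += y - f
        counts[d] += 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}


# Hypothetical records: DIM, observed yield (kg), fitted yield (kg)
dims = [5, 5, 10]
observed = [20.0, 22.0, 25.0]
fitted = [21.0, 21.0, 24.0]
overall = msse(observed, fitted)
by_dim = mean_residual_by_dim(dims, observed, fitted)
```

A curve can have a small overall MSSE and still show a systematic nonzero mean residual at some lactation stages, which is why both criteria are inspected.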

In addition to those two criteria, minus twice the logarithm of the likelihood function (-2logL) and the REML forms of Akaike’s Information Criterion (AIC) and Schwarz’ Bayesian Information Criterion (BIC) were computed (e.g., Meyer, 2001).
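As a reminder of how the two information criteria penalize model size, the sketch below assumes the usual definitions AIC = -2logL + 2p and BIC = -2logL + p log(n*), where p is the number of parameters counted in the penalty and n* is the effective number of observations (for REML often taken as the number of records minus the rank of the fixed-effects design matrix). How the original study counted p and n* is not stated here, and all numeric values below are hypothetical.

```python
import math


def aic(minus_two_log_l, n_params):
    """Akaike's Information Criterion from -2logL and the parameter count."""
    return minus_two_log_l + 2 * n_params


def bic(minus_two_log_l, n_params, n_effective):
    """Schwarz' Bayesian Information Criterion; n_effective is assumed known."""
    return minus_two_log_l + n_params * math.log(n_effective)


# Hypothetical comparison of a 60-parameter and a 6-parameter curve
aic_gap = aic(1000.0, 60) - aic(1000.0, 6)
bic_gap = bic(1000.0, 60, 80000) - bic(1000.0, 6, 80000)
```

For equal likelihoods, the extra penalty for the larger model is much greater under BIC than under AIC once log(n*) exceeds 2, which is always the case with data sets of this size.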

Effect of gestation.
The three functions were already compared in the previous part. The effects of DCC were plotted for each function for the first sample.

Method for estimating the genetic parameters.
A program from Ignacy Misztal and Shogo Tsuruta (Misztal et al., 2002) based on the Average Information REML algorithm and relying on the work of Jensen et al. (1996) was modified in order to accommodate all the features of the models to be tested. The program was extended to random regression models and to correlated effects and traits. Computation of the f vectors (see Jensen et al., 1996 and below) and combination of Average Information-REML with an EM-REML algorithm (in the case of non-positive definite matrices) were rewritten.

Model for the estimation of the residual variance.
The residual variance was described with a function as in equation [5]. For this purpose, first derivatives of the log-likelihood and of the f vectors had to be computed for the three parameters (a, b and c) of the residual variance function.

In Jensen et al. (1996), V is equal to var[y] or to ZGZ'+R, and P is a projection matrix mapping observations into weighted residuals:


$$ P = V^{-1} - V^{-1}X\left(X'V^{-1}X\right)^{-}X'V^{-1} \qquad [8] $$

The average information matrix I_A(θ) (where θ is the vector of parameters) can be computed as (Jensen et al., 1996):

$$ I_A(\theta) = \tfrac{1}{2}F'PF = \tfrac{1}{2}\left[F'R^{-1}F - F'R^{-1}W C^{-}W'R^{-1}F\right] \qquad [9] $$

where W = [X Z], C is the coefficient matrix of the mixed model equations, and F is a matrix whose jth column f_j (f vector) consists of the vector (∂V/∂θ_j)Py.

One needs to compute ∂V/∂θ_R{j,k}, which is equal to R_jk, a symmetric indicator matrix containing ones in positions corresponding to parameter θ_R{j,k} (which is the parameter of the residual variance matrix corresponding to (co)variance between traits j and k) and zeros elsewhere. In the case of a constant variance and one lactation, there is one single parameter in θ corresponding to the residual variance (θ_R{1,1}). In the case of a residual variance described by three parameters a, b and c, θ contains three parameters related to the residual variance and:


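$$ \frac{\partial V}{\partial a} = \frac{\partial R}{\partial a} = R \qquad [10] $$

Similarly,

$$ \frac{\partial R}{\partial b} = D_x R, \qquad \frac{\partial R}{\partial c} = D_x^2 R \qquad [11] $$

where D_x is the diagonal matrix containing, for each record, the standardized DIM that enters the residual variance function.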

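Derivatives obtained from equations [10] and [11] are easy to implement in the program because the first derivative of the log-likelihood and the f vectors are computed as (Jensen et al., 1996):

$$ \frac{\partial L}{\partial \theta_{R\{i,j\}}} = -\tfrac{1}{2}\left[\mathrm{tr}\left(R^{-1}R_{ij}\right) - \mathrm{tr}\left(C^{-}W'R^{-1}R_{ij}R^{-1}W\right) - \hat{e}'R^{-1}R_{ij}R^{-1}\hat{e}\right] \qquad [12] $$

$$ f_{R\{i,j\}} = R_{ij}R^{-1}\hat{e} \qquad [13] $$

where ê is the vector of estimated residuals and C is the coefficient matrix of the mixed model equations. Here, R_ij is replaced by values given in equations [10] and [11]. This leads to some simplifications of equations [12] and [13] because R is multiplied by its inverse. The Average Information-REML algorithm can easily be extended to other parametric functions for the residual variance.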
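The simplification mentioned above can be checked numerically for a diagonal R. The sketch assumes the exponential-quadratic residual variance exp(a + b x + c x²) in standardized DIM x, so that the derivatives with respect to a, b and c are the variance itself multiplied by 1, x and x². The function names and the numeric values are hypothetical.

```python
import math


def residual_variance(x, a, b, c):
    """Residual variance at standardized DIM x."""
    return math.exp(a + b * x + c * x * x)


def variance_derivatives(x, a, b, c):
    """Derivatives of the residual variance with respect to a, b and c."""
    s2 = residual_variance(x, a, b, c)
    return s2, x * s2, x * x * s2


def f_vectors(xs, residuals, a, b, c):
    """f vectors (dR/dtheta) R^{-1} e_hat for diagonal R, one per parameter."""
    f_a, f_b, f_c = [], [], []
    for x, e in zip(xs, residuals):
        s2 = residual_variance(x, a, b, c)
        d_a, d_b, d_c = variance_derivatives(x, a, b, c)
        f_a.append(d_a / s2 * e)  # R cancels: equals e
        f_b.append(d_b / s2 * e)  # equals x * e
        f_c.append(d_c / s2 * e)  # equals x^2 * e
    return f_a, f_b, f_c


xs = [-1.0, 0.0, 0.5]          # hypothetical standardized DIM
e_hat = [2.0, -1.0, 4.0]       # hypothetical estimated residuals
f_a, f_b, f_c = f_vectors(xs, e_hat, 1.0, -0.3, 0.2)
```

Because R is multiplied by its inverse, the three f vectors reduce to the residuals weighted by 1, x and x², and no variance needs to be evaluated to build them.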

Pooling method for the estimation of the genetic parameters.
In addition, the Average Information-REML program was transformed for the simultaneous estimation of the genetic parameters of several samples. Instead of combining the genetic parameters obtained from different samples a posteriori, the approach cumulated the likelihood, the first derivatives and the average information matrix over several samples (see Babb, 1986; Yerex, 1988; Ducrocq, 1993) such that:


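$$ L(\theta) = \sum_{i=1}^{n} L_i(\theta) \qquad [14] $$

where L(θ) is the log-likelihood for all samples for the set of parameters θ, L_i(θ) is the log-likelihood for sample i for the same set of parameters, and n is the number of samples.

$$ \frac{\partial L(\theta)}{\partial \theta_j} = \sum_{i=1}^{n} \frac{\partial L_i(\theta)}{\partial \theta_j} \qquad [15] $$

where θ_j is the jth parameter and

$$ I_A(\theta) = \sum_{i=1}^{n} I_{A,i}(\theta) \qquad [16] $$

The set of parameters θ is updated after these terms are computed for all the samples using:

$$ \theta^{(nr+1)} = \theta^{(nr)} + \left[I_A\left(\theta^{(nr)}\right)\right]^{-1} \left.\frac{\partial L(\theta)}{\partial \theta}\right|_{\theta^{(nr)}} \qquad [17] $$

where θ^(nr) is the parameter estimate at iteration nr.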
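One iteration of the pooled update can be sketched as follows: first derivatives and average information matrices are summed over samples, and a single Newton-type step is taken. This is an illustration with hypothetical function names and made-up numbers, not the modified program described in the text; computing the per-sample terms themselves is outside its scope.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def pooled_update(theta, scores, infos):
    """theta + (sum of AI matrices)^{-1} (sum of first derivatives)."""
    k = len(theta)
    score = [sum(s[j] for s in scores) for j in range(k)]
    info = [[sum(ai[j][l] for ai in infos) for l in range(k)] for j in range(k)]
    step = solve(info, score)
    return [t + d for t, d in zip(theta, step)]


# Two hypothetical samples, two parameters
theta = [1.0, 2.0]
scores = [[1.0, 0.0], [1.0, 2.0]]
infos = [[[2.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 1.0]]]
theta_new = pooled_update(theta, scores, infos)
```

Only the summed derivatives and information matrices have to be held at the same time, so each sample can be processed in turn.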

In addition, with the Average Information-REML algorithm, one can obtain standard deviations of the estimated parameters through the inverse of the average information matrix. The comparison of these standard deviations obtained from one single sample or from the ten samples together was used to measure the gain of precision with the pooling method. This comparison was applied for the estimates of (co)variances of the coefficients of the random regression model. The variances (and thus the standard deviations) of the correlations were also computed with the following approach, known as the "delta method" (Oehlert, 1992):


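$$ \mathrm{var}\left[f(\theta)\right] \approx \left(\frac{\partial f(\theta)}{\partial \theta}\right)' \mathrm{var}(\theta) \left(\frac{\partial f(\theta)}{\partial \theta}\right) \qquad [18] $$

where f(θ) here is the function computing the correlation between two given DIM and var(θ) is the asymptotic variance of the parameters obtained from the inverse of the Average Information matrix in our case.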
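A simplified instance of the delta method is sketched below for a correlation computed from two variances and one covariance. In the study the correlation is a function of all random regression (co)variances; reducing θ to three elements (g11, g12, g22) is a simplifying assumption, and the numeric values are hypothetical.

```python
import math


def correlation_and_sd(g11, g12, g22, var_theta):
    """Correlation r = g12 / sqrt(g11 * g22) and its delta-method SD.

    var_theta is the 3 x 3 asymptotic covariance matrix of (g11, g12, g22).
    """
    r = g12 / math.sqrt(g11 * g22)
    # Gradient of r with respect to (g11, g12, g22)
    grad = [-0.5 * r / g11, 1.0 / math.sqrt(g11 * g22), -0.5 * r / g22]
    var_r = sum(
        grad[i] * var_theta[i][j] * grad[j] for i in range(3) for j in range(3)
    )
    return r, math.sqrt(var_r)


# Hypothetical estimates and sampling (co)variances
var_theta = [[0.04, 0.0, 0.0], [0.0, 0.04, 0.0], [0.0, 0.0, 0.04]]
r, sd_r = correlation_and_sd(4.0, 2.0, 4.0, var_theta)
```

With real output, `var_theta` would be the relevant block of the inverse average information matrix, including the sampling covariances between estimates.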

Transformation of the genetic parameters on a 305 DIM scale.
The genetic variance matrix for all DIM was obtained as:


$$ G^{*} = \Phi\, G\, \Phi' \qquad [19] $$

where G* is a 301 by 301 genetic (co)variance matrix for all DIMs ranging from 5 to 305 d. Φ is a 301 by 5 matrix with the values of the five coefficients of the fifth order Legendre polynomial for each DIM from 5 to 305 d. The same operation was applied to P and H. The eigenvectors of G* and P* were computed subsequently.
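The transformation to the DIM scale can be sketched in pure Python. The code assumes normalized Legendre covariates with DIM standardized linearly from [5, 305] to [-1, 1]; the coefficient matrix `g` is a made-up example, not an estimate from this study, and the function names are hypothetical.

```python
import math


def legendre_row(dim, n_terms=5, dim_min=5, dim_max=305):
    """One row of Phi: normalized Legendre covariates at a given DIM."""
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    p = [1.0, x]
    for n in range(1, n_terms - 1):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return [math.sqrt((2 * m + 1) / 2.0) * p[m] for m in range(n_terms)]


def expand(g, dims):
    """Phi G Phi' for the listed DIM, as a list of lists."""
    k = len(g)
    phi = [legendre_row(d, k) for d in dims]
    phi_g = [[sum(row[i] * g[i][j] for i in range(k)) for j in range(k)] for row in phi]
    return [[sum(u * v for u, v in zip(rg, rt)) for rt in phi] for rg in phi_g]


def correlation(g_star, s, t):
    """Correlation between positions s and t of an expanded matrix."""
    return g_star[s][t] / math.sqrt(g_star[s][s] * g_star[t][t])


# Hypothetical coefficient matrix: variance only on the constant term
g = [[0.0] * 5 for _ in range(5)]
g[0][0] = 2.0
g_star = expand(g, list(range(5, 306)))
```

With variance only on the constant term, the expanded variance is the same at every DIM and all correlations equal 1; higher-order terms are what allow variances and correlations to change along the lactation.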


    RESULTS AND DISCUSSION
Lactation Curves
The ranking of the functions with different criteria is presented in Table 2. According to the MSSE criterion, the fixed classes curve performed better than the others for which the differences in MSSE were marginal. Except for the fifth order Legendre polynomial, the ranking of the functions was simply the reverse of the ranking of the number of parameters per curve, 60, 6, 5, 3 and 5, respectively. This clearly explained why the fixed classes curve had the best fit. These results were consistent with our previous studies on other samples. The ranking based on the logarithm of the likelihood function or on the AIC criterion was the same as for the MSSE, with a permutation between the spline function and the Ali-Schaeffer curve. However, with the BIC criterion, with a stronger penalty for large number of parameters, the fixed classes curve ranked last. Also for these three criteria, some differences between curves were marginal. With criteria based on goodness-of-fit, it is not trivial to conclude which of these curves is the best and different rankings might be obtained across samples or studies. Moreover, a compromise, often based on subjective grounds, must be accepted between goodness-of-fit and other properties such as flexibility, robustness and computational considerations. This explains the large variation in models across studies and countries.


Table 2. Mean sum of squares of residuals (MSSE), -2*logarithm of the likelihood function (-2logL), AIC and BIC values for the fifth order Legendre polynomial, the Wilmink function, the Ali-Schaeffer curve, a fixed classes curve and a regression spline function.
 
Figure 1 shows the trend of the mean residuals over DIM. For the non-parametric curve, as expected, the mean residuals were zero for all parts of the curve. This was also true for the regression spline except for a small deviation at the beginning and the end of the lactation. The last three curves showed biases throughout the lactation. For instance, all of them underestimated milk yield at the peak and overestimated it in early lactation indicating that they were not able to model all the variation of the curve. This was also found by Jamrozik et al. (1997) for the Ali-Schaeffer and Wilmink curves. Because the Wilmink curve had only three parameters, it had reduced flexibility. For instance, the slope of the curve was modeled as a constant from the peak to the end of the lactation, although there seemed to be an inflection point around 200 d. As a consequence, the Wilmink curve was systematically too high before the inflection point and too low after it. The Ali-Schaeffer curve and the Legendre polynomial showed problems throughout the lactation. The Legendre polynomials had typical border effects at the beginning and the end of the lactation (increased bias) and showed waves in the middle of the lactation. Parametric curves were not flexible enough to track all trends in the data.
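The limited flexibility of the parametric curves can be seen from their covariates. The sketch below assumes the commonly cited forms of both curves: Wilmink as b0 + b1 t + b2 exp(-k t), with k = 0.05 taken as a typical value rather than one reported in this study, and Ali-Schaeffer as a regression on t/305 and ln(305/t) and their squares. The function names are hypothetical.

```python
import math


def wilmink_covariates(dim, k=0.05):
    """Covariates of the Wilmink curve: 1, t, exp(-k t). k is an assumed constant."""
    return [1.0, float(dim), math.exp(-k * dim)]


def ali_schaeffer_covariates(dim, length=305.0):
    """Covariates of the Ali-Schaeffer curve: 1, g, g^2, w, w^2,
    with g = t / 305 and w = ln(305 / t)."""
    g = dim / length
    w = math.log(length / dim)
    return [1.0, g, g * g, w, w * w]


late = wilmink_covariates(305)
end = ali_schaeffer_covariates(305)
```

In late lactation the exponential covariate of the Wilmink curve is practically zero, so the curve reduces to a straight line with slope b1, which is the constant slope after the peak described above.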



Figure 1. Trend of the mean residual as a function of DIM for the fixed classes curve (plain line), the spline function (◊), the Wilmink curve (▲), the Ali and Schaeffer curve (•), and the fifth order Legendre polynomial (△).

 
For the fixed classes curve, the influence of the data is local: a record observed in one class does not influence the estimate for any other class. In addition, because it has a very large number of parameters, it shows great flexibility. This can be a problem, however, for classes with a small number of observations. For the three parametric curves, any record influences the whole curve and the flexibility is less important than for the fixed classes curve. As a consequence, atypical curves can be obtained when there are outliers. For instance, with the Wilmink curve, if some points at the end of the lactation are particularly high, the whole part of the curve after the peak will rise. Regression splines are semi-parametric curves. The influence of the points is not completely local but records have an influence only on the parts of the curve close to them and they show greater flexibility than the other parametric curves. Therefore, they offer a compromise between the fixed classes curve and the parametric functions with better flexibility than parametric curves and a reduced number of parameters.

Effect of Gestation
Figure 2 shows curves obtained for the effect of DCC for different functions. First, the fixed classes curve had good flexibility and local behavior. However, the curve had random fluctuations that might be due to sensitivity to the data rather than to real biological reasons and might be undesirable. When the number of records within classes decreases, the sampling variance of the estimate of the mean of the class increases. This creates more local variation that might result in inconsistent curves because of few unexpected records. For instance, the effect for the last class was much higher than for the other classes. Also, there was no reason for the DCC effect to decrease after 230 d of gestation. This might arise from missing components in the models and might disappear when all effects, such as the animal genetic effect, are included. This problem was also observed for the Legendre polynomial, for which the DCC effect also increased at the end of the lactation. In fact, Meyer (1998) observed that data points at the beginning and at the end of the lactation trajectory have a relatively large impact on the regression coefficient estimates.



Figure 2. Effect of days carried calf (DCC) on test-day milk production modeled with: 1) 5 d classes curve, 2) Fifth order Legendre polynomial, 3) Spline function with 4 knots, 4) Spline function with 5 knots and 5) Spline function with 6 knots.

 
In contrast to fixed classes curves, splines showed little local variation. With cubic splines, there is a roughness penalty parameter which penalizes rapid variation and smoothes the curve (White et al., 1999; Green and Silverman, 1994). The purpose of smoothing is to reduce random fluctuation associated with some outlier records. In addition to the roughness penalty, limiting the number of knots also has a smoothing effect. Regression splines rely only on this second source of smoothing (White et al., 1999). So, as the number of knots increases, the curves fit the data better and become rougher and more sensitive to the data. The regression splines are again a compromise between the fixed classes and the parametric curves. The balance between all these properties can be adjusted by selection of the number and the position of the knots. The regression spline with 6 knots was more sensitive to the data, and the last part of the curve was essentially influenced by the few records between knots at 232 and 265 DCC. With the second curve, data ranging from 220 to 265 DCC determined the end of the curve, resulting in a decreasing curve but with an inflection point. Points between 220 and 232 DCC were more numerous and greatly influenced the shape of the end of the curve. Finally, the last regression spline, with the last knots at 200 and 265 DCC, was influenced little by the few points after 232 DCC and resulted in a smoother and decreasing curve without an inflection point.

In conclusion, regression splines offer a good compromise between goodness-of-fit, sensitivity to the data, smoothness, local behavior and the number of parameters necessary to fit the curve. Additionally, regression splines do not rely on prior assumptions about the shape of the curve. They can also be used for other traits such as fat or protein contents or for other species and traits.

Regarding the gestation effect, different curves for the effect of DCC would result in a different impact on high milk yielding cows with fertility problems. The regression spline with four knots, with the effect of DCC increasing exponentially, seemed to be in agreement with the results from Smith and Legates (1962) and Olori et al. (1997). This makes biological sense as the growth of the embryo is exponential. However, it might be that the curve described by the Legendre polynomial or the spline with 6 knots corresponds to biological reality. In all cases, the magnitude of the effect must be more properly estimated with all the other terms (e.g., additive genetic, permanent environment) included in the model.

Estimation of the Genetic Parameters
The method to model the residual variance as a continuous function of DIM resulted in estimates very close to the ones obtained in a multivariate analysis with 10 classes calculated with ASREML (Gilmour et al., 2000). The use of this function allowed a reduction of the number of parameters needed to fit the data (Pool and Meuwissen, 2000). The method seemed more appealing than the use of a finite number of classes as in Olori et al. (1999) and Rekaya et al. (1999) because it allows the continuous changes of the residual variance over time to be modeled with a small number of parameters (only three here).

Variance function estimates obtained are presented in Figure 3. These variances were in agreement with multi-trait studies where lactation curves were split into different traits, as presented by Meyer et al. (1989), Pander et al. (1992), Swalve (1995), Rekaya et al. (1999), and Pool et al. (2000). According to these studies, the residual variance was found to be decreasing throughout the lactation with a slight increase at the end. Also in agreement with these authors, the genetic variance was highest in mid-lactation and estimates were lower at the beginning and the end of the lactation.



Figure 3. Variances across DIM for milk yield in first lactation: residual variance (•), genetic variance (■), permanent environmental variance (▲), and herd-year variance (plain line).

 
Published results obtained in different studies with random regression models were very heterogeneous (Misztal et al., 2000). The shapes obtained in this study were in agreement with Pool et al. (2000) and Mayeres (2002) both using Legendre polynomials. Olori et al. (1999) obtained an increasing genetic variance and Jakobsen et al. (2001) found a highly increasing residual variance at the end of the lactation (higher than at the beginning of the lactation) and an increasing genetic variance throughout the lactation. Variance of the random herd by year of calving effect was similar to that obtained in previous studies (Gengler and Wiggans, 2001; Auvray and Gengler, 2002; De Roos et al., 2002). The shape and amplitude were the same as well.

Maximum heritability was close to 0.39 and was found around 200 DIM. The minimum was at the beginning of the lactation and there was a decreasing heritability at the end (see Figure 4). This was a consequence of the maximum genetic variance in the middle of the lactation and the decreasing residual and permanent environmental variances. Heritability ranged from 0.16 to 0.39 which are medium values in comparison with other studies. Most multi-trait analyses also found the highest heritability in mid-lactation. This also was true for the study from Liu et al. (2000) with the lactation separated into 6 traits with a covariance function fit in a second step. Similar results were obtained by White et al. (1999) working with cubic splines or in studies based on random regression from Pool et al. (2000), Auvray and Gengler (2002), Jakobsen et al. (2001) or Mayeres (2002). However, in some of these studies, this similar heritability was obtained from very different variances, especially with a very high residual variance both at the beginning and the end of the lactation. Jamrozik and Schaeffer (1997), Kettunen et al. (1998), Samoré et al. (2002) and Strabel and Misztal (1999) estimated highest heritability at both extremes of the lactation curve. Ranges of heritabilities across the lactation varied considerably among studies, ranging from as low as 0.10 (e.g., Strabel and Misztal, 1999) to 0.60 (e.g., Jamrozik and Schaeffer, 1997). The values in Figure 4 seemed moderate to high but not extreme. The range of heritabilities of most multi-trait studies was from 0.20 to 0.35 and higher in some cases. Our results are close to those of Liu et al. (2000) and Jakobsen et al. (2001) and lower than results from Jamrozik and Schaeffer (1997), Kettunen et al. (1998) and Olori et al. (1999) who found heritabilities higher than 0.50 for some parts of the curve.



Figure 4. Heritability of milk yield across DIM for first parity.

 
Genetic correlations between effects at different DIM are presented in Table 3. The genetic effects for the TD yields in the middle of lactation were highly correlated, with correlations higher than or equal to 0.90 between the genetic effect at DIM 155 and genetic effects of a large part of the lactation (from 45 to 280 DIM). The genetic effects at the beginning of the lactation were less correlated and seemed to be a somewhat different trait. Genetic correlation was equal to 0.35 for the extreme parts of the lactation, which seemed reasonable. To compare these genetic correlations with estimates from multi-trait analyses, the effect at DIM 20 should be compared to class 1 (the median of the first 30 d class) and the effect at DIM 290 to class 10; between these two extremes all correlations were higher than 0.55. The correlations obtained are in agreement with those found by multi-trait analyses by Meyer et al. (1989), Pander et al. (1992) and Kettunen et al. (1998) and with estimated correlations obtained from random regression analysis by Brotherstone et al. (2000) and Olori et al. (1999). Rekaya et al. (1999) with multi-trait analysis and White et al. (1999) found even larger genetic correlations while Strabel and Misztal (1999) found lower correlations. Finally, Jamrozik and Schaeffer (1997), Kettunen et al. (1998) or Rekaya et al. (1999) obtained negative correlations for extreme parts of the lactation when working with random regressions.


Table 3. Genetic correlations (above diagonal) and their standard deviations (below the diagonal) between daily milk yield at d 5, 20, 50, 80, 155, 230, 275, 290 and 305.
 
Differences between studies might be explained by the modeling of the fixed part of the lactation curve. In the present study, the effects of age at calving and month of calving are not constant but can vary with the lactation stage. The gestation effect might have an important impact on the end of the lactation, creating large variations between pregnant and non-pregnant cows. In some studies, this effect was ignored, which might cause a larger increase of the residual variance at the end of the lactation.

Differences might also be explained by the size of the data sets. Indeed, in preliminary studies, we found large variation (in magnitude and shape) among the genetic parameters obtained from our 10 distinct samples, each containing 80,000 TD records, for milk production in first lactation. Working simultaneously on these 10 samples certainly improved our estimation of the genetic parameters and made it more reliable. This was confirmed by comparing the standard deviations of the coefficients of the random regressions obtained with a single sample and with the pooling approach. The standard deviations of the (co)variances were 1.16 to 3.19, 2.54 to 3.19, and 2.87 to 3.25 times larger with the single sample for the genetic, permanent environmental, and herd-year variances, respectively. For the genetic effects, the standard deviations of the variances of the first three coefficients of the Legendre polynomial were more than three times larger with the single sample. For the genetic correlations presented in Table 3, the standard deviations obtained with the single sample were 1.25 to 2.84 times larger than with the pooling method. Smaller differences were noted for very close DIM with high correlation at the end of the lactation.
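
The order of magnitude of these ratios can be illustrated with a minimal sketch. It assumes that asymptotic standard deviations are taken from the inverse of the information matrix and that pooling amounts to summing the information matrices of independent samples; the 2 x 2 matrix below is hypothetical, and this is not the authors' computation. Under these assumptions, 10 samples carrying identical information give standard deviations that are sqrt(10), about 3.16, times smaller than with one sample.

```python
import numpy as np


def standard_errors(information):
    """Asymptotic standard deviations from an information matrix."""
    return np.sqrt(np.diag(np.linalg.inv(information)))


def pooled_information(informations):
    """Sum the information matrices of independent samples (assumption)."""
    return np.sum(informations, axis=0)


# Hypothetical information matrix for two (co)variance parameters.
base_information = np.array([
    [4.0, 1.0],
    [1.0, 3.0],
])
# Ten samples assumed to carry exactly the same information.
samples = [base_information for _ in range(10)]

se_single = standard_errors(samples[0])
se_pooled = standard_errors(pooled_information(samples))
ratio = se_single / se_pooled
print(ratio)  # each element equals sqrt(10) under these assumptions
```

When the samples differ in information content, as real samples do, the ratio varies by parameter, which is in line with the range of ratios reported above.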

Regarding the selection strategy, if we give equal weight to each DIM in the objective, the estimated genetic parameters would result in an emphasis on test-day records in mid-lactation, after the peak. Early TD records would be given lower weight. This is especially useful for animals with production dropping in mid or late lactation.

In order to reduce the number of parameters and genetic values to be estimated with a fifth-order Legendre polynomial, the eigenvectors of the estimated covariance matrices were calculated. In preliminary studies, the first three eigenvectors for both the genetic and permanent environmental parts were very similar whether estimated under an unstructured model (with 10 by 10 unstructured covariance matrices, which corresponds to a multi-trait model) or under a fifth-order random regression model on the 10 tests. The eigenvectors estimated in this study should therefore be close to those that would have been obtained with other methods.

Analysis of these eigenvalues and eigenvectors confirmed the results of some previous studies (Van der Werf et al., 1998; Olori et al., 1999; Pool et al., 2000). For the genetic covariance matrix, the first two eigenvalues represented more than 98% of the total variation (91.6 and 6.6%, respectively). The associated eigenvectors (see Figure 5) represented approximately a constant term and a term varying linearly throughout the lactation. These terms seemed to make sense biologically, as the first eigenvector might represent the average lactation potential of an animal and the second its persistency. For the permanent environmental effect, three eigenvectors were necessary to explain more than 95% of the total variation (78.3, 12.5, and 5.3%, respectively).
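
The percentages above are eigenvalues expressed as shares of their sum. The sketch below shows one way to compute such shares and to count how many leading eigenvectors are needed to reach a given threshold. The shares passed to `n_needed` are the ones reported in the text; the matrix `cov_hypothetical` and the function names are illustrative assumptions, not part of the study.

```python
import numpy as np


def eigen_shares(cov):
    """Eigenvalues (descending), eigenvectors, and shares of total variation."""
    values, vectors = np.linalg.eigh(cov)   # eigh is meant for symmetric matrices
    order = np.argsort(values)[::-1]
    values, vectors = values[order], vectors[:, order]
    return values, vectors, values / values.sum()


def n_needed(shares, threshold):
    """Smallest number of leading eigenvectors whose shares reach threshold."""
    cumulative = np.cumsum(shares)
    if cumulative[-1] < threshold:
        raise ValueError("threshold not reached by the supplied shares")
    return int(np.searchsorted(cumulative, threshold) + 1)


# Leading shares reported in the text, as fractions of the total variation.
genetic_shares = np.array([0.916, 0.066])
pe_shares = np.array([0.783, 0.125, 0.053])
print(n_needed(genetic_shares, 0.98))  # 2 (genetic part)
print(n_needed(pe_shares, 0.95))       # 3 (permanent environmental part)

# Hypothetical covariance matrix, used only to show the decomposition itself.
cov_hypothetical = np.array([
    [10.0, 1.0, -0.5],
    [1.0, 2.0, 0.2],
    [-0.5, 0.2, 0.5],
])
values, vectors, shares = eigen_shares(cov_hypothetical)
print(np.round(shares, 3))
```

Note that the sign of each eigenvector returned by an eigen-solver is arbitrary, so an eigenvector and its negative describe the same direction.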



Figure 5. Coordinates of eigenvectors of the genetic covariance matrix: first eigenvector, second eigenvector.

 
Because these eigenvectors seem to make sense biologically, it is appealing to use them in the routine evaluation. The obtained breeding values would be independent and interpretable. In addition, use of the eigenvectors as proposed by Van der Werf et al. (1998) would result in fewer genetic parameters, fewer equations in the MME, diagonal variance matrices (as the eigentraits are independent), and better convergence properties for the solution of large MME. If such a strategy is implemented, the coefficients of the Legendre polynomial in the random regression would be replaced by the coefficients specified by the eigenvectors in Figure 5, and equation [3] would be changed to:


([20])

where ν(m,t) and ω(m,t) are the values of the mth eigenvector of the genetic and permanent environmental covariance matrices, respectively, at DIM equal to t. For instance, ν(1,100) and ν(2,200) can be obtained from the information used to plot Figure 5 and would be equal to 0.0617 and 0.0293, respectively.
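
A minimal sketch of how values such as ν(m,t) could be looked up and used is given below. It assumes that the eigenvectors are those of the genetic covariance matrix evaluated on a daily grid from DIM 5 to 305 and scaled to unit length, and that the genetic deviation at DIM t is the sum over m of ν(m,t) times an animal's score on eigenvector m. The coefficient matrix `K_hypothetical`, the grid, and all function names are assumptions made for illustration, so the printed values will not match those of Figure 5.

```python
import numpy as np
from numpy.polynomial import legendre

DIM_GRID = np.arange(5, 306)  # assumed daily grid, DIM 5 to 305


def legendre_basis(dim, order, dim_min=5, dim_max=305):
    """Normalized Legendre polynomials evaluated at DIM mapped onto [-1, 1]."""
    x = 2.0 * (np.asarray(dim, dtype=float) - dim_min) / (dim_max - dim_min) - 1.0
    norm = np.sqrt((2.0 * np.arange(order + 1) + 1.0) / 2.0)
    return legendre.legvander(x, order) * norm


def eigenfunctions(K, n_keep):
    """Leading eigenvalues and unit-length eigenvectors over DIM_GRID."""
    phi = legendre_basis(DIM_GRID, K.shape[0] - 1)
    G = phi @ K @ phi.T                      # daily genetic covariance matrix
    values, vectors = np.linalg.eigh(G)
    order = np.argsort(values)[::-1][:n_keep]
    return values[order], vectors[:, order]


def nu(vectors, m, t):
    """Value of the m-th eigenvector (1-based) at DIM t."""
    return vectors[t - DIM_GRID[0], m - 1]


def genetic_deviation(vectors, scores, t):
    """Genetic deviation at DIM t: sum over m of nu(m, t) * score_m."""
    return float(vectors[t - DIM_GRID[0], :] @ scores)


# Hypothetical coefficient covariance matrix (NOT the paper's estimates).
K_hypothetical = np.array([
    [10.0, 1.0, -0.5],
    [1.0, 2.0, 0.2],
    [-0.5, 0.2, 0.5],
])
values, vectors = eigenfunctions(K_hypothetical, n_keep=2)
print(nu(vectors, 1, 100), nu(vectors, 2, 200))
print(genetic_deviation(vectors, np.array([1.0, 0.5]), 150))
```

Keeping only two eigenvectors means that each animal has two genetic values to estimate instead of one per Legendre coefficient, which is the reduction in equations discussed above.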


    CONCLUSIONS
 
The use of regression splines for modeling the fixed effects seemed appealing because their properties are a compromise between those of fixed classes and parametric curves. Regression splines require a limited number of parameters, have good flexibility, are smooth, and have limited sensitivity to the data. Furthermore, they can be relatively easily implemented in the MME.

Tools were developed to work with relatively large data sets and to model a continuous heterogeneity for the residual variance, ensuring better estimates of genetic parameters.

Models using eigenvectors seemed appealing because they can reduce the computational difficulty of the model and improve its convergence properties.


    ACKNOWLEDGEMENTS
 
Tom Druet, who is Chargé de Recherches of the National Fund for Scientific Research (Brussels, Belgium), acknowledges his financial support. We are grateful to Anne Barbat, Bernard Bonaiti, Jean-Jacques Colleau, and Nicolas Gengler for valuable discussions. Ignacy Misztal and Shogo Tsuruta are gratefully acknowledged for sharing their programs. Finally, we express special thanks to Ian White, Robin Thompson, and Arthur Gilmour for discussions on splines.


    FOOTNOTES
 
1 On leave from Animal Science Unit, Gembloux Agricultural University, Gembloux B-5030, Belgium.

Corresponding author: T. Druet; e-mail: tom.druet@dga.jouy.inra.fr.

Received for publication December 23, 2002. Accepted for publication January 21, 2003.


    REFERENCES
 


Ali, T. E., and L. R. Schaeffer. 1987. Accounting for covariances among test day milk yields in dairy cows. Can. J. Anim. Sci. 67:637.

Auvray, B., and N. Gengler. 2002. Feasibility of a Walloon test-day model and study of its potential as tool for selection and management. INTERBULL Bull. 29:123–127.

Babb, J. S. 1986. Pooling maximum likelihood estimates of variance components obtained from subsets of unbalanced data. M. Sc. Thesis, Cornell University, Ithaca, NY.

Brotherstone, S., I. M. S. White, and K. Meyer. 2000. Genetic modelling of daily milk yield using orthogonal polynomials and parametric curves. Anim. Sci. 70:407–415.

De Roos, A. P. W., A. G. F. Harbers, and G. de Jong. 2002. Herd specific random regression curves in a test-day model for protein yield in dairy cattle. Proc. 7th World Congr. Genet. Appl. Livest. Prod., Montpellier, France. CD-ROM. Communication No. 01–05.

Ducrocq, V. 1993. Genetic parameters for type traits in the French Holstein breed based on a multiple-trait animal model. Livest. Prod. Sci. 36:143–156.

Gengler, N., and G. Wiggans. 2001. Heterogeneity in (co)variances structures of test-day yields. INTERBULL Bull. 27:179–184.

Gilmour, A. R., B. R. Cullis, S. J. Welham, and R. Thompson. 2000. ASREML Manual. New South Wales Dep. Agric., Orange, Australia.

Green, P. J., and B. W. Silverman. 1994. Nonparametric regression and generalized linear models. Chapman & Hall, London.

Jaffrezic, F., I. M. S. White, R. Thompson, and W. G. Hill. 2000. A link function approach to model heterogeneity of residual variances over time in lactation curve analyses. J. Dairy Sci. 83:1089–1093.

Jakobsen, J. H., P. Madsen, J. Jensen, J. Pedersen, L. G. Christensen, and D. A. Sorensen. 2002. Genetic parameters for milk production and persistency for Danish Holsteins estimated in random regression models using REML. J. Dairy Sci. 85:1607–1616.

Jamrozik, J., and L. R. Schaeffer. 1997. Estimates of genetic parameters for a test day model with random regressions for yield traits of first lactation Holsteins. J. Dairy Sci. 80:762–770.

Jamrozik, J., L. R. Schaeffer, and F. Grignola. 1998. Genetic parameters for production traits and somatic cell score of Canadian Holsteins with multiple trait random regression model. Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, Australia XXIII:303–306.

Jensen, J. 2001. Genetic evaluation of dairy cattle using test-day models. J. Dairy Sci. 84:2803–2812.

Jensen, J., E. A. Mäntysaari, P. Madsen, and R. Thompson. 1996. Residual maximum likelihood estimation of (co)variance components in multivariate mixed linear models using average information. J. of the Indian Society of Agricultural Statistics. 215–236.

Kettunen, A., E. A. Mäntysaari, I. Strandén, J. Pöso, and M. Lidauer. 1998. Estimation of genetic parameters for first lactation test day milk production using random regression models. Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, Australia XXIII:307–310.

Kirkpatrick, M., D. Lofsvold, and M. Bulmer. 1990. Analysis of inheritance, selection and evolution of growth trajectories. Genetics 124:979–993.

Liu, Z., F. Reinhardt, and R. Reents. 2000. Estimating parameters of a random regression test day model for first three lactation milk production traits using the covariance function approach. INTERBULL Bull. 25:74–80.

Mayeres, P. 2002. Appui technique à la gestion des troupeaux de bovins laitiers à travers la mise en évidence d’influences spécifiques au niveau troupeaux et des vaches sur les performances et l’efficacité biologique. Master thesis, Gembloux Agricultural University, Gembloux, Belgium.

Meyer, K., H.-U. Graser, and K. Hammond. 1989. Estimates of genetic parameters for first lactation test day production of Australian Black and White cows. Livest. Prod. Sci. 21:177–199.

Meyer, K. 1998. Estimating covariance functions for longitudinal data using a random regression model. Genet. Sel. Evol. 30:221–240.

Meyer, K. 2001. Estimates of direct and maternal covariance functions for growth of Australian beef calves from birth to weaning. Genet. Sel. Evol. 33:487–514.

Misztal, I., T. Strabel, J. Jamrozik, E. A. Mäntysaari, and T. H. E. Meuwissen. 2000. Strategies for estimating the parameters needed for different test-day models. J. Dairy Sci. 83:1125–1134.

Misztal, I., S. Tsuruta, T. Strabel, B. Auvray, T. Druet, and D. H. Lee. 2002. BLUPF90 and related programs (BGF90). Proc. 7th World Congr. Genet. Appl. Livest. Prod., Montpellier, France. CD-ROM. Communication No. 28–07.

Oehlert, G. W. 1992. A note on the delta method. American Statistician 46:27–29.

Olori, V. E., S. Brotherstone, W. G. Hill, and B. J. McGuirk. 1997. Effect of gestation stage on milk yield and composition in Holstein Friesian dairy cattle. Livest. Prod. Sci. 52:167–176.

Olori, V. E., W. G. Hill, B. J. McGuirk, and S. Brotherstone. 1999. Estimating variance components for test day milk records by restricted maximum likelihood with a random regression animal model. Livest. Prod. Sci. 61:53–63.

Pander, B. L., W. G. Hill, and R. Thompson. 1992. Genetic parameters of test day records of British Holstein-Friesian heifers. Anim. Prod. 53:11–21.

Pool, M. H., and T. H. E. Meuwissen. 2000. Reduction of the number of parameters needed for a polynomial random regression test day model. Livest. Prod. Sci. 64:133–145.

Pool, M. H., L. L. G. Janss, and T. H. E. Meuwissen. 2000. Genetic parameters of Legendre polynomials for first parity lactation curves. J. Dairy Sci. 83:2640–2649.

Rekaya, R., M. J. Carabaño, and M. A. Toro. 1999. Use of test day yields for the genetic evaluation of production traits in Holstein-Friesian cattle. Livest. Prod. Sci. 57:203–217.

Samoré, A. B., P. Boettcher, J. Jamrozik, A. Bagnato, and A. F. Groen. 2002. Genetic parameters for production traits and somatic cell scores estimated with a multiple trait random regression model in Italian Holsteins. Proc. 7th World Congr. Genet. Appl. Livest. Prod., Montpellier, France. CD-ROM. Communication No. 01–07.

Smith, J. W., and J. E. Legates. 1962. Relation of days open and days dry to lactation milk and fat yields. J. Dairy Sci. 45:1192–1198.

Strabel, T., and I. Misztal. 1999. Genetic parameters for first and second lactation milk yields of Polish Black and White cattle with random regression test-day models. J. Dairy Sci. 82:2805–2810.

Swalve, H. H. 1995. The effect of test day models on the estimation of genetic parameters and breeding values for dairy yield traits. J. Dairy Sci. 78:929–938.

Van der Werf, J. H. J., M. E. Goddard, and K. Meyer. 1998. The use of covariance functions and random regressions for genetic evaluation of milk production based on test day records. J. Dairy Sci. 81:3300–3308.

White, I. M. S., R. Thompson, and S. Brotherstone. 1999. Genetic and environmental smoothing of lactation curves with cubic splines. J. Dairy Sci. 82:632–638.

Wilmink, J. B. M. 1987. Adjustment of test-day milk, fat, and protein yields for age season and stage of lactation. Livest. Prod. Sci. 16:335–348.

Yerex, R. P. 1988. Pooling restricted maximum likelihood estimates from data subsets under the animal model. Ph.D. thesis, Cornell University, Ithaca, NY.

