J. Dairy Sci. 89:4903-4906
© American Dairy Science Association, 2006.

Short Communication: Genetic Evaluation of the Interval from First to Last Insemination with Survival Analysis and Linear Models

M. del P. Schneider*,1, E. Strandberg*, V. Ducrocq† and A. Roth‡

* Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences (SLU), P.O. Box 7023, SE-75007 Uppsala, Sweden
† Station de Génétique Quantitative et Appliquée, Institut National de la Recherche Agronomique, 78352 Jouy-en-Josas, France
‡ Swedish Dairy Association, P.O. Box 210, SE-10124 Stockholm, Sweden

1 Corresponding author: Pilar.Schneider@hgen.slu.se


    ABSTRACT
 
Sire breeding values for the interval between the first and last insemination were predicted using 4 proportional hazards models (survival analyses) and 2 linear mixed models to determine which would result in a more accurate genetic evaluation. A stochastic simulation describing the reproductive cycle of first-parity cows was conducted, in which true breeding values for conception rate were created. The model included the effects of sire and herd. The highest correlations between true breeding values for conception rate and breeding values for the interval between first and last insemination predicted by the survival analysis model and the linear model were 0.803 and 0.744, respectively. The results showed that, when pregnancy status was known, survival models were more accurate than linear models in predicting breeding values for conception rate from observations on the interval between first and last insemination.

Key Words: female fertility • genetic evaluation • survival analysis

A problem related to the analysis of fertility traits (interval or continuous variables) is how to handle the information on nonpregnant cows. The linear model (LM), which is currently the approach often used to predict breeding values for fertility traits, is not well suited for handling censored observations. The practical solutions include 1) handling records from pregnant and nonpregnant cows in the same way, as is commonly done for the interval from calving to last insemination; 2) excluding records of nonpregnant cows, as is usually done for calving interval; and 3) extending records by projection. The advantage of applying a survival analysis (SA) to study fertility traits is that information is retained from cows that are not pregnant or that are culled before conception. Thus, records from pregnant (uncensored) and nonpregnant cows (censored) can be treated jointly and included in the analysis, making proper use of all the available information. Studies have shown that the SA is a better method than the LM for genetic evaluation of conception rate when observations on the interval between calving and last insemination (CLI) are used and the pregnancy status is known (Schneider et al., 2005). The interval between first and last insemination (FLI) is an alternative trait sometimes used in genetic evaluations of fertility in dairy cattle (Jorjani, 2005), and is mainly a measure of conception rate: The higher the conception rate, the shorter the FLI. This trait does not contain the interval from calving to first insemination, which is largely influenced by decision making by the farmers; thus, it should represent the actual conception rate better than the CLI. However, the FLI has a special distribution (Figure 1) in which approximately 50% of the cows conceived at the first insemination and the other cows potentially conceived at intervals of 21 d, on average. This distribution may create analytical problems.
The objective of this study was to investigate by simulation whether the analysis of the FLI using SA would result in a more accurate genetic evaluation for conception rate than the LM.


 
Figure 1. Frequency distribution for the interval between first and last insemination from one replicate. Note that cows conceiving at first insemination (30,312 observations) are not shown.

 
The simulation process described by Schneider et al. (2005) was applied to create phenotypic observations for the FLI. The FLI was defined as the interval from first insemination to conception (at last known insemination) or censoring (at last known insemination). For cows never detected in heat, and thus never inseminated, the FLI was equal to the maximum allowed insemination period assigned for each cow (approximately 8% of the records). To be culled, the cow either had had her maximum number of inseminations without becoming pregnant or the maximum waiting period had been exhausted without her becoming pregnant (approximately 9 inseminations and 288 d, respectively). The same trait definition was used for both the LM and SA. In the SA, records of pregnant cows were considered as uncensored, and those for nonpregnant cows were considered as censored. In the LM analysis, pregnant cows were not distinguished from nonpregnant cows. Two different daughter group sizes were studied for all models, with 150 (SD 12.3, ranging from 104 to 201) and 60 (SD 7.8, ranging from 31 to 92) daughters per sire (400 sires). Herd size was fixed at 50 or 20, resulting in 60,000 or 24,000 cows, respectively.
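The record coding described above can be illustrated with a deliberately simplified sketch. It is not the simulation of Schneider et al. (2005): it assumes a single fixed conception rate, a fixed 21-d cycle, and an FLI of 1 d for cows conceiving at first insemination (inferred from the "1 d" period of the grouped data model), and it ignores heat detection, sire and herd effects, and the maximum waiting period. The function names are hypothetical.

```python
import random

CYCLE_DAYS = 21         # average interval between inseminations (from the text)
MAX_INSEMINATIONS = 9   # approximate culling limit (from the text)

def simulate_fli(rng, conception_rate):
    """Toy model: return (fli_days, pregnant) for one cow."""
    for k in range(1, MAX_INSEMINATIONS + 1):
        # Assumption: FLI is 1 d at first insemination, then one cycle per extra service.
        fli = 1 + CYCLE_DAYS * (k - 1)
        if rng.random() < conception_rate:
            return fli, True
    return fli, False  # never conceived: interval ends at the last insemination

def survival_record(fli, pregnant):
    """SA coding: time plus indicator (1 = conception observed, 0 = censored)."""
    return fli, 1 if pregnant else 0

def linear_record(fli, pregnant):
    """LM coding: pregnancy status is ignored, only the interval is kept."""
    return fli

rng = random.Random(2006)
cows = [simulate_fli(rng, 0.5) for _ in range(10_000)]
sa_data = [survival_record(f, p) for f, p in cows]
lm_data = [linear_record(f, p) for f, p in cows]
share_first = sum(1 for f, p in cows if p and f == 1) / len(cows)
print(f"conceived at first insemination: {share_first:.1%}")
```

Because this sketch censors only cows that fail all 9 inseminations, its censoring rate is far lower than the 15.3% reported for the full simulation; it is meant only to show how the same interval is passed to the two types of model.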

The same model was applied for both approaches and included the random effects of sire and herd. Sire effects were assumed to follow a normal distribution with variance $\sigma_s^2$. No relationship among sires was assumed. The effect of herd was assumed to follow a log-gamma (γ) distribution in the survival models and a normal distribution in the linear models. Using the SA, we considered 4 proportional hazards models: a Cox model, a standard Weibull model, a grouped data model (for details about the grouped data survival models see Ducrocq, 1999), and a Weibull model in which all the records were treated as uncensored. The last of these models was studied to evaluate the effect of censoring in proportional hazards models. In the grouped data model, the FLI was divided into 9 periods (discrete scale): 1 d; 2 to 31 d; 32 to 52 d; 53 to 73 d; 74 to 94 d; 95 to 115 d; 116 to 136 d; 137 to 157 d; and ≥158 d after first insemination. These periods represent the first, second, and later inseminations.
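The 9 periods of the grouped data model amount to a lookup on the FLI in days. A minimal sketch follows; the function name and the rejection of intervals shorter than 1 d are assumptions made for illustration, not part of the Survival Kit.

```python
import bisect

# Upper bounds (in days) of periods 1 to 8; anything above 157 d falls in period 9.
UPPER_BOUNDS = [1, 31, 52, 73, 94, 115, 136, 157]

def fli_period(fli_days):
    """Map an FLI in days to its grouped-data period (1 to 9)."""
    if fli_days < 1:
        raise ValueError("FLI is assumed to be at least 1 d")
    return bisect.bisect_left(UPPER_BOUNDS, fli_days) + 1

# With services spaced exactly 21 d apart (1, 22, 43, ... d), each successive
# insemination lands in its own period.
print([fli_period(1 + 21 * j) for j in range(9)])
```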

For the LM analysis, 2 models were applied: L1, to analyze the FLI, and L2, to analyze the log transformation of the trait. Survival Kit V3.12 (Ducrocq and Sölkner, 1998) and DMU (Jensen and Madsen, 1994) were used for the SA and LM analyses, respectively. Variance components and heritabilities were estimated in both analyses. The model comparison criterion was the correlation between true breeding values for conception rate (TBVCR) and predicted breeding values for FLI (PBV) as in Schneider et al. (2005). The accuracy of selection ($r_{TI}$) was calculated as $r_{TI} = \sqrt{n/(n + k)}$, where $n$ is the number of daughters and $k = (4 - h^2)/h^2$. All results presented are based on 50 replicates of the simulation.
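The accuracy calculation can be sketched as follows, assuming the standard progeny-test expression, the square root of n/(n + k), with k = (4 - h2)/h2 as defined above. The heritability of 0.04 used in the example is purely hypothetical and is not a value taken from Table 1.

```python
import math

def progeny_accuracy(n_daughters, h2):
    """Accuracy of a sire evaluation based on n daughters (sire-model approximation)."""
    k = (4.0 - h2) / h2
    return math.sqrt(n_daughters / (n_daughters + k))

# Hypothetical heritability, evaluated at the two average daughter group sizes studied.
for n in (150, 60):
    print(n, round(progeny_accuracy(n, 0.04), 3))
```

For a fixed heritability the accuracy rises with the number of daughters, which is why an overestimated heritability translates directly into an overestimated accuracy.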

The average FLI was 26.67 d (SD 35.74) and the average proportion of censored records was 15.3% (±0.06%). The estimated Weibull parameter ρ was 0.53 for the standard Weibull model.
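To interpret the estimate of ρ, it may help to write out a Weibull baseline. The parameterization below, with scale λ and shape ρ, is a common one and is assumed here; the text does not state which form was used.

```latex
\begin{align}
h_0(t) &= \lambda\rho\,(\lambda t)^{\rho-1}, \\
S_0(t) &= \exp\{-(\lambda t)^{\rho}\}, \\
\ln[-\ln S_0(t)] &= \rho\ln\lambda + \rho\ln t .
\end{align}
```

Under this form, ρ < 1 (as with the estimate of 0.53) implies a baseline hazard that decreases with time, and the last line shows why a plot of ln[-ln S(t)] against ln(t) should be a straight line with slope ρ if the Weibull assumption holds.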

Table 1 shows the correlations between TBVCR and breeding values predicted by the different models for both progeny groups. Correlations between TBVCR and sire breeding values predicted by the SA Models S1 to S3 were higher than the corresponding correlations from the LM. Although the correlations between TBVCR and PBV for the small progeny group were smaller than the correlations obtained for the large progeny group, the same pattern was observed. The PBV from the grouped data model and the Weibull model had the highest correlations with TBVCR.


 
Table 1. Correlations (r) between true breeding values for conception rate and breeding values predicted by survival analysis and linear models, accuracies ($r_{TI}$) calculated from the estimated heritabilities, and number of daughters and estimated heritabilities ($h^2$)
 
To test whether a Weibull distribution properly fit the data, the log of minus the log of the Kaplan–Meier (nonparametric) estimate of the survivor curve [S(t)] was plotted against the log of time (Figure 2). The relationship was approximately a straight line beyond ln(t) = 3.5. This means that the assumption that the Weibull distribution fits the data holds only after the first month [ln(t) = 3.5 corresponds to t ≈ 33 d]. Therefore, it was surprising that the Weibull model worked so well and resulted in such a high correlation between the TBVCR and PBV. Based on the distribution of survival times, we expected the Cox model to have a better fit than the Weibull, and thus a higher correlation with TBVCR, because one of its features is that no assumption is made about the form of the baseline hazard function. One likely explanation for the poor fit of the Cox model is the existence of many ties (i.e., equal failure times), especially for all those cows that get pregnant at first insemination, a feature that Cox models handle poorly. The advantage of the grouped data model is that it requires no particular assumption about the shape of the baseline distribution and overcomes the problem of many ties. Thus, the grouped data model is expected to be the most suitable to estimate breeding values for conception rate, although the Weibull model seemed quite robust in predicting breeding values given the distribution of the FLI and the deviation from the Weibull assumption.
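The graphical check described above can be sketched in plain Python. This is an illustrative product-limit (Kaplan–Meier) estimator written for this example, not the routine used by the Survival Kit, and the 6 records are invented.

```python
import math
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event."""
    failures = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)        # failures plus censored records at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve

def log_minus_log(curve):
    """Points (ln t, ln[-ln S(t)]); roughly linear if a Weibull baseline fits."""
    return [(math.log(t), math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]

# Invented records: FLI in days, with 1 = conception observed and 0 = censored.
times = [1, 1, 22, 22, 43, 64]
events = [1, 1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
points = log_minus_log(curve)
threshold_days = math.exp(3.5)  # ln(t) = 3.5 is about 33 d
```

Plotting the second coordinate of `points` against the first gives the kind of diagnostic shown in Figure 2. The many tied times at 1 d in real data produce a single large step at ln(t) = 0.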


 
Figure 2. Log of minus the log of Kaplan–Meier estimates plotted against the log of time for the interval between the first and last insemination.

 
For the LM, the log transformation of FLI (L2) had the highest correlation between TBVCR and PBV. The L2 gave a small improvement over L1. In the present study, an LM in which the information of nonpregnant cows was excluded was not considered, because it has previously been shown that keeping the information of nonpregnant cows yields a more precise genetic evaluation (Schneider et al., 2005).

Heritabilities and accuracies of selection ($r_{TI}$) calculated from the estimated heritabilities and number of daughters are presented in Table 1. In theory, and given the data structure, the accuracy ($r_{TI}$) and the correlation between TBVCR and PBV (which can be seen as the true accuracy) should be equal if the model is correct. In the large progeny group, the grouped data model and the Weibull model showed good agreement between the calculated accuracies and correlations between TBVCR and PBV. This indicates that heritabilities were estimated correctly for these models. However, this was not the case for the Cox model and the uncensored model.

For Models L1 and L2, the accuracies and the correlations between TBVCR and PBV differed: Accuracies were overestimated. Heritabilities were therefore not correctly estimated for these models, most likely because of the skewed distribution of the FLI.

In the small progeny group, the grouped data model had the best agreement between the accuracy and the correlation between TBVCR and PBV. However, for the Weibull model the accuracy and correlation between TBVCR and PBV differed: Variance components, for sire in particular, were overestimated. The Cox model had a low accuracy: Variance components were poorly estimated (sire variance was underestimated). For L1, the accuracy and correlation between TBVCR and PBV differed, but L2 showed good agreement.

The results demonstrated that the SA is a more accurate approach than the LM to predict the genetic merit of bulls for conception rate when using observations on the FLI. If selection were carried out on these PBV, this would also translate into 8% greater genetic progress (the difference in the accuracies between the grouped data model and L2). The main advantage of the SA is the ability to account for censoring. When all the records were treated as uncensored, correlations similar to those of the (best) LM were obtained. The SA makes proper use of information that would otherwise be discarded or treated as uncensored.
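The 8% figure follows from genetic response being proportional to the accuracy of selection when selection intensity, genetic variation, and generation interval are held constant. A small check, using the highest correlations reported in the abstract for the SA and the LM (0.803 and 0.744) as the true accuracies; the function name is hypothetical.

```python
def relative_gain(accuracy_new, accuracy_reference):
    """Relative increase in genetic response when only the accuracy changes."""
    return accuracy_new / accuracy_reference - 1.0

gain = relative_gain(0.803, 0.744)
print(f"{gain:.1%}")  # close to 8%
```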

Some drawbacks in the application of the SA for the analysis of fertility traits must be noted. First, the SA is not easily applicable in a multiple-trait analysis (e.g., production or other fertility traits). However, approximations have been proposed for large-scale applications (Ducrocq et al., 2001; Tarrés et al., 2006), and Damgaard and Korsgaard (2006) have used a Bayesian approach with Gibbs sampling to show that survival traits and normally distributed continuous traits or threshold traits can be analyzed together in a multiple-trait model. Second, the use of an animal model is currently computationally impossible to implement in large applications. However, technically it is possible to use an animal model to estimate breeding values with the Survival Kit if variance components are assumed known (Ducrocq, 2005).

To take full advantage of the SA, one must have information on conception so that censoring can be correctly specified. Information on fertility, actual voluntary waiting period, service period, and pregnancy status is expected to be more accurately recorded in the future. Better quality data and the use of the SA can be expected to give more than 8% greater genetic response than the use of the LM. However, more research is needed using field data and incorporating other effects into the model.


    ACKNOWLEDGEMENTS
 
This study was partly financed by the Swedish Farmers’ Foundation for Agricultural Research (Stockholm, Sweden) and the SLU research theme "Animal Welfare for Quality in Food Production."

Received for publication April 27, 2006. Accepted for publication August 1, 2006.


    REFERENCES
 


Damgaard, L. H., and I. R. Korsgaard. 2006. A bivariate quantitative genetic model for a linear Gaussian trait and a survival trait. Genet. Sel. Evol. 38:45–64.

Ducrocq, V. 1999. Extension of survival analysis to discrete measures of longevity. Proc. Int. Workshop on EU Concerted Action Genetic Improvement of Functional Traits in Cattle (GIFT)—Longevity, Jouy-en-Josas, France. Interbull Bull. 21:41–47.

Ducrocq, V. 2005. An improved model for the French genetic evaluation of dairy bulls on length of productive life of their daughters. Anim. Sci. 80:249–256.

Ducrocq, V., D. Boichard, A. Barbat, and H. Larroque. 2001. Implementation of an approximate multitrait BLUP evaluation to combine production traits and functional traits into a total merit index. Page 2 in Proc. 52nd EAAP Mtg., Budapest, Hungary. EAAP, Rome, Italy.

Ducrocq, V., and J. Sölkner. 1998. The Survival Kit—A Fortran package for the analysis of survival data. Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, Australia, 27:447–448.

Jensen, J., and P. Madsen. 1994. DMU: A package for the analysis of multivariate mixed models. Proc. of 5th World Congr. Genet. Appl. Livest. Prod. 22:45–46. University of Guelph, Guelph, Ontario, Canada.

Jorjani, H. 2005. Preliminary report of Interbull pilot study for female fertility traits in Holstein populations. Proc. Int. 2005 Interbull Mtg., Uppsala, Sweden. Interbull Bull. 33:34–44.

Schneider, M. del P., E. Strandberg, V. Ducrocq, and A. Roth. 2005. Survival analysis applied to genetic evaluation for female fertility in dairy cattle. J. Dairy Sci. 88:2253–2259.

Tarrés, J., J. Piedrafita, and V. Ducrocq. 2006. Validation of an approximate approach to compute genetic correlations between longevity and linear traits. Genet. Sel. Evol. 38:65–83.

