Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, PO Box 7023, SE-750 07 Uppsala, Sweden
Corresponding author: E. Carlén; e-mail: Emma.Carlen{at}hgen.slu.se.
| ABSTRACT |
|---|
Key Words: dairy cattle, clinical mastitis, linear model, survival analysis
Abbreviation key: LM = linear model, MAST = mastitis (0/1), PNA = proportion of North American Holstein genes, SA = survival analysis, TFM = time to first mastitis or censoring
| INTRODUCTION |
|---|
Currently, the most common method for genetic evaluation of clinical mastitis is linear model (LM) methodology, where mastitis is defined as a binary trait distinguishing between cows with at least one case (assigned 1) and cows with no cases of mastitis (assigned 0) within a defined period of the lactation. The heritability of clinical mastitis analyzed with LM is low (Pösö and Mäntysaari, 1996; Rupp and Boichard, 1999; Lassen et al., 2003; Carlén et al., 2004). This can mainly be explained by large environmental effects but also partly by the all-or-none character of the trait, which results in a low observed variation among cows.
Survival analysis (SA), also known as failure-time or event-time analysis, is a statistical method for studying the occurrence and timing of specific events, where the analyzed response time equals the time elapsed from a starting point until the occurrence of the event of interest (Ducrocq, 1987). Within the field of animal genetics, SA has mainly been used for analyzing dairy cattle longevity traits (Smith and Quaas, 1984; Ducrocq, 1994; Vollema and Groen, 1998), and many countries currently use this method in routine genetic evaluations for longevity (Interbull, 2003). One advantage of SA when analyzing longevity is that information not only from culled animals (uncensored records), but also from animals still alive (censored records) is included in the analysis. Although the exact length of the lifespans of currently living cows is unknown, SA allows one to use the information that they survived at least until a certain age. With other methods for analyzing direct measures of longevity, cows still alive at the end of study period cause a problem because their actual life span is unknown. These records must usually be excluded from analysis to avoid biased results (Smith and Quaas, 1984; Ducrocq et al., 1988).
Genetic studies of the time to first outbreak of a particular disease by the use of SA or similar methodology have so far been limited. Saebo et al. (2002) analyzed time to first mastitis treatment on a relatively small data set of the first 5 lactations in Norwegian cattle, with both a stochastic process model and a semiparametric proportional hazard model. Those researchers concluded that the stochastic process model seemed to be better for later stages of lactation, whereas the semiparametric model was superior around calving. They were also able to identify sires with daughters showing highest resistance to mastitis. They did not compare these models with the LM. Hirst et al. (2002) studied time to occurrence of lameness in dairy cattle.
One of the drawbacks of the traditional LM methodology for analyzing clinical mastitis is the low observed variation among cows, which follows from the all-or-none character of the trait. With this method, no difference is observable between cows getting mastitis at the beginning or close to the end of the defined period, or between cows without mastitis cases that were at risk for infection for different periods. Furthermore, cows culled for another reason before they got a chance to express mastitis cannot be distinguished from cows that did not contract mastitis during the whole period. This is true, for example, in the Swedish national genetic evaluation for dairy cattle, where cows culled before 150 d for reasons other than mastitis are assigned 0. Consequently, the amount of information used is reduced when mastitis is analyzed as an all-or-none trait with an LM.
Considering the advantages of SA for traits with a longitudinal character, improved efficiency is also expected for mastitis data when the trait time to first mastitis or censoring (TFM) is analyzed. The observed variation among cows and among sires increases when TFM is modeled, as cows with a mastitis case would not automatically get the same value. A longer period before the first outbreak of mastitis can be interpreted as a higher resistance against mastitis. In SA, cows without cases are treated as censored records. We hereby include the information that these cows did not contract mastitis until the time of censoring; after this point, we have no more information. Although a cow has no cases, a longer period without mastitis can be considered advantageous. Records from cows culled (for reasons other than but possibly correlated to mastitis) before they got a chance to express mastitis also become censored, which reduces the potential bias occurring when treating these cows as healthy with traditional methodology. In SA, we only consider them healthy, in regard to mastitis, until the time they were culled.
The objective of this study was to investigate whether time to first case of clinical mastitis could be successfully analyzed using SA and to compare the precision of predicted breeding values from that analysis to the precision from a traditional LM.
| MATERIALS AND METHODS |
|---|
The structure of the analyzed data for lactations 1 to 3 is shown in Table 1. Pedigree information of the sires of cows in the data set was traced back as far as possible, resulting in a sire pedigree file with 1139 bulls, including the sires with daughter records.
For SA, the observation of a cow with a case of mastitis was considered as uncensored, and TFM was defined as the number of days from 10 d before calving to the day of the first treatment of mastitis or culling because of mastitis. For a healthy cow, i.e., a cow without a case of mastitis, the observation was considered as right censored. For these cows, time was defined as the number of days from 10 d before calving until 1) the day of next calving, 2) the day of culling for other reasons than mastitis, 3) the day of movement to a new herd, or 4) lactation d 240. The latter figure was based on the fact that the risk of culling for a cow increases approximately around that number of days after calving. When TFM is analyzed, it is no longer necessary to use the restricted period of 150 d into lactation (as done in LM to avoid bias caused by culling) because cows culled are indicated by the censoring variable. Here, the period was restricted to a maximum of 700 d based on the fact that the average calving interval was around 400 d and most calvings had occurred within 700 d after the previous calving.
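The coding of TFM and its censoring indicator described above can be sketched as follows. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the 700-d limit is assumed to apply to TFM itself, and the fixed end point is left as a parameter rather than hard-coded.

```python
from datetime import date, timedelta

START_OFFSET_DAYS = 10    # follow-up starts 10 d before calving (from the text)
MAX_FOLLOW_UP_DAYS = 700  # assumed cap on TFM itself; the text says "a maximum of 700 d"


def tfm_record(calving, first_mastitis=None, next_calving=None,
               culled_other=None, moved=None, max_days=MAX_FOLLOW_UP_DAYS):
    """Return (days, uncensored) for one lactation.

    first_mastitis: first treatment, or culling because of mastitis.
    culled_other:   culling for reasons other than mastitis.
    moved:          movement to a new herd.
    All argument names are illustrative, not taken from the study.
    """
    start = calving - timedelta(days=START_OFFSET_DAYS)
    # Days from the start point to each possible censoring event, plus the cap.
    censor_days = [max_days]
    for event in (next_calving, culled_other, moved):
        if event is not None:
            censor_days.append((event - start).days)
    censored_at = min(censor_days)
    if first_mastitis is not None:
        failure_at = (first_mastitis - start).days
        if failure_at <= censored_at:
            return failure_at, True   # uncensored: mastitis observed first
    return censored_at, False         # right censored


calving = date(2000, 1, 11)
# A cow culled for another disease 10 d after calving: censored at 20 d.
culled_early = tfm_record(calving, culled_other=calving + timedelta(days=10))
# A cow first treated 151 d after calving: uncensored, TFM = 161 d.
late_case = tfm_record(calving, first_mastitis=calving + timedelta(days=151))
```

A record is treated as uncensored only when the mastitis date precedes every censoring event, so the earliest event always determines the observed time.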
Statistical Analysis
As a preliminary analysis for SA, a semiparametric Cox proportional hazard model was run and Kaplan-Meier curves were created to test the adequacy of applying a fully parametric Weibull proportional hazard model. The Weibull assumption was assessed graphically by plotting the log of the negative log of the baseline survivor function, S(t), against the log of time (i.e., ln[-ln S(t)] against ln t). If the Weibull assumption holds, the resulting graph should be linear.
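The graphical check can be illustrated with a small, dependency-free sketch: a Kaplan-Meier estimate of S(t), the transformed points (ln t, ln[-ln S(t)]), and a least-squares slope as a rough numerical stand-in for judging linearity by eye. The function names and the synthetic data (200 records placed exactly on Weibull quantiles, with hypothetical parameter values) are assumptions made for illustration and are not the study's data or software.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct failure time.

    events[i] is True for an uncensored record, False for a censored one.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        while i < len(data) and data[i][0] == t:
            failures += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if failures:
            surv *= 1.0 - failures / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def weibull_check_points(curve):
    """Points (ln t, ln[-ln S(t)]); linear if the Weibull assumption holds."""
    return [(math.log(t), math.log(-math.log(s)))
            for t, s in curve if t > 0 and 0.0 < s < 1.0]


def ols_slope(points):
    """Least-squares slope of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Synthetic data: failure times on exact Weibull quantiles, one censored record.
n, scale, shape = 200, 0.01, 0.6   # hypothetical values
times = [(-math.log(1 - i / n)) ** (1 / shape) / scale for i in range(1, n)]
events = [True] * (n - 1)
times.append(times[-1] + 1.0)
events.append(False)
slope = ols_slope(weibull_check_points(kaplan_meier(times, events)))
```

For Weibull data the points fall on a straight line whose slope equals the shape parameter, which is why the synthetic example recovers the value used to generate it.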
Two different statistical models were used in the main analysis: an LM based on binary data and a Weibull proportional hazard model. For a better comparison of the 2 methods, the same fixed and random effects were used in both models. Each parity was analyzed separately for both models. For the mixed LM analysis, the following sire model was used to analyze MAST:
$$y_{ijklm} = \mathrm{ym}_i + \mathrm{age}_j + \mathrm{hy}_k + s_l + b_1\,\mathrm{Het}_m + b_2\,\mathrm{Hol}_m + e_{ijklm} \qquad [1]$$

where $y_{ijklm}$ is the observation of mastitis (0 = healthy; 1 = diseased) of cow m; $\mathrm{ym}_i$ is the fixed effect of ith year by month at calving; $\mathrm{age}_j$ is the fixed effect of age j in months at calving (1 mo per class); $\mathrm{hy}_k$ is the random effect of kth herd by year of calving; $s_l$ is the random effect of sire l; $b_1$ is the fixed regression coefficient on proportion heterosis of animal m ($\mathrm{Het}_m$); $b_2$ is the fixed regression coefficient on PNA of animal m ($\mathrm{Hol}_m$); and $e_{ijklm}$ is the random residual effect. Random effects were assumed to be normally distributed with zero means and variances of $I\sigma^2_{hy}$, $A\sigma^2_s$, and $I\sigma^2_e$, respectively, where A is the additive relationship matrix and I is an identity matrix.
The DMU package (Madsen and Jensen, 2000) was used to obtain REML estimates of the variance components. The heritability was calculated as
$$h^2 = \frac{4\sigma^2_s}{\sigma^2_s + \sigma^2_e} \qquad [2]$$
Because the herd-year variance was not included in the phenotypic variance, the heritabilities were comparable with previous studies where the effect of herd-year mostly was treated as fixed. The accuracy in selection was calculated according to
$$r_{TI} = \sqrt{\frac{n}{n + k}} \qquad [3]$$

where n is the number of daughters and $k = (4 - h^2)/h^2$.
For the SA, the following Weibull proportional hazard model was used to analyze TFM:
$$\lambda_{ijkl}(t) = \lambda_0(t)\exp\{\mathrm{ym}_i + \mathrm{age}_j + \mathrm{hy}_k + s_l + b_1\,\mathrm{Het}_m + b_2\,\mathrm{Hol}_m\} \qquad [4]$$

where $\lambda_{ijkl}(t)$ is the hazard of a cow getting mastitis at time t given that it has not occurred prior to t and $\lambda_0(t)$ is the Weibull baseline hazard function ($\lambda\rho(\lambda t)^{\rho - 1}$) with scale parameter $\lambda$ and shape parameter $\rho$. A value of $\rho$ < 1 indicates that the hazard decreases with time, whereas $\rho$ > 1 means that the hazard increases with time. The other effects, all time-independent, are as described for model [1]. The herd-year effect was assumed to follow a log-gamma distribution and was integrated out from the joint posterior density. The Weibull parameter $\rho$ was estimated in the analyses.
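A minimal numerical sketch of the Weibull proportional hazard structure may help to fix ideas. It assumes the baseline hazard has the usual two-parameter form, scale × shape × (scale × t)^(shape − 1), with survivor function exp[−(scale × t)^shape]; the function names and the parameter values in the example are hypothetical.

```python
import math


def weibull_baseline_hazard(t, scale, shape):
    """Baseline hazard: scale * shape * (scale * t) ** (shape - 1)."""
    return scale * shape * (scale * t) ** (shape - 1.0)


def weibull_survivor(t, scale, shape):
    """Baseline survivor function: exp(-(scale * t) ** shape)."""
    return math.exp(-((scale * t) ** shape))


def hazard(t, scale, shape, linear_predictor):
    """Proportional hazard: baseline hazard times exp(sum of effects)."""
    return weibull_baseline_hazard(t, scale, shape) * math.exp(linear_predictor)


def relative_risk(predictor_a, predictor_b):
    """Hazard ratio of two animals; the baseline cancels, so it is constant in t."""
    return math.exp(predictor_a - predictor_b)


# With shape < 1 the hazard falls with time; with shape = 1 it is constant.
early = weibull_baseline_hazard(10.0, 0.01, 0.6)
late = weibull_baseline_hazard(100.0, 0.01, 0.6)
```

Because all effects in the exponent are time-independent, the ratio of the hazards of two cows is the same at every t, which is what "proportional" refers to.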
Survival Kit V3.12 (Ducrocq and Sölkner, 1998) was used to estimate variance components for sire and herd-year. The heritability was calculated as
$$h^2_{equ} = \frac{4\sigma^2_s}{\sigma^2_s + 1/p} \qquad [5]$$
where p is the proportion of uncensored records.
This derivation for the heritability on the original scale, which is not dependent on the Weibull parameters, was suggested by Yazdi et al. (2002) as the equivalent heritability. Those researchers showed very close agreement between accuracy of selection and selection response calculated using $h^2_{equ}$ and observed accuracies and responses calculated from simulation. The term equivalent refers to the fact that the proof of a sire with n daughters would get the same reliability as if it were evaluated on a linear trait with this heritability. An increase in the proportion of uncensored records with time implies that the equivalent heritability increases with time until it reaches the theoretical heritability [$h^2 = 4\sigma^2_s/(\sigma^2_s + 1)$] one would get in the total absence of censoring. When accuracy is calculated, the actual number of daughters should be used in [3] because the amount of censoring is already accounted for in the definition of $h^2_{equ}$.
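The relation between censoring, equivalent heritability, and accuracy can be sketched numerically. The sketch assumes that [5] has the form 4σ²s/(σ²s + 1/p), which agrees with the limiting value quoted above for p = 1, and that [3] is the usual sire-model expression sqrt[n/(n + k)] with k = (4 − h²)/h². Both forms, the function names, and the input values are assumptions for illustration.

```python
import math


def equivalent_heritability(sire_variance, p_uncensored):
    """Assumed form of [5]: 4 * var_s / (var_s + 1 / p)."""
    return 4.0 * sire_variance / (sire_variance + 1.0 / p_uncensored)


def accuracy(n_daughters, h2):
    """Assumed form of [3]: sqrt(n / (n + k)) with k = (4 - h2) / h2."""
    k = (4.0 - h2) / h2
    return math.sqrt(n_daughters / (n_daughters + k))


# Hypothetical sire variance; p values mimic a rising share of uncensored records.
sire_variance = 0.05
h2_by_p = [equivalent_heritability(sire_variance, p) for p in (0.15, 0.22, 1.0)]
# The actual number of daughters is used; censoring is already in h2_equ.
acc = accuracy(100, h2_by_p[0])
```

The list `h2_by_p` increases with p and ends at the uncensored limit 4σ²s/(σ²s + 1), mirroring the statement that the equivalent heritability rises as fewer records are censored.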
| RESULTS AND DISCUSSION |
|---|
The incidence of MAST increased with parity, with mean values of 10, 12, and 15% for lactations 1 to 3, respectively. The increase in frequency was expected and corresponds to previous literature (Pösö and Mäntysaari, 1996; Barkema et al., 1998). In the SA, the proportions of uncensored records, i.e., cows with mastitis, were about 15, 18, and 22% for lactations 1 to 3, respectively. These figures, which also measure the number of daughters with at least one case, were expected to be somewhat higher than the frequency of MAST because TFM was measured during a longer period (up to 700 d compared with 160 d). The average failure times, i.e., the number of days from 10 d before calving until first mastitis case, were 123, 136, and 127 d for cows in lactations 1 to 3, respectively. For censored cows, the average censoring times were 364, 356, and 341 d, respectively.
Estimates of the Weibull Parameter
The estimated values of the Weibull parameter $\rho$ were 0.57, 0.70, and 0.67 for lactations 1 to 3, respectively, which indicates that the risk of getting mastitis decreased with time within lactation. This corresponds to previous literature where the highest frequency of mastitis has been reported around calving and in early lactation, and the frequency in the remainder of the lactation has been low and nearly constant (Barkema et al., 1998; Heringstad et al., 1999). In our data, about 20 to 30% of all mastitis cases occurred before d 10 in lactation, with the highest values for first parity cows. If a time-dependent lactation stage effect had been fitted, that effect would be expected to account for the early high risk, and the estimate of $\rho$ would be expected to increase. However, to be able to compare the same model for both LM and SA, such a time-dependent effect was not fitted in this study.
Heritabilities and Accuracies
Estimates of heritabilities and accuracies for MAST and TFM are provided in Table 2. The heritabilities of MAST (0.032 and 0.014 for first and later lactations, respectively) are in the range of reported estimates from other studies using LM. In a review by Heringstad et al. (2000), estimates of heritabilities of clinical mastitis from 13 studies based on Nordic data were between 0.001 and 0.06, with most values falling in the interval from 0.02 to 0.03. Other estimates reported for first lactation range from 0.02 to 0.06 (Rupp and Boichard, 1999; Sørensen et al., 2000; Hansen et al., 2002; Lassen et al., 2003; Carlén et al., 2004). Few studies have taken later parities into account, and results are inconsistent. Pösö and Mäntysaari (1996) found higher heritabilities for lactations 2 and 3 in comparison with lactation 1, whereas Nielsen et al. (1997) did not find any differences in estimates between lactations. Heritability estimates on the linear scale are, however, influenced by frequency level, and estimates from different studies are, therefore, not easily comparable (Emanuelson, 1988; Heringstad et al., 2000).
In our study, the heritability for both traits decreased with increasing lactation number. This was mainly an effect of increasing residual variances for MAST and decreasing sire variances for both MAST and TFM. The lower heritability for later lactations could also partly be explained by culling in first (and second) parity.
The accuracy of predicted breeding values for the first lactation was only slightly higher for TFM (0.76) than for MAST (0.74), whereas it was considerably higher for later lactations. Nevertheless, if everything else is constant, a 2% increase in accuracy will give a 2% increase in genetic gain. Based on the increased accuracies, a gain in genetic progress would be expected by analyzing mastitis with SA instead of with LM, especially for later parities.
Although the difference in accuracy between TFM and MAST was small in first lactation, TFM can still be considered a more suitable trait. First of all, bias should be reduced, as cows that are culled for other reasons than mastitis before they got a chance to express mastitis are no longer considered the same as healthy cows, which is the case when the trait MAST is analyzed. These other culling reasons may not be completely uncorrelated to mastitis and could, therefore, introduce a bias. For example, a cow culled because of another disease 10 d after calving is assigned 0 if the trait MAST is analyzed with LM; thus, it is considered as good as a cow not acquiring mastitis within the whole study period (150 d). If this cow had survived, it might have contracted mastitis on d 11 after calving. When TFM is analyzed with SA, this cow will be censored with the censoring time 20 d.
Furthermore, less information is lost with TFM. This is partly due to the fact that cows with mastitis get different TFM values, and so do cows without cases. Also, when MAST is analyzed in the traditional way and data are cut off at the end of data collection, cows with less than, for example, 150 d of lactation and no reported mastitis case are either assigned 0, i.e., healthy, or their observations are treated as missing. With TFM, these observations instead become censored. Moreover, with the trait definitions used in this study, a cow getting mastitis 151 d after calving is assigned 0 and considered healthy when MAST is analyzed, whereas the same cow would be uncensored and treated as diseased with TFM. The fact that TFM, although only considering the first mastitis case, includes cases after 150 d and is a more continuously distributed variable might partly explain its higher heritability.
A feature to point out with TFM is that there will be variation in the accuracy of sire breeding values because of the proportion of uncensored daughters for each sire. Sires with a large proportion of daughters with mastitis will get more accurate breeding values. For first lactation in our data, the percentage of daughters with mastitis (uncensored records) varied between 0 and 34%, with a mean of 15%.
Because the trait time to first mastitis is more continuous and distinguishes between cows within the healthy group and within the affected group, one might wonder why this trait could not be analyzed with LM instead of with SA. However, the use of TFM is intimately linked to the possibility of handling censoring. Without censoring, culled animals would have to be given a TFM equal to the culling time, even though they did not have mastitis. Similarly, cows without mastitis but with a subsequent calving would get a TFM corresponding to that period. Both of these types of cows would be considered to have actually contracted mastitis at these times, a feature that is clearly undesirable. In SA, on the other hand, healthy and sick animals are clearly distinguishable.
Correlations Between Predicted Breeding Values
Correlations between EBV obtained with LM and SA were 0.93, 0.89, and 0.88 for lactations 1 to 3, respectively. This result implies that reranking among bulls occurred when the different methods were used. The somewhat lower correlations for later lactations correspond to the decreasing heritabilities and accuracies with increasing parity for both traits. Reranking of sires is illustrated in Table 3, which lists the top 10 sires based on EBV from LM and their ranking based on EBV from the SA. Depending on the lactation, the top 10 sires differed by 3 or 4 sires across the 2 rankings. A selection scheme involving EBV from LM would thus select some bulls that were significantly lower for mastitis resistance according to the SA.
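The two summaries used here, the correlation between EBV from the two methods and the number of sires that differ between the top lists, can be computed as in the sketch below. The sire identifiers and EBV values are invented for illustration, and both sets of EBV are assumed to be expressed so that a lower value is more favorable (less mastitis or lower risk).

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def top_k(ebv, k):
    """The k sires with the lowest (assumed most favorable) EBV."""
    return sorted(ebv, key=ebv.get)[:k]


def reranking(ebv_lm, ebv_sa, k=10):
    """Return (correlation, number of sires that differ between the top-k lists)."""
    sires = sorted(set(ebv_lm) & set(ebv_sa))
    r = pearson([ebv_lm[s] for s in sires], [ebv_sa[s] for s in sires])
    shared = set(top_k(ebv_lm, k)) & set(top_k(ebv_sa, k))
    return r, k - len(shared)


# Invented EBV for five sires under the two methods.
ebv_lm = {"A": -0.03, "B": -0.02, "C": -0.01, "D": 0.00, "E": 0.02}
ebv_sa = {"A": -0.30, "B": -0.05, "C": -0.20, "D": -0.25, "E": 0.15}
r, n_different = reranking(ebv_lm, ebv_sa, k=2)
```

A high overall correlation can coexist with several changes at the top of the list, which is why both quantities are worth reporting.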
| CONCLUSIONS |
|---|
| ACKNOWLEDGEMENTS |
|---|
Received for publication May 5, 2004. Accepted for publication October 12, 2004.
| REFERENCES |
|---|