
* Department of Animal Breeding and Genetics, and
Department of Clinical Sciences, Swedish University of Agricultural Sciences, SE-750 07 Uppsala, Sweden
1 Corresponding author: Emma.Carlen{at}hgen.slu.se
| ABSTRACT |
|---|
Key Words: genetic evaluation, mastitis, simulation, survival analysis
| INTRODUCTION |
|---|
The approach currently used for national (Denmark, Finland, Norway, and Sweden) and international genetic evaluation of clinical mastitis is to apply a cross-sectional linear model (LM) to an all-or-none trait (Interbull, 2005). The approach is relatively straightforward, but it involves some obvious disadvantages. Mastitis in a defined observation period of lactation is considered as a binary trait, distinguishing between cows with at least one case of mastitis (1) and cows without cases (0). In the Swedish national genetic evaluation, the information used consists of veterinary-treated cases of clinical mastitis from 10 d before to 150 d after calving, or culling because of mastitis within that period. The restricted period was introduced to give all cows the same opportunity period and to reduce bias because of culling occurring in the later part of lactation. This trait definition means that only the first case of mastitis is considered, if it occurs within the defined observation period, and no distinction is made between cows with a case of mastitis early or late in lactation (equally "bad"). By excluding cases other than the first, and by ignoring the timing of the case or the period at risk, some of the available information is not used. In addition, with this methodology, incomplete and ongoing records cannot be treated properly, which might further limit the amount of information used or, even worse, introduce potential bias in the genetic evaluation. Loss of information occurs by treating cows culled before the end of the observation period as missing. However, if these cows instead are included in the analysis by treating them as healthy observations, which is currently being done (e.g., in the Swedish genetic evaluation), bias might be introduced if the reason for culling is correlated with mastitis. Moreover, such observations are not distinguished from cows that did not contract mastitis during the whole period.
Another potential disadvantage of the LM methodology for analysis of binary mastitis data is that the assumption of normally distributed observations is not fulfilled. A nonlinear threshold model (TM), which takes the binary character into account, would be theoretically more appropriate (Gianola, 1982; Mäntysaari et al., 1991). Several studies have used the TM approach for genetic analysis of clinical mastitis data, but the method is not in routine use (Interbull, 2005). Most of the studies have used cross-sectional TM (e.g., Kadarmideen et al., 2000; Heringstad et al., 2001), in which, similar to the cross-sectional LM, only a single response is used for each animal. Thus, the problem with the inefficient use of available information and the risk for introducing bias caused by handling of incomplete and ongoing records is not expected to be solved by using cross-sectional TM instead of cross-sectional LM. However, longitudinal TM (Heringstad et al., 2003; Chang et al., 2004) and multivariate TM (Heringstad et al., 2004) that take multiple cases and time aspects into account and can handle ongoing and incomplete records have recently been applied to mastitis data.
Survival analysis (SA) is a statistical method of studying the occurrence and timing of specific events, in which the analyzed response time equals the time elapsed from a starting point until the occurrence of the event of interest (Ducrocq, 1987). Observations in which a competing event occurs before the event of interest can still be included in the analysis by treating them as censored. Another positive feature of SA is the possibility of including time-dependent covariables to model environmental effects (e.g., stage of lactation) more precisely. Within the field of genetic evaluation of dairy cattle, SA has been successfully used for traits with longitudinal distribution such as longevity traits, for which many countries currently use this method in routine genetic evaluations (Interbull, 2005), and interval fertility traits, such as from calving to last insemination (Schneider et al., 2005).
Similar to longitudinal TM and multivariate TM, some of the disadvantages mentioned that are connected to cross-sectional LM and TM when mastitis is analyzed are expected to be overcome when SA is used. One advantage of SA for mastitis data is that more of the available information is used by including the timing of the case or the length of the opportunity period. There is no need to introduce an arbitrary end to the observation period, such as that of 150 d, to equalize the opportunity period. Another advantage is that cows without cases are treated as censored observations and only the information that these cows did not contract mastitis until the time of censoring is included; after this point, we have no more information. Censoring is a more appropriate way of treating incomplete and ongoing records, and it could reduce the potential bias occurring with the cross-sectional LM and TM when cows culled before they have a chance to express mastitis are treated as healthy observations.
The use of SA to analyze time to first mastitis (TFM) has also been reported (Saebø and Frigessi, 2004; Carlén et al., 2005; Saebø et al., 2005). In a study by Carlén et al. (2005), SA was shown to be an alternative to genetic evaluation of clinical mastitis because TFM in the field data analyzed with SA gave a higher accuracy of predicted breeding values (PBV) than did a binary mastitis trait analyzed with LM. A simulation study in which true breeding values (TBV) can be simulated and thereafter correlated with PBV from different methods would give a complementary indication of the usefulness of SA for genetic evaluation of mastitis.
The main objective of this study was therefore to investigate by simulation whether the trait TFM analyzed with SA would result in a more precise genetic evaluation for mastitis resistance than the more commonly used cross-sectional LM and TM methodologies, in which a binary mastitis trait is analyzed. An additional objective was to study how the length of the observation period for mastitis (the first 150 d of lactation and full lactation, respectively) would affect the results.
| MATERIALS AND METHODS |
|---|
Simulation Process
The purpose of the simulation was to create the possible event of a mastitis case within lactation for each cow by simulating sire and cow breeding values for mastitis liability on the underlying scale. The simulation process was a further development of the simulation by Schneider et al. (2005), in which the reproductive cycle of first-parity cows was simulated. The reproductive cycle was included in our simulation to obtain the length of the calving interval or the day of culling for fertility reasons. These events were associated with the milk production of the cows: cows with greater production had more opportunities to become pregnant (i.e., a higher number of inseminations allowed) and, once the decision to cull for poor fertility was made (i.e., when the cow reached either the maximum waiting period or the maximum number of inseminations allowed without becoming pregnant), were allowed to remain in the herd for longer.
Four traits were simulated: 305-d milk production (kilograms), interval between calving and first ovulation (days), conception liability, and mastitis liability. The phenotypic mean values for milk production and interval between calving and first ovulation were 8,000 kg (SD 1,000) and 28 d (SD 15), respectively. Conception and mastitis were simulated as binary traits with underlying normally distributed liabilities for the respective event, with phenotypic means of zero for both traits and standard deviations of 1 for conception liability and 0.6 for mastitis liability [i.e., N(0,1) and N(0,0.6), respectively]. The latter variation was before the day-to-day variation was added (see ensuing discussion).
Genetic parameters used to simulate breeding values for the parents and Mendelian sampling terms are shown in Table 1. For milk production and the 2 fertility traits, parameters were identical to those in Schneider et al. (2005). The variances for mastitis liability were chosen to give a heritability estimate in the range normally found from cross-sectional LM (Heringstad et al., 1997; Kadarmideen et al., 2000; Carlén et al., 2004).
A small percentage of all cows (1.8%) were culled within d 10 of lactation for calving-related reasons. This early culling was simulated using a Weibull distribution. The rest of the cows were tested for heat after a voluntary waiting period of 8 wk. Cows detected in heat (60% heat detection rate) were inseminated. The first insemination day was allowed to vary between herds, with a mean value of 56 d (SD 3). If conception liability was above the threshold zero (50% conception rate), the cow was considered pregnant and the length of the calving interval was created as the number of days until the last insemination plus a gestation period with a mean value of 280 d (SD 5). The gestation period was only allowed to vary between 265 and 295 d. Cows that did not become pregnant either because they had reached the maximum waiting period or the maximum number of inseminations allowed were culled for fertility reasons. The maximum number of inseminations allowed was connected to herd, with a mean value of 5 inseminations (SD 1), and also to the milk production of the cow relative to her herd mates, with an increase of 1.5 inseminations per 1,000-kg increase in milk. The day of culling was also connected to production of the cow and calculated as a mean value of 240 d (SD 15) plus 20 times the production deviation (in SD) of this cow from the average production.
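The production-dependent culling rules in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation code: the function names are hypothetical, the herd-level limit and the base culling day are assumed to be drawn by the caller from N(5, 1) and N(240, 15), respectively, and truncating the gestation length by resampling is an assumption, because the text only states the allowed range.

```python
import random


def insemination_limit(herd_limit, milk_kg, herd_mean_milk_kg):
    """Maximum number of inseminations allowed for one cow.

    herd_limit is the herd-level limit (mean 5, SD 1 in the text); the limit
    rises by 1.5 inseminations per 1,000 kg of milk above the herd mates.
    Rounding to a whole number is left to the caller (not stated in the text).
    """
    return herd_limit + 1.5 * (milk_kg - herd_mean_milk_kg) / 1000.0


def fertility_culling_day(base_day, milk_kg, mean_milk_kg=8000.0, sd_milk_kg=1000.0):
    """Day of culling for poor fertility.

    base_day is a draw with mean 240 d (SD 15); 20 d are added per SD of
    production deviation from the average production.
    """
    return base_day + 20.0 * (milk_kg - mean_milk_kg) / sd_milk_kg


def gestation_length(rng):
    """Gestation length from N(280, 5), restricted to 265 to 295 d.

    Resampling out-of-range draws is an assumption of this sketch.
    """
    while True:
        days = rng.gauss(280.0, 5.0)
        if 265.0 <= days <= 295.0:
            return days


if __name__ == "__main__":
    rng = random.Random(1)
    print(insemination_limit(5.0, 9000.0, 8000.0))
    print(fertility_culling_day(240.0, 9000.0))
    print(gestation_length(rng))
```

Under these assumptions, a cow producing 1,000 kg more than her herd mates is allowed 1.5 additional inseminations and stays 20 d longer before being culled for infertility.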
Analyzed Traits
Phenotypic observations of mastitis were defined both as a binary trait and as TFM within 2 defined observation periods, first 150 d of lactation and full lactation, respectively. The binary trait distinguished between cows with a case of mastitis (1) and cows without cases (0) during the defined observation periods. Cows without mastitis that were culled were included in the analysis as healthy (0). The trait TFM was measured as the number of days from calving to the day of the first mastitis case (uncensored observation). Cows without mastitis were censored at the day of culling, or at lactation d 150 (shorter observation period), or at the day of the subsequent calving (longer observation period). Cows could be culled early in lactation (within 10 d after calving) for calving-related reasons or later on because of infertility. Because cows culled are indicated by the censoring variable when TFM is analyzed, it is not necessary to use a restricted time period to reduce bias.
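The two trait definitions can be derived from the same simulated events, as the following Python sketch illustrates. The function name and the tuple layout are hypothetical; the sketch assumes days are counted from calving, that a missing event is represented by `None`, and that a case occurring on the last day of the observation period still counts as a case.

```python
from typing import Optional, Tuple


def define_traits(first_case_day: Optional[int],
                  culling_day: Optional[int],
                  period_end: int) -> Tuple[int, int, int]:
    """Return (binary_trait, tfm_days, uncensored) for one cow.

    period_end is 150 for the shorter observation period, or the day of the
    subsequent calving for the full-lactation period.
    """
    # The observation stops at culling or at the end of the period.
    end = period_end if culling_day is None else min(culling_day, period_end)

    if first_case_day is not None and first_case_day <= end:
        # Mastitis observed: binary trait 1, uncensored time to first mastitis.
        return 1, first_case_day, 1

    # No mastitis observed: scored healthy (0) in the LM and TM,
    # censored at the end of the observation in the SA.
    return 0, end, 0


if __name__ == "__main__":
    print(define_traits(53, None, 150))    # case within 150 d
    print(define_traits(200, None, 150))   # case after d 150, short period
    print(define_traits(200, None, 380))   # same cow, full lactation
    print(define_traits(None, 90, 150))    # culled at d 90 without mastitis
```

The last call shows the difference between the approaches: the cow culled at d 90 is recorded as healthy for the binary trait, whereas the SA only uses the information that she was free of mastitis for 90 d.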
Statistical Analysis
The binary mastitis trait, observed during the first 150 d of lactation and full lactation, respectively, was analyzed with both LM analysis (LM150, LMFULL) and TM analysis (TM150, TMFULL), whereas TFM, observed during the same periods as the binary trait, was analyzed with SA (Weibull proportional hazards model; SA150, SAFULL). The same effects (mean and random herd and sire effects) were used in all 3 models to allow for a better comparison among them. In addition, a lactation stage effect was included in the Weibull model to account for the early high risk. This was done by including 2 stages of lactation, before and after 10 d after calving, as a time-dependent effect using the statement "timecov" in Survival Kit. The mean in the Weibull model corresponds to an average hazard over time, defined as $\lambda_0(t)$, the Weibull baseline hazard function [$\lambda\rho(\lambda t)^{\rho-1}$] with a positive scale parameter ($\lambda$) and a positive shape parameter ($\rho$). A value of $\rho < 1$ indicates that the hazard decreases with time, whereas $\rho > 1$ means that the hazard increases with time. The herd effect was assumed to be normally distributed in the LM and the TM analyses, whereas in SA it was assumed to follow a log-gamma distribution and was integrated out from the joint posterior density. The sire effect was assumed to be normally distributed for all 3 models. Sires were assumed to be unrelated. Another Weibull model was also applied for SA150 and SAFULL in which the effect of lactation stage was excluded.
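To illustrate how the shape parameter governs the direction of the baseline hazard, the following Python sketch evaluates a Weibull hazard under the common parameterization λ0(t) = λρ(λt)^(ρ−1), with scale λ and shape ρ. The function name and the numerical value of the scale parameter are hypothetical and chosen only for illustration.

```python
def weibull_baseline_hazard(t, scale, shape):
    """Weibull baseline hazard: scale * shape * (scale * t) ** (shape - 1).

    Both parameters and the time t (days from calving) must be positive.
    """
    if t <= 0 or scale <= 0 or shape <= 0:
        raise ValueError("t, scale, and shape must all be positive")
    return scale * shape * (scale * t) ** (shape - 1.0)


if __name__ == "__main__":
    scale = 0.001  # hypothetical value, for illustration only
    for shape in (0.6, 1.0, 1.5):
        early = weibull_baseline_hazard(10, scale, shape)
        late = weibull_baseline_hazard(300, scale, shape)
        if late < early:
            trend = "decreasing"
        elif late > early:
            trend = "increasing"
        else:
            trend = "constant"
        print(f"shape={shape}: hazard is {trend} between d 10 and d 300")
```

With a shape parameter below 1 the hazard falls over the lactation, with a value of exactly 1 it is constant, and with a value above 1 it rises.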
To obtain REML estimates of the variance components and breeding value predictions, the DMU package (Madsen and Jensen, 2000) and ASReml (Gilmour et al., 2002) were used for the LM and TM, respectively. The method for the TM implemented in ASReml was essentially that of Gianola and Foulley (1983) and Harville and Mee (1984), and the binomial distribution with a probit link function was used. The heritability was calculated as
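$$h^2 = \frac{4\sigma_s^2}{\sigma_s^2 + \sigma_h^2 + \sigma_e^2} \qquad [1]$$
where $\sigma_s^2$, $\sigma_h^2$, and $\sigma_e^2$ are the sire, herd, and residual variances, respectively. For the TM, $\sigma_e^2$ was set to 1. For a transformation of the heritability estimate from the LM to the underlying liability scale, the following formula (Dempster and Lerner, 1950) was used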
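$$h_u^2 = \frac{h_o^2\,(1 - p)}{i^2\,p} \qquad [2]$$
where $h_u^2$ and $h_o^2$ are the heritabilities on the underlying and observed scales, respectively, p is the incidence of mastitis, and i is the mean liability of affected individuals.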
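The transformation to the liability scale can be checked numerically with a short Python sketch. It assumes the usual Dempster and Lerner (1950) form, h²(underlying) = h²(observed) × (1 − p) / (i² p), with i obtained from the standard normal density at the threshold divided by the incidence; the function name is hypothetical.

```python
from statistics import NormalDist


def liability_heritability(h2_observed, incidence):
    """Transform heritability from the observed (0/1) scale to the liability scale."""
    if not 0.0 < incidence < 1.0:
        raise ValueError("incidence must lie strictly between 0 and 1")
    standard_normal = NormalDist()
    # Threshold on the liability scale that leaves `incidence` in the upper tail.
    threshold = standard_normal.inv_cdf(1.0 - incidence)
    # Mean liability of affected individuals, in SD units.
    i = standard_normal.pdf(threshold) / incidence
    return h2_observed * (1.0 - incidence) / (i * i * incidence)


if __name__ == "__main__":
    # Rounded heritabilities and incidences reported in the text for LM150 and LMFULL.
    print(round(liability_heritability(0.027, 0.107), 3))
    print(round(liability_heritability(0.036, 0.167), 3))
```

With the rounded estimates reported in the text, the sketch gives values near the 7.5 and 7.9% reported for LM150 and LMFULL on the liability scale; small differences are expected because the inputs are rounded.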
Survival Kit V3.12 (Ducrocq and Sölkner, 1998) was used to estimate the variance components for sire and herd and to predict breeding values for SA. The heritability was calculated as
$$h^2 = \frac{4\sigma_s^2}{\sigma_s^2 + \dfrac{1}{1 - c}} \qquad [3]$$
where c is the proportion of censored records.
Yazdi et al. (2002) suggested that this derivation for heritability on the original scale (Equation 3), which is not dependent on the Weibull parameters, be called the equivalent heritability. They showed very good agreement between accuracy and selection response calculated using $h^2_{equ}$ and observed accuracies and responses calculated from their simulation. The term "equivalent" refers to the fact that the PBV of a sire with n daughters would have the same reliability as if it were evaluated on a linear trait with this heritability. An increase in the proportion of uncensored records with time implies that the equivalent heritability increases with time until it reaches the effective heritability, [$h^2_{eff} = 4\sigma_s^2/(\sigma_s^2 + 1)$], the heritability one would obtain in the total absence of censoring.
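The relation between the equivalent and the effective heritability can be illustrated with a small Python sketch. It assumes the form 4σs²/(σs² + 1/(1 − c)) for the equivalent heritability, which reduces to the effective heritability 4σs²/(σs² + 1) when no records are censored (c = 0). The function names and the sire variance used in the example are hypothetical.

```python
def equivalent_heritability(sire_variance, censored_proportion):
    """Equivalent heritability, assuming 4*s2 / (s2 + 1/(1 - c))."""
    if not 0.0 <= censored_proportion < 1.0:
        raise ValueError("censored_proportion must be in [0, 1)")
    uncensored = 1.0 - censored_proportion
    return 4.0 * sire_variance / (sire_variance + 1.0 / uncensored)


def effective_heritability(sire_variance):
    """Heritability in the total absence of censoring: 4*s2 / (s2 + 1)."""
    return 4.0 * sire_variance / (sire_variance + 1.0)


if __name__ == "__main__":
    sire_variance = 0.05  # hypothetical value, for illustration only
    for c in (0.893, 0.833, 0.5, 0.0):
        h2 = equivalent_heritability(sire_variance, c)
        print(f"censored={c:.3f}  equivalent h2={h2:.4f}")
    print(f"effective h2={effective_heritability(sire_variance):.4f}")
```

As the proportion of censored records falls, the equivalent heritability rises toward the effective heritability, which matches the behavior described in the paragraph above.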
Comparison of Methods
The main approach for comparison of the 6 analyses (3 different methods and 2 different observation periods) was to calculate Pearson product-moment correlations (SAS Institute, 1999) between the sire PBV from each analysis (LM150, LMFULL, TM150, TMFULL, SA150, SAFULL), respectively, and the sire TBV for mastitis liability. In addition, we compared the average true genetic merit for the best and worst 10% of bulls ranked on PBV from the different analyses and also the proportion of the best or worst 10% of bulls based on TBV that were correctly identified to be in the best or worst 10% based on PBV from the different analyses. Further, the theoretical accuracy (r) in selection was calculated for the different analyses according to
$$r = \sqrt{\frac{n}{n + k}} \qquad [4]$$
where n corresponds to the average number of daughters per sire (n = 150 or n = 60) and
$$k = \frac{4 - h^2}{h^2}$$
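The theoretical accuracy can be reproduced with a few lines of Python, assuming the standard progeny-test expression r = sqrt(n / (n + k)) with k = (4 − h²)/h². The function name is hypothetical; the heritabilities in the example are the rounded LM estimates quoted in the Results.

```python
import math


def theoretical_accuracy(h2, n_daughters):
    """Accuracy of a sire PBV based on n daughters: sqrt(n / (n + k)), k = (4 - h2) / h2."""
    if not 0.0 < h2 <= 1.0 or n_daughters <= 0:
        raise ValueError("h2 must be in (0, 1] and n_daughters must be positive")
    k = (4.0 - h2) / h2
    return math.sqrt(n_daughters / (n_daughters + k))


if __name__ == "__main__":
    for label, h2 in (("LM150", 0.027), ("LMFULL", 0.036)):
        for n in (150, 60):
            r = theoretical_accuracy(h2, n)
            print(f"{label}, n={n}: r={r:.2f}")
```

With 150 daughters per sire, heritabilities of 2.7 and 3.6% give r = 0.71 and r = 0.76, the values reported for LM150 and LMFULL.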
| RESULTS AND DISCUSSION |
|---|
), which is discussed.
Descriptive Statistics
When observed during the first 150 d of lactation, the average incidence of the binary mastitis trait was 0.107 (SD 0.003, range 0.100 to 0.114). This figure is in good agreement with a recent estimate of the incidence of mastitis in field data of Swedish first-parity cows in which mastitis was defined in a similar way (Carlén et al., 2005). For SA150, the average proportion of uncensored records (i.e., cows with mastitis within 150 d) was consequently also 10.7%. The average failure time (i.e., the number of days until the first case of mastitis) was 53 d and the average censoring time was 147 d.
When observed during the full lactation, the average incidence of the binary mastitis trait increased to 0.167 (SD 0.005, range 0.157 to 0.177), which corresponded to the average proportion of 16.7% uncensored records during full lactation in the SA. Here the average failure time was 124 d and the average censoring time was 347 d. The corresponding numbers from the study by Carlén et al. (2005), in which SA was used to analyze TFM during full lactation in field data, were 15% uncensored records, 123 d for average failure time, and 364 d for average censoring time. The distribution of the cows' first mastitis cases over lactation from one replicate of TFM observed during full lactation can be seen in Figure 1.
The estimates of the Weibull shape parameter ($\rho$) for SA150 and SAFULL were, on average, 0.91 and 0.77, respectively. These results indicate a decreasing risk of contracting mastitis with time within lactation and reflect the higher risk simulated for the first 10 d of lactation. The corresponding values of $\rho$ without a fitted lactation stage effect in the model were, on average, 0.63 and 0.61, respectively. The reason $\rho$ moves closer to 1 with a fitted lactation stage effect in the model is that the higher risk of contracting mastitis at the beginning of lactation is accounted for in the lactation stage effect instead of in the baseline hazard. A high risk at the beginning of the lactation, followed by a low and nearly constant risk in the remaining part of the lactation, has been reported in the literature (Barkema et al., 1998; Heringstad et al., 1999; Carlén et al., 2004). In field data of TFM during full lactations without a fitted lactation stage effect in the model, $\rho$ was estimated at 0.6 in the first lactation and 0.7 for the second and third lactations (Carlén et al., 2005).
Heritabilities and variance components estimated from the different analyses are presented in Table 2. The greater heritability estimates obtained for TM150 and TMFULL (7.4 and 8.2%, respectively) than for LM150 and LMFULL (2.7 and 3.6%, respectively) were expected because heritability is greater on the underlying liability scale than on the observed scale. Heringstad et al. (1997) compared the 2 methods of analyzing mastitis data and found approximately 2 times greater heritability estimates from the TM than from the LM (0.09 compared with 0.05). Kadarmideen et al. (2000) estimated heritabilities for clinical mastitis of 0.126 and 0.038 when using TM and LM, respectively. Transformed to the underlying liability scale, the heritability estimates from the LM150 and LMFULL (7.5 and 7.9%, respectively) were close to the estimates from the TM.
The theoretical accuracies calculated for LM150 and LMFULL (r = 0.71 and r = 0.76, respectively) and for SA150 and SAFULL (r = 0.71 and r = 0.76, respectively) were in very good agreement with the corresponding correlations between TBV and PBV (Table 3, row 1), which can be seen as the true accuracy. This result was evidence that the simulation worked satisfactorily, and for SA it verified the usefulness of the defined equivalent heritability. When heritability was defined on the underlying liability scale (i.e., for TM150 and TMFULL) and for the heritability estimates from LM transformed to the underlying liability scale, the theoretical accuracy and the corresponding correlations differed, with the theoretical accuracy being markedly higher (r = 0.86 to 0.87). Our results confirm the discussion in Boettcher et al. (1999) that although the heritability is greater when defined on the underlying liability scale, selection based on PBV from the TM may not yield higher genetic progress on the observed scale than selection on PBV from the LM. Foulley (1992) demonstrated that heritability on the observed scale should be used in calculating accuracy from a TM. Results from Kadarmideen et al. (2000) also suggest that accuracy of selection would be lower with the TM than the LM for the same heritability, based on the standard errors for the estimated heritabilities with the respective methods.
The fact that the calculated correlations between TBV and PBV for the LM and SA were very similar to the corresponding theoretical accuracies calculated using the number of daughters and the estimated heritabilities also verified that the heritabilities were estimated correctly. This is otherwise difficult to ascertain by comparing the estimated heritabilities with the simulated heritability for mastitis liability. For the SA, we observed another trait (TFM) and modeled the hazard using an exponential model. For the binary trait analyzed using LM or TM, one might consider that it would be possible to compare the heritability on the underlying liability scale with the simulated heritability. This would have been possible if we had drawn only one sample from a normal distribution and assigned a 0 or 1 depending on whether it was below or above the threshold. When using the distribution including both permanent and day-to-day variation, this would correspond to observing whether the cow had mastitis on a random lactation day. The underlying heritability would then have been expected to be 0.036 (with herd variance included in the denominator; Table 1). However, we now draw samples until culling or next calving occurs (a sequence of zeros) or until the value exceeds the threshold (a sequence of zeros followed by a 1). The exact true heritability of these observations is not easily calculated. However, for our purposes it is also unnecessary, because we have shown that the estimated heritabilities were appropriate for use in calculating the theoretical accuracy of the PBV.
Correlations between TBV for mastitis liability and PBV from the different analyses using an average daughter group size of 60 are also shown in Table 3. The smaller daughter group size did not change the conclusion of the comparison of methods, but the general level of the correlations was lower compared with when an average daughter group size of 150 was used.
The average TBV for mastitis liability of the best and worst 10% of bulls ranked on PBV from the different analyses are presented in Table 4. The best bulls ranked using full lactation data had an average true genetic merit that was lower than the best bulls ranked using data from the first 150 d, regardless of the method. Lower genetic merit implies less mastitis, thus better mastitis resistance. When comparing methods within an observation period, there was only a slight advantage of TM or SA over LM. The proportion of the best or worst 10% of bulls based on TBV that were correctly identified to be in the best or worst 10% based on PBV from the different analyses is given in Table 5. Again, the same trend is seen, and a greater proportion was correctly ranked when the full lactation data were used.
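The three comparison criteria (correlation between TBV and PBV, average TBV of the selected bulls, and proportion of the truly best bulls that are correctly identified) can be sketched in Python as follows. The function names and the toy data are hypothetical, and the sketch assumes that a lower value means better mastitis resistance for both TBV and PBV.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_indices(values, fraction=0.10):
    """Indices of the best bulls (lowest values) making up the given fraction."""
    m = max(1, round(len(values) * fraction))
    return set(sorted(range(len(values)), key=lambda i: values[i])[:m])


def proportion_correct(tbv, pbv, fraction=0.10):
    """Share of the truly best bulls that are also among the best on PBV."""
    true_best = best_indices(tbv, fraction)
    predicted_best = best_indices(pbv, fraction)
    return len(true_best & predicted_best) / len(true_best)


def mean_tbv_of_selected(tbv, pbv, fraction=0.10):
    """Average true breeding value of the bulls ranked best on PBV."""
    chosen = best_indices(pbv, fraction)
    return sum(tbv[i] for i in chosen) / len(chosen)


if __name__ == "__main__":
    tbv = [float(i) for i in range(20)]   # toy true breeding values
    pbv = list(tbv)
    pbv[1], pbv[5] = pbv[5], pbv[1]       # one ranking error among the best bulls
    print(pearson(tbv, pbv))
    print(proportion_correct(tbv, pbv))
    print(mean_tbv_of_selected(tbv, pbv))
```

The same functions apply to the worst 10% of bulls if the sign of both input vectors is reversed.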
Results from all 3 approaches for comparison of analyses were in agreement and indicated that, with the given trait definitions and data structure, little was gained by replacing the LM with the TM or SA, even though the latter 2 methods had theoretical advantages. However, given the simulated conditions, increasing the observation period substantially increased accuracy. With the variable observation period, the opportunity periods for cows differed, and of the tested methods, only the SA could theoretically account for that.
In the simulation, culling was not directly related to mastitis liability. In real data, cows may be culled because of high SCS without being treated for mastitis. These cows would be considered as healthy (0) in the LM or TM, and be censored in the SA. Should this situation lead to bias, the SA would be less affected.
In a previous study in which field data were used to compare the SA and LM for genetic evaluation of clinical mastitis (Carlén et al., 2005), the authors concluded that the accuracy of selection was about 3 and 25% higher for first and later lactations, respectively, for TFM analyzed with the SA than for a binary mastitis trait analyzed with the LM. These results in the field study could partly be explained by the differences in trait definitions, because TFM is a more continuously distributed trait and includes cases after d 150 in lactation. Another contributing reason could be the culling for high SCS occurring in the field data. When using SA, culled cows are treated properly, which reduces potential bias. This might explain the greater benefit of SA in later lactations, in which more culling based on mastitis cases in previous lactations or high SCS can be expected. To study this aspect further, it would be interesting to extend the simulation to include the effect of culling because of high SCS.
In our study we simulated TFM from repeated draws from an underlying liability. Another option would be to simulate TFM directly from a Weibull distribution. It is possible that this latter approach would have favored the SA in comparison with the LM and TM, a possibility that could be studied further.
One disadvantage with TFM analyzed with the SA is that, similar to the binary mastitis trait, only the first case of mastitis within a lactation is considered. Another disadvantage with the SA is that multitrait analysis is not easy, and for an efficient genetic evaluation of clinical mastitis data, a multitrait analysis together with SCC would be desirable. However, there are ways to avoid this problem. In the French national genetic evaluation, breeding values for direct longevity are predicted with SA, and thereafter, combined longevity, based on direct longevity and several other traits, is computed using multitrait analysis (Interbull, 2005). Recent work has also shown that it is feasible to analyze a survival trait together with a normally distributed continuous trait or a threshold trait using a Bayesian approach and applying Gibbs sampling (Damgaard, 2005).
| CONCLUSIONS |
|---|
| ACKNOWLEDGEMENTS |
|---|
Received for publication February 10, 2006. Accepted for publication May 3, 2006.
| REFERENCES |
|---|