|
|
||||||||
1 Department of Chemistry, Biotechnology, and Food Science, and
2 Department of Animal and Aquacultural Sciences, Agricultural University of Norway, N-1432 Å s, Norway
Corresponding author: S. Sæbø; e-mail: solve.sabo{at}ikbm.nlh.no.
| ABSTRACT |
|---|
|
|
|---|
Key Words: genetic evaluation clinical mastitis survival analysis Wiener process
Abbreviation key: AIS = adjusted and integrated survival function, CM = clinical mastitis, NRF = Norwegian cattle, S = survival function (probability of no treatment of CM by time T)
| INTRODUCTION |
|---|
|
|
|---|
The time aspect can probably be even better exploited by using survival time models. Survival models are commonly used for analyzing time to disease data, not only in human health surveys but also within the area of animal breeding. The analysis of disease resistance in pigs presented by Henryon et al. (2001) is a recent contribution in this category. So far, survival analysis with complex hierarchical structures has been carried out by means of proportional hazards models in the animal breeding context (e.g., Ducrocq and Casella, 1996; Korsgaard et al., 1998). Our alternative is to adopt a model based on first passage times of stochastic processes, or more precisely, the competing Wiener process model of Sæbø and Almøy (2004a) and Sæbø et al. (2005).
In this model, one assumes that every cow at a given time is in some health state with a certain distance from disease outbreak. The term "distance" should not be interpreted too literally, as it is a measure with no defined biological scale, but the disease resistance of cows can still be compared on this scale. The unobservable physiological fight against infection can be approximated by a stochastic process operating in some kind of health state space. This state space describes the possible health states with regard to the disease in question and is assumed continuous with an absorbing state. An absorbing state is defined as a state from which the stochastic process cannot escape; hence, the process stops if it enters the absorbing state. In our context, we will let the event of process absorption represent CM contraction, and we will use Wiener processes with drift (Chhikara and Folks, 1989) to approximate the changes in a cows health state over time. The time from when the process is initiated until it is absorbed (the first passage time) is a stochastic variable following some known probability distribution, as described later.
The causative background of CM is complex and thus, advantageously, can be modeled by assuming several latent Wiener processes competing in reaching the absorbing state. It is well known from previous studies that the risk of CM is highest in early lactation (Pösö and Mäntysaari, 1996; Heringstad et al., 1999; Nash et al., 2000). The elevated risk in the first days after calving is probably connected to physiological changes known to be initiated a few days before calving (e.g., Kehrli et al., 1989). One theory is that increased levels of certain hormones result in a redistribution of blood cells in the body with increased levels in the bloodstream but reduced levels in other tissues such as the mammary glands (Kulberg et al., 2002). This will result in increased susceptibility to infections and mastitis. These physiological changes put the cow in a position where she is easily infected, i.e., in a health state close to disease.
There are several other risk factors in addition to the suppression of the immune system around calving. Many of these factors are related to the cows environment, such as hygiene and milking technique. Unfavorable levels of environmental factors may be more uniformly distributed over the lactation, resulting in a fairly constant risk of infection and disease.
The 2-fold risk pattern described previously, with a temporary increased risk around calving and a more or less constant risk throughout the lactation, was modeled by assuming 2 latent Wiener processes.
The objectives of this study were to analyze time to first treatment of CM for first-lactation NRF cows using a first passage time model and to examine relevant selection criteria for ranking and selection purposes.
| MATERIALS AND METHODS |
|---|
|
|
|---|
For each cow, the time variable was set to zero at d 31 before first calving. The observed event time was the day of first veterinary treatment of CM. A total of 24.5% of the cows had at least one case of CM during first lactation. Cows with no treatment of CM prior to 31 d before the second calving were right censored. Additionally, there were random right-censored observations caused by culling. Culling rates and CM frequency in the data are described by Heringstad et al. (2003).
Survival Model
In the competing process model (Sæbø and Almøy, 2004a; Sæbø et al., 2005) a Wiener process, W(t), is assumed initiated at time zero, in a "health state" c (with c > 0). The first passage time of the process is the time of the process reaching zero, which is an absorbing state, as described previously. A Wiener process may have drift toward the absorbing state. For a Wiener process with drift, the process increment from time t to t +
t is normally distributed with a mean, depending on the drift. More specifically, it is assumed that W(t +
t) W(t) ~ N(µ
t,
2
t), where µ > 0, is the drift parameter of the process, and
2 is the process variance coefficient. When µ > 0, we refer to the drift as positive (toward the absorbing state 0).
Under these conditions, the first passage time T (the time of process absorption) follows an inverse Gaussian distribution with parameters c and µ (Chhikara and Folks, 1989; Aalen and Gjessing, 2001). For a change in
2, there exist other values of µ and c giving equal distribution for T; hence,
2 is dependent on µ and c. Therefore, for simplicity,
2 was restricted to one in our analyses. The rate by which such processes are absorbed depends on how far from absorption the processes are initiated (value of c) and how "fast" the processes approach zero (value of µ). The risk of absorption increases if c is moved toward zero and/or µ is increased.
Because time of process initiation (T = 0) is unlikely to be equal to the starting point of any health process, the Wiener processes are assumed initiated at some arbitrary time point
. Then, the first passage time T follows a 3-parameter inverse Gaussian distribution given by (Chhikara and Folks, 1989)
![]() | ([1]) |
In the case of negative drift, i.e., µ < 0, the process drifts away from the absorbing state, and the distribution in equation 1 is not a usual probability density, as it will integrate to a value < 1. In a disease context, this would represent a situation where an individual is close to disease (small value of c) at the time of process initiation, but where the risk of absorption decreases as time passes.
The survival function (S) expressing the probability of no absorption (disease) by time t, is given by
![]() | ([2]) |
An exact expression for equation 2 can be found in Aalen and Gjessing (2001). The hazard function, given by h(t|c, µ,
) = f(t|c, µ,
)/S(t|c, µ,
), expresses the instantaneous risk of absorption at time t given that the process has not been absorbed prior to t.
If several risk factors are involved, it is convenient to approximate each factors impact on health by a Wiener process. In general, it can be assumed that J independent latent processes compete in reaching the absorbing state (disease). The observed first passage time ti for an individual i is equal to min(ti1,...,tiJ), i.e., the time of the first process to be absorbed. It can be shown (Sæbø et al., 2005) that the overall hazard at any time point t will consist of the sum of the process-specific hazards of all processes initiated by time t, i.e.,
![]() | ([3]) |
where
= {
1, ...,
J}, with
j = {cj, µj,
j} and I( ) is the indicator function taking the value 1 if its argument is true and 0 otherwise. Correspondingly, the overall survival function will be the product of the process-specific survival functions of all active processes at time t:
![]() | ([4]) |
The overall first passage time distribution is given by the relation
![]() | ([5]) |
The number J of latent processes may be determined by the data as the number maximizing some model-fitting criterion (Sæbø and Almøy, 2004a). Alternatively, the decision can be based on prior knowledge of the phenomenon. In the introduction, we argued for using 2 latent processes for describing the infection pattern of CM, and in the following, we, therefore, use J = 2.
The physiological struggle against disease in the period with elevated risk around calving was approximated by one Wiener process (process 1), initiated at an unknown time, t =
1. A second Wiener process (process 2), independent of process 1, initiated at another unknown time point t =
2, was used to model the constant fight against disease during lactation. Absorption of any of the 2 processes represents a case of CM.
For illustration purposes, the former 2 processes were simulated for 2 cows. Assuming parameter values as estimated in the results section, one obtained the individual health state values of Figure 1
. The first passage time, tij, specific for process j (j = 1, 2) of cow i (i = 1, 2) is the time from process initiation until the health status value reached zero (absorption). For cow 1, the overall first passage time t1 = min(t11, t12) was the time of latent process 2 to be absorbed (t1
27 d). For this cow, process 1 was never absorbed, as it drifted quickly away from absorption after it was initiated at time t =
1. For cow 2, process 2 drifted away from zero, whereas process 1 immediately after initiation was absorbed, giving a first passage time t2
34 d. For the parameter values chosen for this simulation, the overall hazard function is also indicated in Figure 1
, and one notes the dramatic, yet temporary, increase in risk of absorption just after the initiation of process 1, which is initiated close to the absorbing state but with a strong drift away from absorption.
|
![]() | ([6]) |
where
jh is the fixed effect of year of first calving and sjk is the random effect of sire k. The first level of the year effect was restricted to zero; thus, the actual effect of the first year was included in the constant term
j. The 245-element vector of sire effects was assumed to be multivariate normal (MVN). Hence,
j ~MVN(0, A
sj2), where
sj2 is the sire variance of process j and A is the additive genetic relationship matrix. The matrix A was constructed as follows. First, the (302 x 302) matrix à of additive relationships between all 302 sires in the pedigree file was found. From this matrix, the (245 x 245) submatrix A was extracted, corresponding to the 245 sires with daughter records. As commented by Korsgaard et al. (1998), this can be an alternative to using the entire relationship matrix à and the 302-element vector
of sire effects with augmented effects of sires without daughter records. The only cost is the loss of predicted values of sires without records.
All unknown quantities were estimated using Markov chain Monte Carlo (MCMC) methods. This requires construction of a likelihood function based on the observed data. The variable yi was defined to be either the time of first veterinary treatment of CM (ti,m) for infected cow i, or the censoring time (ti,c) if the cow did not contract mastitis. The observed data for cow i is defined as (yi,
i), where
i = I(yi = ti,m) is the censoring indicator, taking the value 1 if its argument is true, and 0 otherwise. A cow with observed CM will contribute f(ti,m|
) to the likelihood (or equivalently h(ti,m|
)S(ti,m|
) as given by equation 5), whereas a censored cow will have a contribution equal to S(ti,c|). The conditional likelihood may be constructed by taking the product over all individuals, conditioning on the sire effects s = {s1, s2}. Using equations 3 and 4, the conditional likelihood is given by
![]() | ([7]) |
A function proportional to the posterior density of
, s, and
= {
2s1,
2s2} was constructed by
![]() | ([8]) |
where the priors p(
) and p(
2 s) express our prior beliefs regarding the components of
and of the sire variance components
Except for the elements of the sire vectors sj (j = 1, 2), all unknown quantities were assumed independent. Hence, the priors p(
) and p(
) in equation 8 can be replaced by the product of the prior densities of all of the independent components of and
Proper, vague priors were assumed for all parameters. For the inverse of
2sj, gamma priors were used (Gamma(0.001, 0.001)), and for cj, log-normal priors were assumed (mean 0 and variance 1000 for the natural logarithm of cj). All other parameters in the model were assumed normally distributed with mean 0 and variance 1000. Various choices for the dispersion of the priors were considered, but it appeared that the likelihood dominates the posterior distribution because of the amount of data for most parameters. Sæbø et al. (2005) found only minor prior sensitivity for the sire variance components.
Because of the complexity of the joint posterior distribution, conditional posterior distributions for single parameters cannot be obtained in closed form. Draws from the conditional posterior distributions were therefore obtained by combining single component and block-wise updating using the Metropolis-Hastings algorithm (Metropolis et al., 1953) at each step.
A total of 110,000 iterations of the Metropolis-Hastings algorithm were run. In each iteration, values of all unknown parameters were in turn drawn from their respective conditional distributions. Based on visual inspection of the trace plots, 10,000 iterations were discarded as burn-in. After burn-in, 100,000 iterations were run, where only every 10th value of each parameter was saved to save computer space. Statistical inference was based on the resulting 10,000 samples.
Sire Evaluation Criteria
The predicted sire effects can be used to identify sires with daughters showing the weakest drift toward disease for each latent process. For sire-ranking purposes, it would, however, be useful to summarize the information from the 2 processes into a single measure. Two such measures based on the predicted sire effects were computed: the probability of no CM by a defined time
and an integrated survival function. Sire ranking based on these 2 measures were compared with corresponding rankings from the multivariate threshold model of Chang et al. (2002) and from the proportional hazards model of Sæbø and Frigessi (2004).
Probability of no CM by time
.
The probability of no CM during first lactation is given by the value of the survival function at the end of the lactation. Here, we define t =
as the lactation end point (for some reasonable
>0). Posterior distributions of these probabilities may be computed by exploiting the output from the MCMC algorithm. At iteration g of the algorithm (g = 1...10,000), a set of sire effects s1,kg and s2,kg for sire k, are drawn from the respective posterior distributions. In addition, values of all other model parameters are also obtained. Let
comprise the sampled values of all parameters relevant for sire k at iteration g, including the sire effects. Sire-specific survival probabilities are computed as
![]() | ([9]) |
for all g. From these values, the posterior density of Sk(
) can be estimated for all sires, and the estimated posterior means may be used as a selection criterion.
Adjusted integrated survival function.
An alternative to the Sk(
) criterion would be to evaluate sires according to a criterion based on the area under the survival curve. It can be shown (Sæbø and Almøy, 2004b) that an adjusted integrated survival function AISk(
) has the following pleasant property:
![]() | ([10]) |
Hence, AISk(
) is equal to the conditional expectation of the inverted hazard function. A large value of AISk(
) means that sire k has a favorable hazard pattern over the time interval [0,
]. The value of AISk(
) may be computed by a simple numerical integration routine (e.g., the rectangle rule) at each step of the MCMC algorithm and estimated posterior means may be used for sire evaluations.
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
1 is 30.4, indicating that process 1 is initiated just before calving. The initial health state of this process is
= 1.75, which is close to absorption relative to process 2, for which the estimated initial state is
= 14.6. Because of the small distance to the absorbing state at the time of calving for process 1, the risk of process absorption (disease) is high during this period (Figure 1
|
2 indicates that process 2 is initiated approximately 47 d before calving. Also, for this process, the drift parameter (from equation 6) turns out to be negative for all individuals, but the expected drift away from the absorbing state for this process is much weaker than for process 1. The corresponding process-specific hazard function shows a small increase initially, after which it slowly decreases toward zero (Figure 1For the year effect, the estimates indicate an increase in the risk of CM over the yr 1990 to 1992. This is in accordance with the observed increase in CM frequency during this period (Heringstad et al., 1999). For further discussion with regard to the fit of the competing process model to the mastitis data, we refer to Sæbø and Almøy (2004b) and Sæbø et al. (accepted).
Sire Evaluation
A low value of the sire effect would be preferred for each of the latent processes, as it reduces the drift toward disease, or more correctly, increases the drift away from disease. However, the Pearsons correlation between s1 and s2 was only 0.36, indicating that a top-ranked sire based on process 1 may not necessarily be among the top ranked sires for process 2.
To summarize the information from the 2 processes into a single measure, Sk(331) and AISk(331) were calculated. The top 10 sires for both measures are listed in Table 2
. The Spearmans rank correlations between Sk(331) and AISk(331) was 0.999, and the 2 criteria ranked the same set of sires among top 10 (Table 2
). This means that there is a small degree of crossing between the sire-dependent survival curves. Posterior mean survival curves of the two sires ranking highest and lowest for Sk(331) are given in Figure 2
. Apart from a large difference in the drop of survival probability around calving, the curves show similar shapes. From this it seems that varying risk of CM around calving is the main cause of variation in survival probability at the end of the lactation; hence, the choice of lactation end point
was not crucial for the sire ranking.
|
|
Costs attributable to CM obviously vary over lactation, and the AISk(331) criterion may be improved by adjusting for external factors such as potential losses of milk yield from CM, leading to larger costs early in lactation than later. By including a weight function (t) expressing how the external factor varies with time, the adjusted, integrated, and weighted survival function (AIWSk(
)) can be written:
![]() | ([11]) |
which equals the conditional expectation of the weight hazard ratio over time interval [0,
].
Our competing process model is based on an assumption of independent latent processes, which may not hold. Further work is needed to investigate the effect of this assumption, and perhaps develop the model further to allow for dependent processes. In this first attempt toward genetic evaluation of sires with a competing process model, the number of latent processes was decided a priori. It would be desirable to conduct a further study to examine how the number of processes affects sire evaluation. Finally, there are other fixed and random covariates that need to be accounted for, for example the herd effect. Because herd sizes were small (average of 6.8 cows per herd), the information on the herd effect in the data was limited. Hence, adjustments for herd effects were not carried out in this first analysis with the competing process model.
| CONCLUSION |
|---|
|
|
|---|
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
Received for publication January 19, 2004. Accepted for publication September 2, 2004.
| REFERENCES |
|---|
|
|
|---|
This article has been cited by other articles:
![]() |
E. Carlen, U. Emanuelson, and E. Strandberg Genetic evaluation of mastitis in dairy cattle using linear models, threshold models, and survival analysis: a simulation study. J Dairy Sci, October 1, 2006; 89(10): 4049 - 4057. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |