J. Dairy Sci. 2007. 90:5306-5315. doi:10.3168/jds.2006-898
© 2007 American Dairy Science Association ®


A Zero-Inflated Poisson Model for Genetic Analysis of the Number of Mastitis Cases in Norwegian Red Cows

M. Rodrigues-Motta*, D. Gianola*,†,‡, B. Heringstad‡, G. J. M. Rosa† and Y. M. Chang†

* Department of Animal Sciences and
† Department of Dairy Science, University of Wisconsin, Madison 53706
‡ Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, PO Box 5003, N-1432 Ås, Norway

1 Corresponding author: motta@calshp.cals.wisc.edu


    ABSTRACT
The objective was to extend a zero-inflated Poisson (ZIP) model to account for correlated genetic effects, and to use this model to analyze the number of clinical mastitis cases in Norwegian Red cows. The ZIP model is suitable for analysis of count data containing an excess of zeros relative to what is expected from Poisson sampling. A ZIP model was developed and compared with a corresponding Poisson model. The Poisson parameter followed a hierarchical structure, and a residual term accounting for overdispersion was included. In both models, the Poisson parameter was regressed 1) on the year, month, and age at first calving; 2) on the logarithm of the number of days elapsed from calving to the end of first lactation; and 3) on herd and sire effects. Herd and sire effects were assigned normal prior distributions in a Bayesian analysis, corresponding to a random effects treatment in a frequentist analysis. An analysis of residuals favored the Poisson model when there were 2 or more cases of mastitis during first lactation, with very small differences between the ZIP and Poisson models at 0 and 1 cases. However, the residual assessment was not satisfactory for either of the models. The ZIP model, on the other hand, had a better predictive ability than the corresponding Poisson model. Posterior means of the sire, herd, and residual variances in the ZIP model (log scale) were 0.09, 0.37, and 0.36, respectively, highlighting the importance of herds as a source of variation in clinical mastitis. The correlation between sire rankings from the ZIP and Poisson models was 0.98. A weaker correlation would be expected in a population with more severe inflation at zero than the present one. The estimate of the perfect state probability p was 0.32, indicating that 32% of the animals would be in the perfect state, either because they are resistant or because they were not exposed to mastitis.

Key Words: Bayesian method • mastitis • zero-inflated data


    INTRODUCTION
In quantitative genetic analysis, mastitis is often scored as "absent" or "present" and analyzed with linear (e.g., Carlén et al., 2005) or threshold models (e.g., Heringstad et al., 2003a; Chang et al., 2004), with the latter taking the binary nature of the data into account. Cows may have more than one case of mastitis during lactation, and longitudinal binary response models have been used as well (Heringstad et al., 2003b). An alternative form of analysis is one based on counting the numbers of episodes of the disease within a given time period. A candidate sampling model for studying variation in the number of clinical mastitis cases during a lactation is provided by the Poisson distribution (Schukken et al., 1990; Barnouin et al., 2005). Unfortunately, this theoretical distribution is seldom suitable because it introduces a constraint on the variance to mean ratio. The presence of overdispersion leads to incorrect inferences and to inappropriate statistical tests (Gasqui et al., 2000). Variation attributable to Poisson sampling often accounts for only a proportion of the total variation observed (Breslow, 1984; Brillinger, 1986). One way of dealing with overdispersion is to fit a mixed-effects model, such as the pure Poisson mixed model proposed by Foulley et al. (1987), or extend this model to account for "extra-Poisson" variation by introducing a residual term in the structure (Tempelman and Gianola, 1996).

In some situations, overdispersion may be due to an excessive number of zero counts (Lambert, 1992). In such a case, a zero-inflated Poisson (ZIP) model would be suitable. In a ZIP model, it is assumed that, with probability p, the only possible observation is 0 and, with probability 1 – p, that a Poisson random variable with parameter λ is observed. Here, p is the probability of the "perfect" state (i.e., an animal is either resistant or not exposed to the disease), and λ is the mean number of mastitis cases in the "imperfect" state (i.e., an animal is liable to the disease).

The main objective of this study was to extend the ZIP model to accommodate correlated genetic effects, and to use this model for genetic analysis of the number of mastitis cases. As an illustration, an application was made to data containing information on the number of cases of clinical mastitis in first-lactation Norwegian Red cows from a previously analyzed Norwegian data set from the early 1990s. Further, the ZIP model was compared with a corresponding Poisson model via a residual and predictive ability analysis.


    MATERIALS AND METHODS
Data
The data used to implement and test the model consisted of clinical mastitis information on 36,178 first-lactation Norwegian Red cows. Cows in this pilot data set were daughters of 245 sires that had their first progeny test in 1991 and 1992. This data set (same cows) has been used in several other studies (e.g., Chang et al., 2004, 2006). Although it was not the purpose of this study, an advantage of using results from the same cows is that they can easily be compared with results from other studies that used different methods.

Only records of daughters with first calving in 1990 through 1992, and from herds having at least 5 daughters of any of the test bulls, were used. For each cow, the number of cases of veterinary-treated clinical mastitis (NCM) in first lactation, from 30 d before calving to either 300 d after calving or culling, whichever occurred first, was counted. Cows having a second calving before 300 d after first calving were excluded from the analysis. The interval between treatments had to be at least 5 d, to avoid counting repeated treatments of the same case as distinct mastitis episodes. About 76% of the cows did not have a case of clinical mastitis during first lactation, whereas 15.8, 5.1, and 1.6% had 1, 2, and 3 cases, respectively. Only 315 cows had more than 3 episodes of clinical mastitis during first lactation.

Discriminating between simple Poisson and ZIP models can be done by testing the null hypothesis, H0: p = 0. Asymptotic tests proposed by Rao and Chakravarti (1956) and van den Broek (1995) were used as part of an exploratory analysis. The Rao and Chakravarti criterion was calculated as

\[
U = \frac{n_0 - n e^{-\bar{y}}}{\sqrt{n e^{-\bar{y}}\left(1 - e^{-\bar{y}} - \bar{y} e^{-\bar{y}}\right)}}
\]

where n_0 = 27,745 is the number of zeros in the data; n = 36,178 is the total number of observations; and ȳ = 0.34 is the mean number of mastitis cases in the data set. The U statistic is asymptotically distributed as an N(0, 1) variable; the realized value of U = 62.59 leads to rejection of H0. The score test is given by

\[
S = \frac{\left(n_0 - n\tilde{p}_0\right)^2}{n\tilde{p}_0\left(1 - \tilde{p}_0\right) - n\bar{y}\,\tilde{p}_0^{2}}
\]

where p̃_0 = e^(–ȳ) = e^(–0.34) is the estimate of the probability of having no mastitis cases. The realized value of S = 3917.3 was referred to a χ² distribution with one degree of freedom, leading to rejection of the hypothesis that p is 0. The tests suggest overdispersion in the number of mastitis cases relative to the homogeneous Poisson sampling, and part of this may be caused by the observed excess of zeros. The tests are suggestive, but not conclusive, because they are based on the assumption of a homogeneous population.
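As an illustration, the two test statistics can be computed from the summary counts quoted in the text. The sketch below assumes the usual forms of the Rao and Chakravarti statistic and of the van den Broek score statistic (excess of zeros over its Poisson expectation, scaled by its estimated variance); the function name is hypothetical. Because the mean is rounded to 0.34 in the text, the output is not expected to match the reported values of 62.59 and 3917.3 exactly.

```python
import math

def zero_inflation_tests(n0, n, ybar):
    """Return (U, S) for testing H0: p = 0 under homogeneous Poisson sampling.

    Assumed forms: U = (n0 - n*p0) / sqrt(var) and S = (n0 - n*p0)^2 / var,
    with p0 = exp(-ybar) and var = n*p0*(1 - p0) - n*ybar*p0^2.
    """
    p0 = math.exp(-ybar)               # Poisson probability of a zero count
    excess = n0 - n * p0               # observed minus expected number of zeros
    var = n * p0 * (1.0 - p0) - n * ybar * p0 ** 2
    U = excess / math.sqrt(var)        # referred to N(0, 1)
    S = excess ** 2 / var              # referred to chi-square with 1 df
    return U, S

# Summary values quoted in the text (the mean is rounded to two decimals)
U, S = zero_inflation_tests(n0=27745, n=36178, ybar=0.34)
print(round(U, 2), round(S, 1))
```

Under these forms S is simply the square of U, so both tests lead to the same decision about H0.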

An excess of zeros by level of year, month, and age at first calving was also investigated. For each level of these 3 potential explanatory variables, the observed proportion of zeros was compared against a Poisson distribution with parameter equal to the mean number of mastitis cases in the respective level, as shown in Table 1. An excess of zeros was found for all levels. However, there were some herds and sires without extra zeros according to this criterion. For the sire with the largest frequency of daughters with mastitis, the observed and expected proportions of zeros were 0.64 and 0.84, respectively. A sire pedigree file, with a total of 437 males, was built by tracing the pedigree of the 245 sires with daughters in the data set, through sires and maternal grandsires, as far back as possible.


Table 1. Observed and expected (under a homogeneous Poisson distribution) proportion of zeros for each level of calving year, calving month, and age at calving
 
Statistical Model
A ZIP model was developed for analysis of NCM. Let y = (y_1, . . . , y_n)' be a vector of NCM records on cows, such that y_i = 0 with probability p, and y_i ~ Poisson(λ_i) with probability 1 – p, with n = 36,178. Note that if p is equal to zero, the setting yields a Poisson distribution with mean λ_i. The probability of a cow being healthy is

\[
\Pr(y_i = 0 \mid \lambda_i, p) = p + (1 - p)\,e^{-\lambda_i},
\]

and the probability of a cow having k cases of mastitis is

\[
\Pr(y_i = k \mid \lambda_i, p) = (1 - p)\,\frac{e^{-\lambda_i}\lambda_i^{k}}{k!}, \qquad k = 1, 2, \ldots
\]
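The two probabilities above can be sketched in a few lines. The function name and the numerical values of λ and p are hypothetical and serve only to show that the mixture is a proper distribution and that it collapses to the Poisson when p = 0.

```python
import math

def zip_pmf(k, lam, p):
    """P(y = k) for a zero-inflated Poisson with perfect-state probability p."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises from the perfect state or from the Poisson component
        return p + (1.0 - p) * poisson
    return (1.0 - p) * poisson

# Hypothetical values: lambda = 0.5 cases, p = 0.32
probs = [zip_pmf(k, lam=0.5, p=0.32) for k in range(50)]
print(probs[0], sum(probs))
```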

Note that p was assumed to be homogeneous, meaning that it is the same for all animals. Let λ = (λ_1, . . ., λ_n)'. The conditional (given the fixed and random effects) log-likelihood function for the homogeneous p ZIP model is given by

\[
\log l(\lambda, p \mid \beta, h, s, y) = \sum_{i:\,y_i = 0} \log\left[p + (1 - p)e^{-\lambda_i}\right] + \sum_{i:\,y_i > 0} \left[\log(1 - p) - \lambda_i + y_i \log\lambda_i - \log(y_i!)\right]
\]

where l(λ, p | β, h, s, y) is the conditional likelihood function for the homogeneous p ZIP model. Further, λ and p satisfied the models:

\[
\lambda^{*} = \log(\lambda) = X\beta + Wh + Zs + \varepsilon, \qquad p^{*} = \log\left(\frac{p}{1 - p}\right)
\]

where β is a vector of effects of year of first calving (3 levels), month of first calving (12 levels), age at first calving (15 levels; Table 1), and regression on the log of the number of days elapsed from calving to the end of first lactation (culling or d 300); h is a vector of herd effects of order 5,286; s is a vector of sire transmitting abilities of order 437; and X, W, and Z are the corresponding incidence matrices. Further, ε is a vector of residuals, which was assumed to follow the multivariate normal distribution N(0, I_n σ_ε²), where σ_ε² is the residual variance representing all unaccounted for sources of variation. Part of this unaccounted for variation is because a sire model is used here, so three-fourths of the additive genetic variance was not modeled explicitly (Damgaard et al., 2006). The distributions of herd and sire effects were assumed to be independent multivariate normal with mean zero and covariance matrices I_5,286 σ_h² and Aσ_s², respectively, where A is a known matrix of additive genetic relationships between sires, and σ_h² and σ_s² are variances of herd and sire effects, respectively, on a logarithmic scale. The ZIP model was compared with a Poisson model with the same structure for the Poisson parameter. Inferences about unknown parameters were based on the Bayesian approach, via a Markov chain Monte Carlo (MCMC) sampling algorithm.
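A minimal sketch of the conditional log-likelihood follows, written in terms of the log-scale parameters λ_i* = log(λ_i) and the logit p* = log[p/(1 – p)]. The function name and the toy values are hypothetical; in the model, each λ_i* would be the sum of the fixed, herd, sire, and residual contributions for cow i.

```python
import math

def zip_loglik(y, lam_star, p_star):
    """Conditional ZIP log-likelihood for counts y, given one log-scale Poisson
    parameter per cow (lam_star) and the logit of the perfect-state probability."""
    p = 1.0 / (1.0 + math.exp(-p_star))
    total = 0.0
    for y_i, ls_i in zip(y, lam_star):
        lam_i = math.exp(ls_i)
        if y_i == 0:
            total += math.log(p + (1.0 - p) * math.exp(-lam_i))
        else:
            # log(1 - p) plus the Poisson log-probability of y_i
            total += math.log(1.0 - p) - lam_i + y_i * ls_i - math.lgamma(y_i + 1)
    return total

# Hypothetical toy records and linear predictors
y = [0, 0, 1, 3, 0]
lam_star = [-1.2, -0.8, -0.5, 0.1, -1.0]
print(zip_loglik(y, lam_star, p_star=-0.75))
```

Pushing p* toward a large negative value drives p to 0, and the function then returns the ordinary Poisson log-likelihood, which mirrors the remark that the ZIP model contains the Poisson model as a special case.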

Prior Distributions.
Models were developed hierarchically, with all parameters regarded as random variables following some prior distribution. The prior distribution adopted for λ* was a normal distribution with appropriate mean and variance σ_ε². Independent bounded uniform priors were assigned to p* and to each of the elements of β, that is, p* ~ U(–5, 5] and β ~ U(–50, 50]. As stated above, herd and sire effects were assigned independent normal prior distributions with mean zero and variances σ_h² and σ_s², respectively. Scale-inverse χ² prior distributions were used for the variances of herd, sire, and residual effects, that is, σ_h² ~ υ_h δ_h χ^(–2)_(υ_h), σ_s² ~ υ_s δ_s χ^(–2)_(υ_s), and σ_ε² ~ υ_ε δ_ε χ^(–2)_(υ_ε), respectively, where the degrees of freedom parameters were set to υ_h = υ_s = υ_ε = 5 and the scale parameters were assigned the values δ_h = δ_s = δ_ε = 1. These values were chosen to achieve vague proper priors for the variance components.
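To give a feel for how vague the variance priors are, the sketch below draws from a scale-inverse χ² distribution with 5 degrees of freedom and scale 1, assuming the parameterization σ² = υδ/X with X ~ χ² on υ degrees of freedom. The function name is hypothetical, and the χ² draw is obtained through the standard-library gamma generator.

```python
import random

def scaled_inv_chi2(nu, delta, rng):
    """Draw sigma^2 = nu * delta / X, with X ~ chi-square(nu)."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)   # chi-square(nu) = Gamma(shape nu/2, scale 2)
    return nu * delta / chi2

rng = random.Random(1)
draws = [scaled_inv_chi2(5, 1.0, rng) for _ in range(200_000)]
prior_mean = sum(draws) / len(draws)
print(round(prior_mean, 2))   # theoretical mean is nu*delta/(nu - 2) for nu > 2
```

With υ = 5 and δ = 1, the prior mean is 5/3 and the right tail is long, so the prior places appreciable mass well above the posterior means reported later for all 3 variance components.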

Fully Conditional Posterior Distributions and Sampling Procedure.
The fully conditional distributions of individual parameters were deduced by taking all other parameters as fixed, and absorbing them into the integration constant of the conditional posterior distribution of interest. The fully conditional densities of σ_ε², σ_h², and σ_s² are scale-inverse χ² with appropriate parameters, and the joint fully conditional density of β, h, and s is a multivariate normal distribution with appropriate mean and (co)variance matrix (Wang et al., 1993, 1994).

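The fully conditional density of each element of λ* and the density of p* do not have a closed form, and are given by

\[
p(\lambda_i^{*} \mid \text{else}) \propto \left[p + (1 - p)\exp\left(-e^{\lambda_i^{*}}\right)\right]^{I(y_i = 0)} \left[(1 - p)\exp\left(-e^{\lambda_i^{*}} + y_i\lambda_i^{*}\right)\right]^{I(y_i > 0)} \exp\left\{-\frac{\left(\lambda_i^{*} - X_i'\beta - W_i h - Z_i s\right)^2}{2\sigma_\varepsilon^2}\right\}
\]

and

\[
p(p^{*} \mid \text{else}) \propto \prod_{i=1}^{n} \left[p + (1 - p)e^{-\lambda_i}\right]^{I(y_i = 0)} (1 - p)^{I(y_i > 0)} \; I(p^{*}; -5, 5)
\]

respectively, where p = exp(p*)/[1 + exp(p*)] and I(p*; –5, 5) means that p* ∈ (–5, 5). Here, I(.) is the indicator function, which is equal to 1 if the condition is satisfied and 0 otherwise; X'_i, W_i, and Z_i are the ith rows of X, W, and Z, respectively.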

A Gibbs-Metropolis algorithm was developed, based on sampling iteratively from all fully conditional distributions. Location effects, β, h, and s were sampled from a multivariate normal distribution with appropriate mean and (co)variance matrix, and the 3 variance components were drawn from scale-inverted χ² distributions (Wang et al., 1993, 1994). A Metropolis algorithm was used for all other parameters. A normal distribution with mean equal to λ_i*^(t) at iteration t and an appropriate variance was used as proposal distribution to sample λ_i*^(t+1). Similarly, a normal distribution with mean equal to p*^(t) at iteration t and an appropriate variance was used as proposal distribution to sample p*^(t+1). The variances of the proposal distributions were chosen to satisfy acceptance rates between 30 and 50% (Gelman et al., 2002), which restrict the acceptance rate to being neither too high nor too low, reducing the chance of a chain getting stuck or moving too slowly in the parameter space.
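The Metropolis step for p* can be illustrated in isolation. The sketch below is not the full sampler: it holds the λ_i fixed at known values, uses simulated toy data, and updates only p* with a normal random-walk proposal under the U(–5, 5) prior. All function names, the seed, the proposal standard deviation, and the toy settings are assumptions made for the example.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def log_cond_pstar(p_star, y, lam):
    """Log fully conditional density of p* (up to a constant), U(-5, 5) prior."""
    if not -5.0 < p_star < 5.0:
        return -math.inf
    p = 1.0 / (1.0 + math.exp(-p_star))
    total = 0.0
    for y_i, lam_i in zip(y, lam):
        if y_i == 0:
            total += math.log(p + (1.0 - p) * math.exp(-lam_i))
        else:
            total += math.log(1.0 - p)
    return total

def metropolis_pstar(y, lam, n_iter, proposal_sd, rng):
    """Random-walk Metropolis for p*, holding the lambda_i fixed."""
    current = 0.0
    current_lp = log_cond_pstar(current, y, lam)
    chain, n_accept = [], 0
    for _ in range(n_iter):
        proposal = rng.gauss(current, proposal_sd)
        proposal_lp = log_cond_pstar(proposal, y, lam)
        # Accept with probability min(1, ratio of conditional densities)
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            n_accept += 1
        chain.append(current)
    return chain, n_accept / n_iter

# Toy data (hypothetical): 500 cows, true p = 0.3, lambda_i = 1 for every cow
rng = random.Random(2007)
lam = [1.0] * 500
y = [0 if rng.random() < 0.3 else sample_poisson(l, rng) for l in lam]
chain, acc_rate = metropolis_pstar(y, lam, n_iter=2000, proposal_sd=0.4, rng=rng)
kept = chain[500:]                                  # discard a short burn-in
p_hat = sum(1.0 / (1.0 + math.exp(-v)) for v in kept) / len(kept)
print(round(acc_rate, 2), round(p_hat, 2))
```

In practice the proposal standard deviation would be tuned by trial runs until the printed acceptance rate falls in the 30 to 50% window mentioned above.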

Convergence Diagnostics.
Visual inspection of trace plots of the MCMC run and diagnostics suggested by Gelman and Rubin (1992) were used to determine the length of the burn-in period and the total number of iterations for the Gibbs-Metropolis procedure. Two chains with overdispersed starting points were used in the method of Gelman and Rubin (1992), which is based on monitoring convergence of the iterative simulation by estimating the factor by which the scale of the current distribution for a parameter under study, say τ, might be reduced if simulations were continued for an infinite amount of time. The potential scale reduction is given by

\[
R = \sqrt{\frac{\widehat{\mathrm{var}}(\tau)}{W}}
\]

where

\[
\widehat{\mathrm{var}}(\tau) = \frac{J - 1}{J}\,W + \frac{1}{J}\,B
\]

Here, W and B are estimates of the within- and between-chain variances, and J is the number of iterations. R declines to 1 as the number of iterations goes to infinity. There is an indication of convergence when R is close to 1.
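A minimal sketch of the diagnostic, assuming the common form in which the pooled variance estimate is a weighted average of W and B, is given below. The function name is hypothetical and the two toy chains are independent standard normal draws, for which R should be close to 1.

```python
import math
import random
from statistics import mean, variance

def potential_scale_reduction(chains):
    """R for equal-length chains of draws of a single scalar parameter."""
    J = len(chains[0])                              # iterations per chain
    W = mean(variance(c) for c in chains)           # within-chain variance
    B = J * variance([mean(c) for c in chains])     # between-chain variance
    var_hat = (J - 1) / J * W + B / J               # pooled variance estimate
    return math.sqrt(var_hat / W)

# Toy example: two chains sampling the same N(0, 1) target
rng = random.Random(42)
chains = [[rng.gauss(0.0, 1.0) for _ in range(5000)] for _ in range(2)]
R = potential_scale_reduction(chains)
print(round(R, 3))
```

If the two chains were centered at different values, B would dominate the pooled estimate and R would move well above 1, signaling lack of convergence.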

Two chains starting from overdispersed values were used, with a total of 500,000 samples, including a burn-in period of 250,000 iterations. A visual inspection of the trace plots (results not shown) suggested that the 2 chains had converged after 250,000 iterations. For the Gelman and Rubin (1992) convergence criterion, the scale reduction factors (R) for the residual, herd, and sire variances were 0.9999996, 1.0001, and 1.000001, respectively; the scale reduction R for p was 1.00001. On this basis, we decided to use only one chain with a burn-in period of 250,000 iterations, and with 250,000 samples collected thereafter for calculating Monte Carlo estimates of posterior features. No thinning of samples was practiced.

Residual Analysis
In the residual analysis, a residual for the record of cow i under a given model was defined as r_i = y_i – E(y_i | τ). The posterior mean of the standardized residual was estimated as

\[
\hat{r}_i = \frac{1}{J}\sum_{j=1}^{J} \frac{y_i - E\left(y_i \mid \tau^{(j)}\right)}{\sqrt{\mathrm{var}\left(y_i \mid \tau^{(j)}\right)}}
\]

with i = 1, . . . , n; J is equal to the number of samples from the posterior distribution, and the expectation and variance correspond to those under model M with parameter vector τ. An observation would be unusual if the posterior distribution of r_i were concentrated away from zero. Under the Poisson model, E(y_i | τ^(j)) = var(y_i | τ^(j)) = λ_i. Under the ZIP model, E(y_i | τ^(j)) = (1 – p)λ_i and var(y_i | τ^(j)) = E(y_i | τ^(j))[1 + λ_i p], where λ_i = exp(λ_i*) and p = exp(p*)/[1 + exp(p*)].
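The ZIP moments and the posterior mean of the standardized residual for a single cow can be sketched as follows. The function names and the two toy "posterior draws" of (λ_i*, p*) are hypothetical.

```python
import math

def zip_moments(lam, p):
    """Mean and variance of a ZIP variable: (1 - p)*lam and mean*(1 + lam*p)."""
    mean = (1.0 - p) * lam
    var = mean * (1.0 + lam * p)
    return mean, var

def posterior_mean_std_residual(y_i, draws):
    """Average standardized residual of record y_i over posterior draws,
    where each draw is a pair (lambda_star_i, p_star)."""
    total = 0.0
    for lam_star, p_star in draws:
        lam = math.exp(lam_star)
        p = 1.0 / (1.0 + math.exp(-p_star))
        m, v = zip_moments(lam, p)
        total += (y_i - m) / math.sqrt(v)
    return total / len(draws)

# Hypothetical draws for a cow with one recorded case
r_hat = posterior_mean_std_residual(1, [(-1.0, -0.75), (-0.9, -0.80)])
print(round(r_hat, 2))
```

Setting p = 0 in `zip_moments` returns equal mean and variance, which is the Poisson case used for the competing model.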

Posterior Predictive Assessment
The adequacy of a given statistical model may be assessed by referring the observed value of some statistic to its sampling distribution expected under this model (Sorensen and Waagepetersen, 2003). If the observed value is atypical, this would be interpreted as evidence against the model. One way of checking whether a model describes the observed data well is to adopt a measure of discrepancy, say T(.), between the observed data and predictions from the model posited (Gelman and Rubin, 1992).

Let y_rep be "replicated" or simulated data from model M with parameter vector τ, and y be the observed data vector. In the posterior predictive approach to model validation, one can work with T(y | τ) – T(y_rep | τ) and check whether zero is an extreme value in the posterior predictive distribution of T(y | τ) – T(y_rep | τ). Because the parameter vector τ is unknown, its values are sampled from the posterior distribution. Likewise, y_rep is obtained by sampling from the posterior predictive density

\[
p(y_{rep} \mid y, M) = \int p(y_{rep} \mid y, \tau, M)\, p(\tau \mid y, M)\, d\tau
\]

Here, p(y_rep | y, τ, M) = p(y_rep | τ, M), because y_rep (the new sample, given the parameters) is independent of y (the observed data, given the parameters).

Two different discrepancy measures were used, with each of these evaluated at the observed and replicated data. The first measure was based on a sum, across observations, of standardized residuals:

\[
T_1(y, \tau) = \sum_{i=1}^{n} \frac{y_i - E(y_i \mid \tau)}{\sqrt{\mathrm{var}(y_i \mid \tau)}}
\]

with the expectation and variance corresponding to model M. Using the samples from the MCMC procedure, we constructed a scatter plot to measure the discrepancy between T_1(y, τ^(l)) and T_1(y_rep^(l), τ^(l)), where l = 1, . . ., 10,000 indexes the simulated samples. If the model fits, points are expected to fall in a circle centered at (0, 0).

The second measure of discrepancy used was based on:

\[
T_{2,k}(y) = \frac{1}{n}\sum_{i=1}^{n} I(y_i = k)
\]

k = 0, 1, 2, . . ., which gives the proportion of cows with k cases of mastitis in a sample of size n. We examined how D_k^(l) = T_2,k(y) – T_2,k(y_rep^(l)) varied over k, with l = 1, . . . , 10,000 simulated samples, each of size n. Here, T_2,k(y) represents the proportion of cows with k cases of mastitis in the observed sample, and T_2,k(y_rep^(l)) represents the proportion of cows with k cases of mastitis in the simulated sample under τ = τ^(l).

Computations were as follows: 1) each element y_rep,i^(l) of the vector y_rep^(l) = (y_rep,1^(l), . . . , y_rep,n^(l)) was drawn from a ZIP distribution evaluated at the posterior sample of λ_i* and p* at iteration l; 2) the statistics T_1(y_rep^(l), τ^(l)) and T_2,k(y_rep^(l)) were evaluated for each realization of y_rep^(l). This yielded 10,000 "replicated" standardized residuals and relative frequencies of the distribution of mastitis.
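The two computational steps can be sketched for the second discrepancy measure. The code below draws one replicated data set per posterior sample and returns the differences between observed and replicated proportions, pooling counts of 4 or more as in the results. All names, the toy records, and the three "posterior draws" are hypothetical; a real run would use the retained MCMC samples of the λ_i and of p.

```python
import math
import random
from collections import Counter

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_zip(lam, p, rng):
    """Zero with probability p, otherwise a Poisson(lam) draw."""
    return 0 if rng.random() < p else sample_poisson(lam, rng)

def proportions(y, top=4):
    """Proportion of records with k cases, k = 0..top; last cell pools >= top."""
    counts = Counter(min(v, top) for v in y)
    return [counts[k] / len(y) for k in range(top + 1)]

def discrepancies(y, posterior_draws, rng):
    """One row of D_k values per posterior draw (lam_list, p)."""
    observed = proportions(y)
    rows = []
    for lam_list, p in posterior_draws:
        y_rep = [sample_zip(lam, p, rng) for lam in lam_list]
        rows.append([o - r for o, r in zip(observed, proportions(y_rep))])
    return rows

# Hypothetical toy inputs: 6 cows and 3 posterior draws
rng = random.Random(0)
y = [0, 0, 1, 0, 2, 0]
posterior_draws = [([0.4] * 6, 0.30), ([0.5] * 6, 0.32), ([0.6] * 6, 0.35)]
D = discrepancies(y, posterior_draws, rng)
print(len(D), len(D[0]))
```

A model that predicts the count distribution well yields columns of D centered near zero for every k.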

Sire Evaluation
The posterior mean of the probability of producing daughters without any case of mastitis in first lactation is appealing for the genetic evaluation and ranking of sires for selection. Sire transmitting abilities in the logarithmic scale for the Poisson parameters do not lead directly to the probability of sires having daughters without mastitis infection, because the Poisson parameter depends on other effects in the model. Let

\[
\lambda_j^{*} = \mu + \bar{h} + s_j
\]

be the expected Poisson parameter of sire j, with µ being some combination of levels of the β vector, and h̄ being the average of the posterior means of the herd effects. In addition, let P_j be the probability that sire j produces a daughter without mastitis. Then

\[
P_j = p + (1 - p)\exp(-\lambda_j)
\]

where λ_j = exp(λ_j*) and p = exp(p*)/[1 + exp(p*)]. Sire ranking could be based on the posterior mean of P_j. However, note that P_j decreases monotonically as s_j increases. Hence, in a model with homogeneous p, sire rankings based on posterior means of P_j and s_j are identical, given p.
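The monotone relationship between s_j and P_j is easy to check numerically. The sketch below evaluates P_j at single point values rather than averaging over MCMC draws, and the sire labels, sire effects, µ, h̄, and p are all hypothetical.

```python
import math

def prob_no_mastitis(s_j, mu, h_bar, p):
    """P_j = p + (1 - p) * exp(-lambda_j), with log(lambda_j) = mu + h_bar + s_j."""
    lam_j = math.exp(mu + h_bar + s_j)
    return p + (1.0 - p) * math.exp(-lam_j)

# Hypothetical sire effects on the log scale
sire_effects = {"sire_A": -0.30, "sire_B": 0.05, "sire_C": 0.40}
P = {name: prob_no_mastitis(s, mu=-1.2, h_bar=0.0, p=0.32)
     for name, s in sire_effects.items()}

# Best sires have the highest probability of mastitis-free daughters
ranking = sorted(P, key=P.get, reverse=True)
print(ranking)
```

Because p is common to all sires, ordering by decreasing P_j is the same as ordering by increasing s_j; this would no longer hold if p were allowed to vary across sires.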

Correlations between sire evaluations under Poisson and ZIP models were computed by using the posterior mean of the probability of no mastitis for each of the sires as the criterion. The ß-effects were combined by setting calving year to 1992, calving month to August, age at calving to 24 mo, and number of days from first calving to the defined end of the lactation as equal to 300.


    RESULTS AND DISCUSSION
Residual plots for the ZIP and Poisson models are shown in Figure 1. The plots indicated that the residuals were larger under the ZIP model than under the Poisson model; however, this difference was very small. The distributions of posterior means of residuals associated with cows having zero or one mastitis case were nearly the same in the 2 models. For counts larger than one, residual posterior means under the ZIP model were slightly larger. Neither model had a completely satisfactory performance in the analysis of residuals.


Figure 1. Posterior means of the standardized residuals under the zero-inflated Poisson (ZIP) and Poisson models.

 
Figure 2 shows a scatter plot of the statistics T_1(y, τ^(l)) (abscissa) and T_1(y_rep^(l), τ^(l)) (ordinate) for the ZIP and Poisson models. Both graphs show that realizations for the observed-data statistic cluster around the coordinate (0, 0), indicating a good agreement between observed data and predictions, at least with respect to T_1(., τ^(l)).


Figure 2. Posterior realizations of the discrepancy statistic T_1(., τ^(l)) (observed data in the abscissa and replicated data in the ordinate) under the zero-inflated Poisson (ZIP) and Poisson models.

 
The discrepancy statistic D_k^(l) was used for posterior predictive checking of the ZIP and Poisson models, at each of k = 0, 1, 2, 3, and 4 or more cases of clinical mastitis per cow. The posterior predictive distributions of D_k^(l) under the ZIP and Poisson models are displayed in Figure 3. The plots suggest that the ZIP model was closer than the Poisson model to the observed proportion of mastitis cases across all numbers of cases, because the mean values of the discrepancy statistic were closer to zero for ZIP. In particular, the proportions of 0, 1, and 2 cases of mastitis were better predicted by ZIP than by the Poisson model.


Figure 3. Posterior realizations of the discrepancy statistic function D_k^(l) under the zero-inflated Poisson (ZIP) and Poisson models.

 
Posterior means (standard deviation) of the residual, herd, and sire variances were 0.36 (0.04), 0.37 (0.02), and 0.09 (0.01), respectively, for the ZIP model and 0.90 (0.03), 0.35 (0.02), and 0.05 (0.01) for the Poisson model. In the ZIP model, the most important source of variation was herd, followed by the residual and sire variances. The herd and sire variances were similar in the ZIP and Poisson models, but the residual variance in the Poisson model was 2.5 times as large as that for the ZIP model, indicating that the residuals in the Poisson model were inflated by the excess of zeros.

In the ZIP model, the herd and sire variances accounted for 37 and 9%, respectively, of the variability of log-Poisson parameters, indicating the relevance of these effects. Posterior distributions of the variance components and of the perfect state probability, p, under the ZIP model are given in Figure 4. The posterior distributions of the residual and herd variances, as well as of the perfect state probability, were nearly symmetric. The posterior distribution of the sire variance had a slightly longer tail to the right. This was expected, because the posterior mean of σ_s² was closer to 0 than those of the residual and herd components of variance. There is no obviously useful definition of heritability in this model; hence, posterior distributions of ratios of the variance components were not assessed. However, the between-sires variance accounted for 10% of the variation in λ_j* parameters, indicating genetic variation for mastitis.


Figure 4. Posterior distributions of the residual, herd and sire variances, and of the perfect state probability p, under the zero-inflated Poisson (ZIP) model.

 
The posterior distribution of the perfect state probability (Figure 4) was nearly symmetric, and its mean (standard deviation) was 0.32 (0.02). This means that, on average, 32% of first-lactation Norwegian Red cows would be expected to be in the perfect state and not get mastitis at all, either because of complete resistance to the disease, or because they are not exposed to it.

The Monte Carlo variances of residual, herd, and sire variances and of the perfect state probability were 3.3 × 10^(–8), 7.5 × 10^(–9), 8.6 × 10^(–8), and 6.1 × 10^(–9), respectively, indicating small Monte Carlo errors.

Figure 5 shows the distribution of the posterior means of sire transmitting abilities under the ZIP model. Under this model, the sire "transmitting ability" exp(s_j) for the Poisson parameter was 0.31, on average, whereas P_j averaged 0.82, where P_j is the probability that sire j produces a daughter with no mastitis. The correlation between sire ranks based on posterior means of P_j under the Poisson and ZIP models was 0.98. The rank correlation between the top 10, 20, 30, or 50 out of 437 sires ranged from 0.96 to 0.98. This indicates a strong agreement between sire rankings based on ZIP and Poisson models.


Figure 5. Distributions of the posterior means of the Poisson parameter (λ_j* = µ + h̄ + s_j) given by the "sire j" contribution and of the probability of no mastitis (sire evaluation) under the zero-inflated Poisson (ZIP) model.

 
It would be of interest to relax the assumption of a homogeneous p. This assumption may hamper the performance of the ZIP model for levels of "fixed" and "random" effects that do not present extra zeros. Varying p across sires, for example, would allow us to separate sires conferring true resistance to mastitis from those that are only mildly liable to the disease.


    CONCLUSIONS
The posterior predictive check favored the ZIP specification over the Poisson model. The residual analysis showed very small differences between the ZIP and Poisson models at 0 and 1 cases, but slightly favored the Poisson model for 2 or more cases of mastitis during first lactation. Using the probability of no mastitis as a criterion for genetic evaluation, we found a strong agreement between sire rankings from the Poisson and ZIP models. A weaker association between sire evaluations is to be expected in a population with stronger inflation at 0, as well as in the situation in which the ZIP model involves a heterogeneous p.

The inferred mixture probability was 0.32, giving a weight of 0.32 to the perfect state and a weight of 0.68 to the imperfect state, represented by a Poisson model. Hence, 32% of first-lactation Norwegian Red cows are expected to be in the perfect state and not contract mastitis at all, either because of complete resistance to the disease or because they are not exposed to it.


    ACKNOWLEDGEMENTS
Access to the data was given by the Norwegian Dairy Herd Recording System (Ås, Norway) and the Norwegian Cattle Health Service (Ås, Norway) in agreement number 004.2005. Research was funded by grants CAPES BEX 1758004, USDA 2003-35205-12833, NSF DEB-0089742, and NSF DMS 0443771. Support by the Wisconsin Agriculture Experiment Station and by the Babcock Institute for Dairy Research and Development is acknowledged.

Received for publication December 27, 2006. Accepted for publication July 12, 2007.


    REFERENCES


Barnouin, J., S. Bord, S. Bazin, and M. Chassagne. 2005. Dairy management practices associated with incidence rate of clinical mastitis in low somatic cell score herds in France. J. Dairy Sci. 88:3700–3709.

Breslow, N. E. 1984. Extra-Poisson variation in log-linear models. Appl. Stat. 33:38–44.

Brillinger, D. R. 1986. The natural variability of vital rates and associated statistics (with discussion). Biometrics 42:693–734.

Carlén, E., M. del P. Schneider, and E. Strandberg. 2005. Comparison between linear models and survival analysis for genetic evaluation of clinical mastitis in dairy cattle. J. Dairy Sci. 88:797–803.

Chang, Y. M., D. Gianola, B. Heringstad, and G. Klemetsdal. 2004. Effects of trait definition on genetic parameter estimates and sire evaluation for clinical mastitis threshold models. Anim. Sci. 79:355–364.

Chang, Y. M., D. Gianola, B. Heringstad, and G. Klemetsdal. 2006. A comparison between multivariate slash, Student-t and probit threshold model for analysis of clinical mastitis in first lactation cows. J. Anim. Breed. Genet. 123:290–300.

Damgaard, L. H., I. R. Korsgaard, J. Simonsen, O. Dalsgaard, and A. H. Andersen. 2006. The effect of ignoring individual heterogeneity in Weibull log-normal sire frailty models. J. Anim. Sci. 84:1338–1350.

Foulley, J. L., D. Gianola, and S. Im. 1987. Genetic evaluation of traits distributed as Poisson-binomial with reference to reproductive characters. Theor. Appl. Genet. 73:870–877.

Gasqui, P., O. Pons, and J. Coulon. 2000. An individual modeling tool for consecutive clinical mastitis during the same lactation in dairy cows: A method based on a survival model. Vet. Res. 31:583–602.

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 2002. Bayesian Data Analysis. 2nd ed. Chapman and Hall, London, UK.

Gelman, A., and D. B. Rubin. 1992. Inference from iterative simulation using multiple sequences. Stat. Sci. 7:457–511.

Heringstad, B., Y. M. Chang, D. Gianola, and G. Klemetsdal. 2003b. Genetic analysis of longitudinal trajectory of clinical mastitis in first-lactation Norwegian cattle. J. Dairy Sci. 86:2676–2683.

Heringstad, B., R. Rekaya, D. Gianola, G. Klemetsdal, and K. A. Weigel. 2003a. Genetic change for clinical mastitis in Norwegian cattle: A threshold model analysis. J. Dairy Sci. 86:369–375.

Lambert, D. 1992. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34:1–14.

Rao, C. R., and I. M. Chakravarti. 1956. Some small sample tests of significance for a Poisson distribution. Biometrics 12:264–282.

Schukken, Y. H., F. J. Grommers, D. van de Geer, H. N. Erb, and A. Brand. 1990. Risk factors for clinical mastitis in herds with a low bulk milk somatic cell count. Risk factors for Escherichia coli and Staphylococcus aureus. J. Dairy Sci. 73:3436–3471.

Sorensen, D., and R. Waagepetersen. 2003. Normal linear models with genetically structured residual variance heterogeneity: A case study. Genet. Res. 82:207–222.

Tempelman, R. J., and D. Gianola. 1996. A mixed effects model for overdispersed count data in animal breeding. Biometrics 52:265–279.

van den Broek, J. 1995. A score test for zero inflation in a Poisson distribution. Biometrics 51:738–743.

Wang, C. S., J. J. Rutledge, and D. Gianola. 1993. Marginal inferences about variance components in a mixed linear model using Gibbs sampling. Genet. Sel. Evol. 25:41–62.

Wang, C. S., J. J. Rutledge, and D. Gianola. 1994. Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet. Sel. Evol. 26:91–115.

