1 Department of Large Animal Sciences, The Royal Veterinary and Agricultural University, Grønnegårdsvej 8, DK-1870 Frederiksberg C, Denmark
2 Biometry Research Unit, Danish Institute of Agricultural Sciences, P.O. Box 50, DK-8830, Tjele, Denmark
Corresponding author: Nils Toft; e-mail: nt{at}dina.kvl.dk.
| ABSTRACT |
|---|
Key Words: paratuberculosis, ELISA, Bayesian network, test evaluation
Abbreviation key: FC = fecal culture, FChigh = fecal culture high, FClow = fecal culture low, FCneg = fecal culture negative, Map = Mycobacterium avium ssp. paratuberculosis, OD = optical density, Pr = probability.
| INTRODUCTION |
|---|
Often the test is carried out merely to determine whether a specific condition is present to initiate a suitable intervention. For this purpose, dichotomizing the disease definition and test result is adequate. It might be worthwhile, however, to improve this approach when the disease or condition and the associated threshold are ambiguous, thus allowing tests to be used in a wider range of settings.
Consider, as an example, paratuberculosis: a chronic, slowly developing infection in cattle and other ruminants caused by Mycobacterium avium ssp. paratuberculosis (Map; Chiodini et al., 1984). The chronicity of infection makes simple definitions of disease difficult. Furthermore, the sensitivity of diagnostic tests varies depending on the stage of the disease (Nielsen et al., 2002c). A frequently adopted intervention for infected cows is culling as opposed to doing nothing. However, some infected cows never develop "clinical" paratuberculosis with diarrhea and concomitant emaciation. Some managers only cull those cows that experience clinical disease, whereas other managers would like to detect and cull subclinical animals that might transmit Map to herd mates, are less productive, or both. Thus, a general framework to devise an optimal test-and-cull policy for paratuberculosis would benefit from a test that allows for multiple classifications of infection. Nielsen et al. (2002b) suggested 3 stages of Map infection: noninfected cows, infected cows with predominating cell-mediated immune responses, and infected cows with predominating humoral immune responses. The infected cows with predominating cell-mediated immune responses are assumed to have reduced antibody titers during the primary cell-mediated immune responses. During humoral immune responses, antibody titers are expected to be elevated. Hence, the 3 "infection groups" defined above may be assumed to correspond to 3 "immuno-groups": noninfected (having no antibodies); infected, with reduced antibody titers; and infected, with elevated antibody titers. Validating such immuno-groups requires repeated testing using fecal culture (FC), which is time consuming and expensive. Fecal culture generally takes 12 wk and sampling requires extra work compared with a milk-based indirect ELISA. 
However, given that a link between the immuno-groups and the infection groups exists, antibody testing could provide a tool for inference about the amount of bacterial shedding, and hence, be used as decision support to livestock producers rather than using the more cumbersome and expensive FC method.
Other influences (such as parity and stage of lactation) on optical densities (OD) of the milk ELISA have been demonstrated previously (Nielsen et al., 2002c). Statistical models should include these covariates to improve the interpretation of the test result. Furthermore, such models should be easily adapted to repeated measures of test results on individual cows because this might be an element of a future testing regimen. Standard statistical methods such as mixed linear normal models (e.g., PROC MIXED, v. 8.2; SAS Inst., Inc., Cary, NC) are well established for inference from such models. Such models, however, describe the distribution of OD given infection status, whereas diagnostic inference requires the reverse: the conditional distribution of the infection status given the observed OD and relevant covariates of the cow. This becomes possible with comparatively little effort using probabilistic expert systems, such as Bayesian networks (Cowell et al., 1999).
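The reversal of conditioning described here is an application of Bayes' theorem. As a sketch only, with symbols chosen for illustration rather than taken from the paper, let $y$ denote the observed OD, $x$ the covariates of the cow, $f(y \mid FC = i, x)$ the density implied by the statistical model, and $\Pr(FC = i)$ the prior probability of infection status $i$ (assumed here not to depend on $x$):

$$
\Pr(FC = i \mid y, x) = \frac{f(y \mid FC = i, x)\,\Pr(FC = i)}{\sum_{j} f(y \mid FC = j, x)\,\Pr(FC = j)}
$$

A Bayesian network carries out this computation automatically once the conditional distributions have been specified.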
The objective of the current study was to demonstrate how continuous-test data (and covariates) can be used for diagnostic testing when diseases are allowed to have multiple stages, as exemplified by an ELISA used for classifying lactating dairy cows into the 3 infection stages of paratuberculosis as defined above.
| MATERIALS AND METHODS |
|---|
Test Methods and Classification Scheme
Fecal samples were cultured for 12 wk and classified as either negative or positive with varying degrees of bacterial growth. Positive cultures were confirmed by PCR detecting IS900. The test was described in detail in Nielsen et al. (2004). Based on these classifications, the cows were trichotomized according to their Map infection status: 1) cows in 5 herds that never had any culture-positive cows were assumed free of paratuberculosis and all tested cows from these herds were classified as fecal-culture negative (FCneg); 2) from the 8 herds with known problems, FC-positive cows with some negative cultures and whose positive cultures always had only few counts of bacteria (< 10 cfu/g of feces) were defined as fecal culture low (FClow); and 3) from the 8 herds with known problems, FC-positive cows with elevated bacterial counts (≥ 10 cfu/g of feces) or many repeated FC-positive tests (with any nonzero counts of bacteria) were defined as fecal culture high (FChigh). The latter group of cows can be perceived as those failing to control the Map infection.
Culture-negative cows from the 8 herds with known problems were designated "fecal-culture unidentified" and excluded from further analysis. The poor sensitivity of the FC prohibited a reasonable diagnosis of these cows. Based on this classification, 3983 results from 737 cows were included in the study.
The milk samples were tested in an ELISA for antibody based on Mycobacterium paratuberculosis strain 18 (an M. avium ssp. avium strain) from Allied Monitor (Fayette, MO). This test was described in detail previously (Nielsen et al., 2001, 2002c). The test results of the samples were classified based on their OD value from the ELISA reader. The OD was corrected for interplate variation by subtracting the OD of a negative control tested in each ELISA plate. Information on DIM, current parity, and the age at first calving of the tested cows was extracted from the Danish Cattle Database.
Statistical Models
Initial analyses showed that the variance of the OD values increased with increasing mean value. Thus, the corrected OD were log-transformed to stabilize their variance. Initially, a mixed linear normal model was formulated using random effects to take into account repeated measurements. Subsequent tests of model assumptions showed that variance homogeneity could not be assumed, and a more detailed variance model had to be formulated. The detailed model specification follows.
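The two preprocessing steps (correction for interplate variation and log transformation) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, the natural logarithm is assumed because the base is not stated, and raising an error for nonpositive corrected values is a choice made here for the example only.

```python
import math


def corrected_log_od(raw_od, negative_control_od):
    """Return log of the plate-corrected OD.

    raw_od: OD of the milk sample from the ELISA reader.
    negative_control_od: OD of the negative control on the same plate.
    Assumption: natural log; the source does not state the base.
    """
    corrected = raw_od - negative_control_od
    if corrected <= 0:
        # The source does not say how such values were handled;
        # failing loudly is an assumption made for this sketch.
        raise ValueError("corrected OD must be positive to take a log")
    return math.log(corrected)
```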
Paratuberculosis is a chronic infection, and infection is often assumed to occur during calfhood. Thus, age was included in the model by using the proxies parity, stage of lactation (Nielsen et al., 2002a), and age at first calving. Because the association between log OD and age did not seem to be linear, however, DIM and age at first calving were coded as ordinal variables. Days in milk was divided into 0 to 1, 2 to 11, 12 to 27, 28 to 40, and > 40 wk after calving. Age at first calving was dichotomized into ≤ 28 and > 28 mo. Parity was modeled as first, second, and third or greater parity. Although these variables were expected to influence OD for FClow and FChigh cows, the same effect was not expected for FCneg cows. Therefore, interaction between FC classification and the age covariates was expected. Furthermore, variance heterogeneity was expected between FC types, necessitating residual variances being allowed to vary. Variation among cows within herds was included as a random effect of cows nested within herds.
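The coding of the age covariates can be sketched as below. The function names are hypothetical, and two details are assumptions of this sketch: weeks after calving are taken as completed weeks (days in milk divided by 7, rounded down), and the lower age-at-first-calving class is taken to include exactly 28 mo.

```python
def dim_group(days_in_milk):
    """Map days in milk to DIM group 1..5 (weeks 0-1, 2-11, 12-27, 28-40, >40)."""
    if days_in_milk < 0:
        raise ValueError("days in milk cannot be negative")
    week = days_in_milk // 7  # completed weeks after calving (assumption)
    if week <= 1:
        return 1
    if week <= 11:
        return 2
    if week <= 27:
        return 3
    if week <= 40:
        return 4
    return 5


def parity_group(parity):
    """Map parity to 1, 2, or 3 (third or greater)."""
    if parity < 1:
        raise ValueError("parity must be at least 1")
    return min(parity, 3)


def age_first_calving_group(months):
    """Return 1 for <= 28 mo and 2 for > 28 mo."""
    return 1 if months <= 28 else 2
```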
The initial model contained all main effects and their first-order interactions and assumed homogeneous variance. Model selection was performed by adding variance heterogeneity among subgroups when significant (tested by the likelihood-ratio test, α = 0.05) and sequentially removing nonsignificant (using a 2-tailed test with α = 0.05) fixed effects (by using SAS type III test using Satterthwaite's approximation for calculation of the degrees of freedom for the test). This procedure produced the following model for the log-transformed OD [log(OD)]:
![]() | ([1]) |
In this model, $\log(\mathrm{OD})_{himn}$ is the log-transformed corrected OD of the $n$th recording of the $m$th cow in herd $h$ within the $i$th FC type. $FC_i$ is the systematic effect of the $i$th FC type, $i = 1, 2, 3$; $P_{J_{himn}}$ is the systematic effect of parity, $J = 1, 2, 3$, with $J = 3$ indicating cows with parity > 2; $DIM_{K_{himn}}$ is the systematic effect of the $K$th DIM group, $K = 1, \dots, 5$; $AC_{L_{him}}$ is the systematic effect of the age at first calving, $L = 1, 2$ (subscripts $L_{him}$ indicate that age at first calving is constant for a cow); $[COW(FC \times HERD)]_{him} \sim N(0, \sigma^2_C)$ is the random effect of cow within herd and FC type; and $\varepsilon_{himn} \sim N(0, \sigma^2 [\exp(U\delta)]_{iJ_{himn}})$, i.e., residuals are identically Normal distributed within each combination of FC type and parity, with $\sigma^2$ as the common intercept term of the residual variance, $U$ as the design matrix reflecting the combination of FC type and parity, and $\delta$ as an 8-dimensional vector of estimated parameters.
![]() |
Thus, the expression $\sigma^2 [\exp(U\delta)]$ gives a 9-dimensional vector: the first 3 elements give the variance for parity 1, 2, and ≥ 3 for FCneg cows, the next 3 elements give the same variances for FClow cows, and the last 3 give variances for FChigh cows. Hence, given estimates of $\sigma^2$ and $\delta$, the variance for FCneg and first-parity cows can be calculated as $\sigma^2 \exp(\delta_1 + \delta_3 + \delta_5)$, the variance for FClow, first parity as $\sigma^2 \exp(\delta_2 + \delta_3 + \delta_7)$, etc.
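The variance calculation can be sketched in code. Here σ² denotes the common intercept term of the residual variance and δ the 8-dimensional parameter vector; these symbol names, the function names, and the column ordering of the design matrix are assumptions of this sketch. The ordering (2 FC columns, 2 parity columns, 4 interaction columns, with FChigh and parity ≥ 3 as reference levels) was chosen because it reproduces the two index combinations quoted in the text; the paper's actual design matrix may differ.

```python
import math

FC_TYPES = ("FCneg", "FClow", "FChigh")  # FChigh is the assumed reference level
PARITIES = (1, 2, 3)                      # 3 means parity >= 3, assumed reference


def design_row(fc, parity):
    """Return the 8-element row of U for one FC type and parity group."""
    i = FC_TYPES.index(fc)
    j = PARITIES.index(parity)
    row = [0] * 8
    if i < 2:
        row[i] = 1                  # delta_1 (FCneg) or delta_2 (FClow)
    if j < 2:
        row[2 + j] = 1              # delta_3 (parity 1) or delta_4 (parity 2)
    if i < 2 and j < 2:
        row[4 + 2 * i + j] = 1      # delta_5..delta_8 (interaction terms)
    return row


def residual_variance(sigma2, delta, fc, parity):
    """Compute sigma^2 * exp(u . delta) for one FC type and parity group."""
    row = design_row(fc, parity)
    return sigma2 * math.exp(sum(u * d for u, d in zip(row, delta)))
```

With this layout, `design_row("FCneg", 1)` selects δ1, δ3, and δ5, and `design_row("FClow", 1)` selects δ2, δ3, and δ7, matching the two cases given in the text.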
Bayesian Network Model
Parameter estimates from the statistical model from equation 1 were used directly to form a Bayesian network (also known as a probabilistic network; Cowell et al., 1999). The qualitative part of this Bayesian network is shown in Figure 1. Bayesian networks may be constructed from expert knowledge concerning the domain. However, a network that corresponded closely to the statistical model with intermediate nodes to model the 2-way interactions was preferred. A special kind of Bayesian network was used, a Continuous Gaussian graph, in which nodes are allowed to be either discrete (ellipses with solid borders) or continuous (Gaussian; ellipses with double-lined borders) (Cowell et al., 1999). The arrows from one node (the parent) to another (the child) indicate a description of the conditional distribution of the values in the node given the values of its parents. For example, in Figure 1, DIM and FC are "parents" of the D x FC node. The discrete nodes were used for modeling the categorical design variables from the model (e.g., whether parity was at level 1, 2, or ≥ 3). The first level of continuous nodes was used to represent the parameter estimates and the further levels to represent the subsequent summation of the random and fixed effects. The intermediate summation might have been omitted but facilitates the modeling when repeated measurements on the same cow are modeled. Each continuous node has an associated table with the means and variances for the Normal distribution for each possible configuration of the parent nodes, thus utilizing the standard errors as well as the estimated parameters. For example, the line in the table describing the distribution of the continuous node (FC x P) with fecal type FCneg and first parity would contain the estimate of the parameter (P x FC)(1,FCneg) = 0.08 from the model in equation 1 as mean and the square of the corresponding SE (= 0.03) as variance (Table 1). The res node (representing the residual) would have a mean of 0 and the square of the SD [= 0.26 for fecal type FCneg and parity 1 (Table 2)] as variance. In Figure 1, the intercept and the main effects are pooled within the interactions.
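How the node tables combine can be illustrated with a small sketch. It uses the two values quoted above for FCneg, first-parity cows and assumes the nodes are independent, so that means and variances add under summation. Only 2 of the network's continuous nodes are included, so the result is not the full predictive distribution of log OD; the function names are hypothetical.

```python
def gaussian_node(mean, spread):
    """Return (mean, variance) for a node whose table stores a squared SE or SD."""
    return (mean, spread ** 2)


def sum_independent(*nodes):
    """Sum independent Gaussian nodes: means add and variances add."""
    total_mean = sum(mean for mean, _ in nodes)
    total_var = sum(var for _, var in nodes)
    return (total_mean, total_var)


# (FC x P) node for FCneg, parity 1: estimate 0.08 with SE 0.03 (Table 1).
fc_by_parity = gaussian_node(0.08, 0.03)
# res node for FCneg, parity 1: mean 0 with SD 0.26 (Table 2).
residual = gaussian_node(0.0, 0.26)

partial_sum = sum_independent(fc_by_parity, residual)
```

Because the SE enters as a variance, parameter uncertainty widens the distribution of the summation node instead of being discarded.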
Because age at first calving, parity, and stage of lactation are known, evidence will overrule the prior, and uniform distributions may be used. The prior distribution on the proportion of FCneg, FClow, and FChigh cows should be given more attention, ideally reflecting the distribution in the population where the Bayesian network will be applied. Distribution in data, however, is not necessarily representative; a herd-specific prior might even be required. Hence, for the illustrative purposes of this study, the prior distribution of FC type is also assumed uniform.
| RESULTS |
|---|
The estimate of the herd effect was $\sigma^2_C$ = 0.0712 (SE = 0.0053; P < 0.0001). Given the estimates of the common intercept term of the residual variance ($\sigma^2$ = 0.14) and the vector $\delta$ = (0.84, 0.32, 0.12, 0.30, 0.02, 0.96, 0.30, 0.18) the residual variance for the individual combinations of FC type and parity was calculated as:
![]() |
The square roots of these estimates are given in Table 2. The estimated SD showed more variation in the log OD for FChigh than FClow and FCneg cows within each parity group. The main difference in variance, however, was due to differences between FCneg and the other 2 classes (as expected). Variation between measurements of cows greater than second parity seemed to be less than between first-parity cows.
Using the estimates of Table 1, the random herd effect, and the calculated residual SD (Table 2), the quantitative part of the Bayesian network shown in Figure 1 was constructed. Using this network, the distribution of FCneg, FClow, and FChigh for any given log OD, parity, age at first calving, and DIM could be estimated. As each combination gives a different probability (and log OD is a continuous variable), the results of these propagations are best presented as graphs (Figure 3) using stacked plots to emphasize that Pr(FCneg) + Pr(FClow) + Pr(FChigh) = 1 for any given combination of log OD and age covariates. In Figure 3, the probabilities of FC type for a given log OD are given for 9 different combinations of age covariates (the remaining combinations showed similar results, but were omitted to save space). For a given log OD, Pr(FCneg), Pr(FClow), and Pr(FChigh) differ substantially among parities. This is perhaps best seen when comparing small log OD and the associated Pr(FCneg). Although the result for first-parity cows seemed inconclusive, the result for cows of third or greater parity indicated that small log OD are more likely to be the result of an FCneg cow.
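The propagation that yields the 3 probabilities for a given log OD can be approximated by a direct application of Bayes' theorem. The sketch below is a simplification: it treats log OD as Normal within each FC type for one fixed covariate combination, and the means and SD in `ILLUSTRATIVE_PARAMS` are invented for illustration and are not estimates from this study. The uniform default prior mirrors the one assumed in the paper.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_fc(log_od, params, prior=None):
    """Return Pr(FC type | log OD) as a dict summing to 1.

    params: {fc_type: (mean, sd)} of log OD for one covariate combination.
    prior:  {fc_type: probability}; uniform if omitted.
    """
    if prior is None:
        prior = {fc: 1.0 / len(params) for fc in params}
    weights = {
        fc: prior[fc] * normal_pdf(log_od, mean, sd)
        for fc, (mean, sd) in params.items()
    }
    total = sum(weights.values())
    return {fc: w / total for fc, w in weights.items()}


# Hypothetical values, for illustration only.
ILLUSTRATIVE_PARAMS = {
    "FCneg": (-1.0, 0.3),
    "FClow": (-0.5, 0.6),
    "FChigh": (0.5, 0.8),
}
```

Evaluating `posterior_fc` over a grid of log OD values and stacking the 3 probabilities gives a plot of the same kind as Figure 3; replacing the uniform prior with a herd-specific one changes only the `prior` argument.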
| DISCUSSION |
|---|
Using a Bayesian network representation of the statistical model allowed the uncertainty associated with the estimates of the statistical model to be represented as variance on the Gaussian nodes. Hence, the estimates of probabilities provided by the Bayesian network are estimates based on the estimated parameters and their associated uncertainty. This implies that the resulting probabilities give a more realistic picture of the properties of the ELISA as a tool for classification of cows with respect to their bacterial shedding status than a traditional approach in which the misclassification is presented in terms of sensitivity and specificity of a test. Although the uncertainty associated with these estimates is often acknowledged, it is usually not used in the further interpretation of the results.
When using a trichotomized infection status rather than the usual dichotomization, the concepts of sensitivity and specificity lose their intuitive interpretation. The same applies to likelihood ratios such as those in Collins (2002), although they could be used if the only objective was to handle continuous tests. The Bayesian network and graphical representation of probabilities, however, could just as well be adapted when the traditional dichotomous disease classification is used (or generalized into more than 3 disease stages). Using Bayesian networks for decision support does not require construction of stacked probability plots, which are only used here for illustrative purposes. The statistical model showed that covariates influence the relationship between FC type and OD. Such covariates should be taken into account when interpreting the diagnostic test. This is further discussed in Nielsen and Toft (2002) for a traditional dichotomous disease definition. Consequently, computer-based methods are needed when interpreting the OD. This problem is not unique to the current study and rather than trying to further justify the use of covariates for test interpretation, it is probably more appropriate to challenge the justification for ignoring these known covariates. Visual comparison of the empirical distribution in Figure 2 and the individual distributions of Figure 3 seemingly shows an effect (gain) by including covariates. The plots in Figure 3 differ visually between different configurations of parity and DIM category. Whether these fluctuations are large enough to alter a culling decision based on an OD is another question that will be left for future studies.
The diagnostic properties of tests for Map are rather poor when used for just 2 stages of infection, and adding a third infection stage is unlikely to improve this. Throwing away information by dichotomizing or trichotomizing the test interpretation will further decrease the value of the test. To illustrate this, consider the situation shown in the lower left plot of Figure 3 and assume that 2 cut-offs were introduced so that log OD ≤ −0.5 was interpreted as FCneg, −0.5 < log OD ≤ 0.5 as FClow, and 0.5 < log OD as FChigh. Using this approach, it would be possible to estimate parameters describing the probability of correctly classifying into each of the 3 categories. However, each log OD below −0.5 would have the same probability of being correctly (or wrongly) classified, whereas using the log OD directly would give more confidence of a cow being FCneg when the log OD equals −1.0 compared with −0.55.
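The loss of information from fixed cut-offs can be made concrete with a short sketch. The cut-offs are assumed here to lie at −0.5 and 0.5 on the log OD scale, and the function name is hypothetical.

```python
def classify_by_cutoffs(log_od, lower=-0.5, upper=0.5):
    """Trichotomize a log OD using 2 fixed cut-offs (assumed values)."""
    if log_od <= lower:
        return "FCneg"
    if log_od <= upper:
        return "FClow"
    return "FChigh"


# Two readings that differ markedly still receive the same label,
# so any difference in confidence between them is discarded.
label_far_below = classify_by_cutoffs(-1.0)
label_just_below = classify_by_cutoffs(-0.55)
```

Both readings are labeled FCneg, whereas a posterior probability computed from the log OD itself would distinguish them.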
Eventually, it comes down to whether an interpretation of a cow being 70% FCneg, 20% FClow, and 10% FChigh is manageable. The alternative is to classify the cow as 1 of the FC types and accept that misclassification occurs. When used in a computer-based decision support system, there is no need to know the exact state of the cow with respect to infection with Map. Thus, there seems to be more benefit in representing the uncertainty associated with a given test result directly than through the traditional concept of test properties (such as sensitivity and specificity, which cannot be used here). Still, for a given set of conditions there is an optimal cutoff value for the OD in which all test results above that OD result in an action (such as culling). The point is that a continuous test (such as OD) combined with additional information allows this cutoff value to be assessed for different combinations of age at calving, parity, DIM, and with respect to a specific purpose of the test (i.e., control vs. eradication).
Testing and culling dairy cows according to their Map infection status should ideally be combined with the traits traditionally used to assess the value of a dairy cow in replacement models (e.g., milk yield, reproductive problems, and age); this is reviewed by Kristensen (1994). In Houben et al. (1994), a model that includes health traits such as mastitis was used for optimal replacement of dairy cows. Recently, Gröhn et al. (2003) developed a model that considered several different diseases when optimizing the replacement of dairy cows. These replacement models all assume that relevant information can be observed with certainty. This assumption, however, cannot be justified when addressing paratuberculosis. The true bacterial shedding state (or FC type) of the cow is generally not known and requires extensive and repeated testing (as described earlier in this paper) due to intermittent shedding and low diagnostic sensitivity in the early stages of infection. Fecal sampling and culturing is also more expensive and time consuming compared with milk sampling and testing using ELISA. Milk samples are already retrieved within the Danish milk-recording scheme for 88% of the Danish dairy herds. Thus, it might be tempting to apply the OD directly in the modeling without using the FC types. However, when modeling the effects of paratuberculosis on the risk of transmitting the infection to other animals, the FC types make more biological sense than the OD. Furthermore, methods for handling the partial observability within a traditional sequential decision-support framework with an infinite time horizon are being developed (Nilsson and Kristensen, 2002).
The present study uses a classification of cows into 3 different categories with respect to paratuberculosis infection status. The procedure used for classification as well as the interpretation of the 3 groups can be questioned. The low sensitivity of the FC might introduce potential bias. However, the primary purpose of this study was to demonstrate that the use of dichotomized disease status to interpret a test result and the use of sensitivity and specificity to evaluate the assay method have shortcomings that make it worthwhile to consider other ways of interpreting and representing the uncertainty of diagnostic tests. The framework presented herein can easily be extended to allow for repeated tests, multiple diseases, or both. If the disease status cannot be determined by elaborate test schemes, then the statistical analysis needs to be carried out as a latent class analysis, such as the mixture model used in Nielsen et al. (2003).
| ACKNOWLEDGEMENTS |
|---|
Received for publication January 25, 2005. Accepted for publication July 15, 2005.
| REFERENCES |
|---|