1 Department of Dairy and Animal Science, The Pennsylvania State University, University Park 16802
2 Department of Biostatistics/School of Public Health, University of Washington, Seattle 98195
3 Department of Animal Science, University of Tennessee, Knoxville 37996
4 Department of Agricultural Education, University of Florida, Gainesville 32611
Corresponding author: A. J. Heinrichs; e-mail: AJH{at}psu.edu.
| ABSTRACT |
|---|
|
|
|---|
Key Words: calf and heifer management, multiple imputations, longitudinal data analysis
Abbreviation key: AFC = age at first calving, ADG = average daily gain, BCSC = BCS at calving, BWC = BW at calving, MAR = missing at random, MCAR = missing completely at random, MI = multiple imputation, MNAR = missing not at random, MM = mixed models, WHC = withers height at calving
| INTRODUCTION |
|---|
|
|
|---|
Research has shown an association between management and health of dairy heifers. A study of 26 dairy herds in New York by Curtis et al. (1988) indicated that management directly affected the risk of respiratory illness within 14 d of birth. The environment in which the calf is raised also has a profound effect on health and growth. Pritchard et al. (1981) found that treatment for respiratory disease and lung infections in veal calves was directly related to daily weight gains and was associated with air quality in the animal unit. Waltner-Toews et al. (1986) found that a variety of management and housing factors were related to calf and heifer morbidity and mortality. Many of these variables were related to farm size and season. Furthermore, Heinrichs et al. (1987) showed associations among mortality, herd size, and person caring for the calf.
The effect of early calfhood health status on survivorship and age at first calving (AFC), after controlling for the farm effect, has been examined (Waltner-Toews et al., 1986). Heifers treated for pneumonia during the first 3 mo of life were 2.5 times more likely to die after 90 d of age than heifers that had not been treated. Heifers that had been treated for diarrhea as calves were 2.5 times more likely to be sold and 2.9 times more likely to calve after 900 d (30 mo) of age than other heifers (Waltner-Toews et al., 1986). Calving at ≤730 d (24 mo) of age is much more economical because of the extra costs and lost production associated with older calving ages (Gabler et al., 2000; Hoffman and Funk, 1992).
In a large study, Curtis et al. (1989) followed 1171 Holstein heifer calves on several New York dairy farms. Their findings yielded incidence rates for scours of 9.9% within 14 d of birth and 5.2% from 15 to 90 d of age, 7.7% for calves displaying dullness, and 7.4% for calves with respiratory illness. This study was followed by Correa et al. (1988), who evaluated the effects of calf morbidity on AFC in the same animals. Heifers without respiratory illness as calves were twice as likely to calve and calved 6 mo earlier compared with those with respiratory illness as calves. An unexpected result from this study was that heifers displaying dullness or unthriftiness as calves were 1.6 times more likely to calve and calved 2 mo earlier than heifers that had not displayed dullness as calves. Dullness would be expected to increase AFC because of anticipated lower growth rates from inadequate feed intake and less active or normal behavior.
Health status of dairy heifers has been shown to have a significant impact on growth rate of calves especially during the first 6 mo of life (Donovan et al., 1998). Season of birth and occurrence of diarrhea, septicemia, and respiratory disease can significantly decrease heifer growth (height and weight). Donovan et al. (1998) reported that these variables plus farm, birth weight, and exact age when 6-mo data are collected explained 20 and 31% of the variation in BW and pelvic height growth, respectively, from birth to 6 mo. Septicemia and pneumonia slowed growth by 13 to 15 d (to reach the same weight as healthy calves) during the first 6 mo; diarrhea had a much smaller influence on growth (Donovan et al., 1998). Passive transfer of colostral immunoglobulins had no direct effect on growth but did influence weight and height through its effect on health (Donovan et al., 1998).
Previous work by Place et al. (1998) showed that housing and season had significant effects on average daily gain (ADG). Other variables, such as calving location, parity of the dam, and delivery score at calving, had significant effects on ADG to 4 mo of age. The present study was carried out to follow these same animals beyond 4 mo of age and up to calving. The introduction of missing data between the first and second phases of this study could introduce bias if traditional screening and listwise deletion methods were used. However, because of the intensive nature of the initial phase of the study by Place et al. (1998), a great deal more is actually known about the farms and animals involved in the study, and reliable estimates of missing responses and predictors could be made if newer methods of statistical analyses were used. Therefore, the objectives of the study were to incorporate a statistical technique for missing data, called multiple imputation (MI; Rubin, 1987), to investigate potential factors that affect calving-related measures in dairy heifers and to evaluate the applicability of MI in analyzing field data. Our project was undertaken to study the possible residual effects of calf management practices, nutrition, and environment until early adulthood and how calf-related events might affect AFC, BW at calving (BWC), withers height at calving (WHC), and BCS at calving (BCSC).
| MATERIALS AND METHODS |
|---|
|
|
|---|
During each biweekly visit, animals were identified, and health records were updated or collected for the previous 2 wk. Body weight and withers height were recorded until 4 mo of age for each calf. Individual feed intake was measured at each visit, and feed samples were collected for analysis as previously reported (Place et al., 1998). At each visit, measurements were taken for NH3 concentration, current humidity and temperature, and maximum and minimum temperatures in each housing area throughout the 2 wk prior to the visit. Temperature and humidity were determined with a digital hygrometer and thermometer (Fisher Scientific, Pittsburgh, PA), and NH3 was determined with a Kwik-Draw basic ammonia detector pump (Mine Safety Appliances Co., Malvern, PA).
Farm management practices were recorded using existing management survey instruments from the National Dairy Heifer Evaluation Project (Heinrichs et al., 1994). Information also was obtained from DHIA, calving and breeding records, and herd veterinarians, as needed.
Following the initial 18-mo phase of the study, farms were visited every 3 mo to follow health events, breeding, and animal movement. Once heifers were near calving, farms were visited every 2 to 4 wk to collect calving information. Data collected within 2 wk of calving included age at calving, BWC, WHC, BCSC (Edmondson et al., 1989), and health events occurring at calving. Data collection concluded on each farm when all identified heifers had calved.
Three of the original farms dropped out of the second phase of the study for reasons unrelated to the study. Also, only animals that survived to the completion of the first phase of the study (112 d) were used. The number of calves for the second phase was, therefore, reduced to 686 on 18 farms. All procedures involving animals were in accordance with approved guidelines of The Pennsylvania State University Institutional Animal Care and Use Committee.
Multiple Imputation
Statistical analyses were performed using MI. Multiple imputation is a method of dealing with missing data where plausible values replace missing data. However, MI differs from ad hoc methods, such as imputing the mean, because MI replaces each missing value multiple times; as a result, the error structure is preserved so that valid inferences can be made. Rubin (1987) first introduced the method of MI and developed the rules and assumptions to follow. Reviews (Allison, 2002; Schafer, 1999; Schafer and Olsen, 1998) and a detailed application of the method (Schafer, 1997) have been published. Although many methodological points could be considered (Brand et al., 2003; Horton et al., 2003; Kim, 2004), generally MI results are considered better than other ad hoc, single imputation methods of dealing with missing data because the error structure is preserved (Little and Rubin, 1987; Rubin, 1987, 1996). Currently, MI is commonly used in medical and social sciences, and software exists to carry out the analysis.
In a standard MI analysis, data augmentation (Tanner and Wong, 1987) is used to impute random values for missing values based on the joint distribution of all variables in the data set. This is done several times and results in multiple complete data sets to which the analyst's model can be applied. The set of variables used for data augmentation is referred to as the imputation model. Each of these data sets is then analyzed individually as a complete data set, using what is referred to as the analyst's model, and the results of the separate analyses are combined in a statistically valid manner using Rubin's Rules for Scalar Estimands (Rubin, 1987).
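The idea of replacing each missing value several times with a random draw can be illustrated with a small sketch. This is not the data augmentation algorithm used in the study: it is a simplified stochastic regression imputation that ignores uncertainty in the regression parameters, and the variable names and heifer values below are hypothetical.

```python
import random
import statistics


def impute_once(rows, rng):
    """Return a copy of rows with each missing y replaced by a random draw.

    rows is a list of (x, y) pairs; y is None when the response is missing.
    Simplification (assumption): regression parameters are treated as known,
    so this is not a full data augmentation scheme.
    """
    observed = [(x, y) for x, y in rows if y is not None]
    xs = [x for x, _ in observed]
    ys = [y for _, y in observed]
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)

    # Ordinary least squares fit of y on x using the complete cases only.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in observed) / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard deviation supplies the random error for each draw.
    resid = [y - (intercept + slope * x) for x, y in observed]
    sigma = (sum(r * r for r in resid) / (len(observed) - 2)) ** 0.5

    return [
        (x, y if y is not None else intercept + slope * x + rng.gauss(0.0, sigma))
        for x, y in rows
    ]


def multiply_impute(rows, m=5, seed=0):
    """Create m completed data sets, each with different random draws."""
    rng = random.Random(seed)
    return [impute_once(rows, rng) for _ in range(m)]


# Hypothetical data: (withers height in cm, body weight in kg or None).
heifers = [
    (135, 560.0), (137, 575.0), (139, 590.0), (141, None),
    (143, 620.0), (145, None), (140, 600.0),
]
completed = multiply_impute(heifers, m=5)
```

Each element of `completed` is a full data set in which observed values are unchanged and the missing weights differ from one data set to the next, which is what lets the between-imputation variability be estimated later.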
As in all statistical methods, MI requires that certain assumptions be made. The most crucial of these assumptions is the mechanism for missing data, which can be categorized as missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). This categorization is based on the work of Rubin (1976); see also Diggle and Kenward (1994), Diggle et al. (1994), and Little and Rubin (1987). Missing completely at random applies when the likelihood of a missing response depends on neither observed nor unobserved values of other variables in the analysis. It is a special case of random sampling and is actually the assumption made when using methods such as case deletion. Missing completely at random is rarely a plausible assumption when actual experimental data are used.
For the second case, MAR, the likelihood of a missing response depends only on observed values of other variables in the data set. In this case, the observed data are considered the result of random sampling within subclasses that are defined by the observed data. For instance, BCSC data may be missing for a heifer because she was sold prior to calving. Rarely does a farmer sell an animal based on a random sampling method, so the mechanism is certainly not MCAR. However, if the relationship of the missingness of the BCSC data can be explained using other variables in the data set, for instance genetic or management data, the missingness mechanism can be considered MAR. Thus, the complete data are considered to be a random sample of all animals of that subclass of genetic and management data. Missing at random is the mechanism assumed in MI as well as maximum likelihood.
The third category of missingness, MNAR, applies when the likelihood of a missing response depends on both observed and unobserved data. In this case, if the data were analyzed and then the complete data became available, results of the first and second analyses would not agree. This would be true even in cases where the observed covariates are considered. To continue with the previous example, farmers may cull heifers because of genetic or management factors; however, if these data were not available for the analyses and, therefore, could not be included, the results of any analyses of observed first lactation animals would be biased, and the missing data would be MNAR.
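The practical difference among the three mechanisms can be seen in a small simulation. The sketch below is purely illustrative: the subclass variable, the outcome distribution, and the missingness probabilities are all assumptions chosen for the example, not values from this study.

```python
import random
import statistics


def simulate(n=20000, seed=1):
    """Generate (x, y) pairs: x is an observed subclass, y the response."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0          # observed subclass (hypothetical)
        y = 10.0 + 2.0 * x + rng.gauss(0.0, 1.0)    # response depends on x
        data.append((x, y))
    return data, rng


def observed_under(data, rng, mechanism):
    """Drop responses according to an assumed missingness mechanism."""
    kept = []
    for x, y in data:
        if mechanism == "MCAR":
            p_miss = 0.4                            # unrelated to x or y
        elif mechanism == "MAR":
            p_miss = 0.7 if x == 1 else 0.1         # depends on observed x only
        else:                                       # MNAR
            p_miss = 0.7 if y > 11.0 else 0.1       # depends on y itself
        if rng.random() >= p_miss:
            kept.append((x, y))
    return kept


def complete_case_mean(kept):
    """Mean of y using only the animals whose response was observed."""
    return statistics.fmean(y for _, y in kept)


def subclass_adjusted_mean(data, kept):
    """Weight each subclass mean by the subclass share in the full sample."""
    total = 0.0
    for level in (0, 1):
        share = sum(1 for x, _ in data if x == level) / len(data)
        total += share * statistics.fmean(y for x, y in kept if x == level)
    return total


data, rng = simulate()
full_mean = statistics.fmean(y for _, y in data)
mcar = observed_under(data, rng, "MCAR")
mar = observed_under(data, rng, "MAR")
mnar = observed_under(data, rng, "MNAR")
```

Under these assumptions, `complete_case_mean(mcar)` stays close to `full_mean`, `complete_case_mean(mar)` is biased but `subclass_adjusted_mean(data, mar)` recovers the full mean, and neither estimate is recovered under MNAR because the information needed to correct the bias was never observed.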
Another way to categorize missing data is to classify them as ignorable or non-ignorable (Rubin, 1987). Ignorability refers to the process that generates the missing responses and tells us whether or not we need to model it. Ignorable data include MCAR and MAR; non-ignorable data pertain to MNAR. The exact definition is found in Rubin (1976). The term ignorable actually refers to situations in which appropriate methods are used in the analyses of the data; then, inferences can be drawn that pertain to the entire population of interest, including cases where data are missing.
In many field studies, the actual mechanism leading to missing data is not completely known. Almost always, there is no method to prove that the missingness process in a data set is MCAR or MAR; so, missing data techniques cannot be applied with 100% certainty. However, no model can be applied with 100% certainty, and it is up to the researcher to consider the plausibility of alternative models.
In the implementation of MI, it is very important to make the distinction between the imputation model and the analyst model. The analyst model is considered the traditional statistical model used for inferences and is the second stage. The imputation model contains all response and predictor variables and any interactions that may be of interest in the analyst model to be used. In addition, variables can be included in the imputation model that are not of interest in the analyst model, which can add information that helps to satisfy the MAR assumption and can increase the precision of the imputed values (Meng, 1994; Collins et al., 2001). However, it is very important that no term is allowed into the analyst model that was not in the imputation model.
In the current study, not only was more information collected than was of interest in these analyses, but survey data that directly pertained to management practices and outcomes were also available for each farm. For example, the survey data include the typical AFC for a farm, so that information is included in the joint distribution as well as the AFC of farm cohorts. Therefore, the imputed values should be more accurate than if we did not have this added information. This leads to the kitchen sink philosophy, and imputation models tend to be much larger and more complex than subsequent analyst models.
Results of the analyses performed on the imputed data sets are combined using Rubin's Rules for Scalar Estimands (Rubin, 1987). The degrees of freedom for inferences are determined by
$$\nu = (m - 1)\left[1 + \frac{\bar{U}}{(1 + m^{-1})B}\right]^{2}$$
where
$\nu$ = degrees of freedom,
$m$ = the number of imputations,
$\bar{U}$ = the complete data variance estimate, and
$B$ = the between-imputation variance estimate.
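As a minimal sketch of how the pooling step and the degrees of freedom can be computed for a single coefficient, the function below applies the standard form of Rubin's rules for a scalar estimand. The function name, the dictionary keys, and the example numbers are hypothetical and are not taken from the study.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Combine m complete-data results for one scalar estimand.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    """
    m = len(estimates)
    q_bar = statistics.fmean(estimates)      # pooled point estimate
    u_bar = statistics.fmean(variances)      # complete data (within) variance
    b = statistics.variance(estimates)       # between-imputation variance (divisor m - 1)
    total = u_bar + (1.0 + 1.0 / m) * b      # total variance

    # Degrees of freedom; infinite when the imputations agree exactly.
    if b == 0.0:
        df = math.inf
    else:
        df = (m - 1) * (1.0 + u_bar / ((1.0 + 1.0 / m) * b)) ** 2

    return {
        "estimate": q_bar,
        "within": u_bar,
        "between": b,
        "total": total,
        "se": math.sqrt(total),
        "df": df,
    }


# Hypothetical coefficient estimates and variances from m = 5 imputed data sets.
pooled = pool_rubin([1.0, 1.2, 0.9, 1.1, 0.8], [0.04, 0.04, 0.04, 0.04, 0.04])
```

The pooled estimate divided by `pooled["se"]` would then be compared against a t distribution with `pooled["df"]` degrees of freedom.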
All data imputation and analyses were conducted in S-Plus 6.1 (2001; Insightful Corp., Seattle, WA).
Mixed Model Analyses
The data were analyzed with a linear mixed model (MM) in which farm was considered a random component, using PROC MIXED in SAS (1999). PROC MIXED uses maximum likelihood for missing response variables, but uses listwise deletion for missing predictor variables. In the analyses where MI was not used, only 412 to 430 animals of the original 686 observations available for the second stage of this study were available for estimates. For the analyses using multiply imputed data sets, 5 data sets were created. The same linear mixed model analyses were run for each of the 5 data sets, and the results were combined using Rubin's rules for appropriate inferences. Additional information available as an outcome of this method includes the complete data variance, the between-imputation variance, and the total variance. From these variances, the appropriate degrees of freedom for inference testing can be estimated. Responses considered related to growth were AFC, BWC, BCSC, and WHC. The fixed effects in the model were as follows.
Delivery score = difficulty of the heifer's birth (1 = unassisted, 2 = easy pull, and 3 = hard pull, mechanical extraction, caesarean section)
Parity of the heifer's dam (0 = first-calf heifer; 1 = older cow)
Days ill = number of days a heifer had scours or cough during her first 4 mo of life
Days treated = number of days during first 4 mo of life that heifers had scours or cough requiring antibiotic treatment
DMI at weaning
Maximum milk DMI on a BW basis in the first 16 wk
Grain DMI on a BW basis at 16 wk of age
Week of age that heifer first consumed 0.91 kg grain (DM)/d
Phosphorus intake from grain at weaning
ADF content of forage fed at 4 mo of age
Maximum humidity heifer was exposed to during first 4 mo of life
Mean temperature heifer was exposed to during first 4 mo of life
Maximum NH3 heifer was exposed to during first 4 mo of life
Significance in all analyses was declared at P ≤ 0.05, and trends were declared at P ≤ 0.10.
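As a notational sketch of the linear mixed model described above, with farm as a random component, the model for one response can be written as follows. The symbols are introduced here for illustration and do not appear in the original analysis; the normality assumptions are the usual ones for this type of model rather than details stated in the text.

$$y_{ij} = \beta_0 + \sum_{k=1}^{p} \beta_k x_{kij} + u_j + e_{ij}, \qquad u_j \sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2)$$

Here $y_{ij}$ is the response (AFC, BWC, BCSC, or WHC) for heifer $i$ on farm $j$, $x_{kij}$ is the value of the $k$th fixed effect from the list above, $u_j$ is the random farm effect, and $e_{ij}$ is the residual. In the MI analysis, this same model would be fitted to each of the 5 imputed data sets before the coefficients are combined.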
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
The results of the second phase of the study using traditional MM analysis and MI analysis are shown in Table 1. Comparing the MM and MI analyses, the MI analysis uses the full data set with imputations and, therefore, often has a higher level of significance than the MM analysis. In some cases, this changes the level of significance from a trend to a significant difference. The estimates from the MM and MI models are very similar, with similar standard errors, yet because it uses the entire data set rather than only the data from complete observations, the MI analysis has more power. In most cases, the levels of significance do not change with the different analyses. In one specific case where a large amount of data was missing and the missing data were farm specific, the phosphorus intake of the young calf appeared to have a significant effect on AFC using the MM analysis. There is no logic or previous research to justify a negative influence of phosphorus intake on AFC. Indeed, using the MI analysis, where all data from all farms are used, this variable is clearly nonsignificant (P = 0.95). Differences in the results of MM and MI analyses are greatest for BWC (Table 1), where many of the predictor variables differ in level of significance. Some of these effects are illogical and are likely explained more accurately using the MI approach.
Effect of Dam Parity
A trend for increased BWC and BCSC of heifers born to older cows was observed. These heifers could have been larger from birth because of maternal factors or selection of sires. No effect of dam parity was noted for AFC or WHC.
Illness and Antibiotic Treatment Effects
Health of calves during the first 4 mo of life and antibiotic treatment for scours or pneumonia exhibited interesting effects. There was a trend (P = 0.06) for more days treated to increase AFC, and days treated had a significant positive effect on WHC. The increase in AFC would result from illness, and the positive effect on WHC may be explained by the fact that these heifers were older yet not heavier and, therefore, likely nearer to mature structural development. Body weight at calving and BCSC were not affected by days treated.
Feed Intake and Quality Effects
Dry matter intake at weaning did not affect AFC, BWC, BCSC, or WHC. Maximum milk intake positively influenced BCSC and AFC, although the MM and MI analyses did not agree. Maximum milk intake did not affect WHC. The age at which heifers began consuming 0.91 kg of grain and grain intake at 16 wk did not affect AFC, BWC, BCSC, or WHC using the MI method. Intake of phosphorus from grain had a slight positive effect on BWC. Forage ADF, which represents the amount of fiber and, therefore, inversely the energy level of the forage, showed that farms and heifers with poorer quality forage tended to have increased AFC, as would be expected. Forage ADF had no effect on BWC, WHC, or BCSC. Level of nutrition of the calf would be expected to affect growth rate positively with respect to milk intake and negatively with respect to high fiber, low energy diets (NRC, 2001). The lower energy forage fed to calves also could be indicative of forage quality for all heifers on the farm, but not necessarily. If breeding decisions were being made by BW or other growth measures, then it is logical that AFC would be the only variable affected by lower levels of nutrition.
Effects of Calf Housing Environment
The final 3 variables tested were related to calf housing environment. Higher humidity and temperature created an environment that increased AFC. In part, this could be due to increased subclinical disease or stress. Although the increases in AFC were not large for each variable, they were significant and represent a real economic disadvantage to the farms where these environment indicator levels were high. Temperature, humidity, and NH3 levels also significantly affected BWC, BCSC, and WHC in some cases; however, these increases can probably be attributed to older AFC.
| CONCLUSIONS |
|---|
|
|
|---|
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
Received for publication February 14, 2005. Accepted for publication April 11, 2005.
| REFERENCES |
|---|
|
|
|---|