* Department of Dairy Science,
Department of Biostatistics and Medical Informatics, and
School of Veterinary Medicine University of Wisconsin, Madison 53706
1 Corresponding author: kweigel{at}wisc.edu
| ABSTRACT |
|---|
Key Words: machine learning, alternating decision tree, fertility, dairy cattle
| INTRODUCTION |
|---|
Numerous measures of reproductive performance have been proposed, including days to first insemination, first-service conception rate, days open, calving interval, services per conception, and 21-d pregnancy rate, among others. Each offers advantages and disadvantages, but a critical factor is the robustness of such variables to differences within and between herds in the duration of the voluntary waiting period (VWP) and use of hormonal products for ovulation synchronization (Pursley et al., 1997). Pregnancy status at 150 DIM (i.e., pregnant or nonpregnant) is a composite trait that encompasses estrus-detection efficiency and breeding efficiency and is relatively robust to heterogeneity in the extent of hormonal synchronization and duration of the VWP.
The field of machine learning offers many flexible algorithms that are suitable for analysis of large, complex data sets. Conventional statistical methods, such as regression or ANOVA, require the assumption of a specific parametric function (e.g., linear, quadratic, etc.), and large quantities of data must be discarded if one or more explanatory variables are missing. Machine learning algorithms, on the other hand, can accommodate intricate dependencies among explanatory variables and can function effectively in the presence of missing values for some variables. Therefore, the application of such algorithms for analysis of herd management and performance data or computerized decision making on commercial dairy farms seems very promising (Pietersma et al., 1998).
The objective of this study was to use one type of machine learning algorithm, the alternating decision tree, to identify factors that affect the first-service conception rate or pregnancy status at 150 DIM in high-producing Holstein cows on large dairy farms, using a comprehensive data set with information regarding management, housing, labor, nutrition, genetics, and climate for specific farms and individual cows on these farms.
| MATERIALS AND METHODS |
|---|
Information regarding potential herd-specific and cow-specific explanatory variables was compiled from numerous sources. Test-day milk yields were collected from the DHI milk-recording program and were provided by a dairy records processing center or extracted directly from on-farm DHI-Plus (DHI, Provo, UT), Dairy Comp 305 (Valley Ag Software, Tulare, CA), or PCDART (Dairy Records Management Services, Raleigh, NC) herd management software, as were data regarding insemination dates and pregnancy examination dates and outcomes. Information regarding the maximum daily ambient temperature on the day of insemination was obtained (via the National Climatic Data Center) from weather stations located within 40 km of these farms, and a single trained evaluator assessed BCS for cows in 63 herds in the Upper Midwest from May to August, 2004. Last, a comprehensive survey was completed by the managers of 103 farms, with the help of Alta Genetics Advantage consultants. This lengthy survey included questions regarding 10 aspects of each dairy operation, including general management issues, sire selection, reproductive management, inseminator training and techniques, heat abatement devices, body condition scoring, facility design and pen movement, nutrition, employee training and management, and cow health and biosecurity. A detailed summary of these data, including means, frequencies, and other descriptive statistics is provided in Caraviello et al. (2006).
The first dependent variable considered in this study was first-service conception rate. In constructing the data set for this variable, only lactation records with at least 1 service were considered. Milk yield was required for a lactation to be included in the analysis, and the test-day milk weight nearest the date of AI was identified for each lactation record. Lactation records that had no reported outcome (i.e., those that lacked a repeat breeding, a negative pregnancy diagnosis, or a positive pregnancy diagnosis) within 75 d after the first service were excluded. A minimum of 14 d was required between consecutive inseminations of the same cow. After editing, the data set for the first-service conception rate contained 31,076 lactation records from 14,804 cows, along with a total of 317 potential explanatory variables.
The second dependent variable considered in this study was pregnancy status at 150 DIM (i.e., pregnant vs. nonpregnant). This variable is a composite indicator of estrus-detection efficiency (i.e., service rate) and breeding efficiency (i.e., conception rate). Because it is not measured until midlactation, this variable is relatively robust to differences between herds in duration of the VWP, as well as to differences between herds in the method and extent of hormonal synchronization before first service, resynchronization of later services, or treatment of noncycling cows. In constructing the data set for pregnancy status at 150 DIM, each cow was required to have at least 1 AI in a given lactation, or to have passed 150 DIM without being inseminated. After editing, the data set for pregnancy status at 150 DIM had 17,587 lactation records from 9,516 cows, as well as 341 potential explanatory variables.
Because some categorical explanatory variables did not apply to every farm that responded to the survey, a unique value designating "nonapplicable" was assigned to these responses. Other missing or incomplete responses that did not fit into the nonapplicable category were kept as missing values.
Methodology
After evaluating several machine learning algorithms, including decision trees, Bayesian networks, and instance-based algorithms in a related study, Caraviello (2005) determined that the approach most readily applicable to this study was the alternating decision tree algorithm (Freund and Mason, 1999). Decision trees are hierarchical sets of "if-then" tests (known as nodes) that are used to classify records (or instances) into their most likely outcomes based on various explanatory variables (or attributes). The alternating decision tree is a particular type of tree that relies on a boosting algorithm to improve performance (Freund and Schapire, 1996), and is computationally less demanding than many modern general-purpose machine learning classifiers for large, complex data sets. As compared with other algorithms, the alternating decision tree algorithm tends to build smaller trees that can be more readily interpreted by the end user.
The alternating decision tree algorithm is an iterative procedure in which potential decision nodes corresponding to specific explanatory variables are evaluated at various "cut points" and then added to the decision tree sequentially. Cut points for binary variables may correspond to the presence or absence of a given characteristic (e.g., use of sprinklers for heat abatement), whereas cut points for categorical variables may represent logical combinations of levels (e.g., purchased cows from a cattle dealer, auction barn, or via private sale), and cut points for continuous variables may represent key thresholds (e.g., durations of VWP <60 vs. ≥60 d). At each iteration, the cut point for a given variable is chosen such that the number of correctly classified records in the training set is maximized while the number of incorrectly classified records in the training set is minimized. Furthermore, "heavy branches" of the tree (i.e., branches or paths that are reached by a large number of records) are expanded preferentially, and this ensures that effort is not wasted in attempting to discriminate among a few unique, outlier records.
Each individual record flows through the decision tree to determine its classification, and decision nodes within the tree determine the various paths (i.e., branches) to be followed. When the record reaches a particular decision node, it follows the path of the "child" corresponding to the outcome of the "if-then" decision rule. A given node may or may not be reached by every record (i.e., if a particular variable is missing or not applicable, or if a condition present at an ancestral node is not satisfied). Therefore, there is no need to discard records with one or more missing explanatory variables, and this can avoid a tremendous loss of data. Nodes are numbered, indicating the order in which explanatory variables were added to the tree (the first variables added tend to be those that provide the highest proportion of correctly classified records). Nodes at the first level of the tree are independent, but the significance of nodes at the second and deeper levels depends on decisions obtained at "ancestral" nodes at higher levels within the tree. In this manner, decision trees can account for complex dependencies and interactions among explanatory variables.
The alternating decision tree algorithm also associates a prediction value with each node, and this allows each node to be interpreted independently from other nodes. More important, the value associated with each prediction node encountered along the path taken by a given record is added to that record's final summation, and the final summation subsequently determines the classification status of the record (pregnant vs. nonpregnant in this case). If the assumed threshold is zero, as is normally the case for a binary outcome such as pregnancy status, the classification of a record will be positive (i.e., pregnant) if the final sum is greater than zero and will be negative (i.e., nonpregnant) if the final sum is less than zero. Each record follows multiple paths as it goes through the tree, so the lack of 1 explanatory variable does not prevent classification, because only the values of nodes that can be reached will contribute to the final sum. Furthermore, because each node provides only a small contribution, it is unlikely that a missing value will have a major impact on the final sum, especially if it corresponds to a node that has few descendants or a node that is far from the root of the tree.
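The summation described above can be illustrated with a minimal sketch. The tree below is entirely hypothetical (attribute names, cut points, and prediction values are invented for illustration and are not taken from the trees in this study), and the sketch assumes numeric attributes with a "less than cut point" branch ("a") and an "at or above cut point" branch ("b").

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


@dataclass
class PredictionNode:
    value: float                      # contribution added to the final sum
    children: List["DecisionNode"] = field(default_factory=list)


@dataclass
class DecisionNode:
    attribute: str                    # explanatory variable tested here
    cut_point: float
    below: PredictionNode             # prediction node "a": value < cut point
    at_or_above: PredictionNode       # prediction node "b": value >= cut point


def final_sum(node: PredictionNode, record: Record) -> float:
    """Add up the prediction values on every path the record can reach."""
    total = node.value
    for decision in node.children:
        x = record.get(decision.attribute)
        if x is None:
            # Missing value: this decision node and its descendants
            # are simply not reached and contribute nothing.
            continue
        branch = decision.below if x < decision.cut_point else decision.at_or_above
        total += final_sum(branch, record)
    return total


def classify(total: float, threshold: float = 0.0) -> str:
    return "pregnant" if total > threshold else "nonpregnant"


# Hypothetical tree: two first-level decision nodes, one nested node.
example_tree = PredictionNode(-0.5, [
    DecisionNode("vwp_days", 41.0,
                 PredictionNode(-0.3),
                 PredictionNode(0.1, [
                     DecisionNode("milk_kg", 41.0,
                                  PredictionNode(0.2),
                                  PredictionNode(-0.1)),
                 ])),
    DecisionNode("max_temp_c", 30.8,
                 PredictionNode(0.05),
                 PredictionNode(-0.2)),
])

cow_x = {"vwp_days": 60.0, "milk_kg": 35.0, "max_temp_c": None}
cow_y = {"vwp_days": None, "milk_kg": 35.0, "max_temp_c": 25.0}
```

In this sketch, `cow_x` skips the temperature node because the value is missing, and `cow_y` never reaches the nested milk-yield node because its ancestral variable is missing; both records are still classified.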
The absolute value of the final sum determines the classification margin, a measure of confidence associated with the classification of a given record. The confidence level for a positively classified record increases as the overall sum increases (i.e., as the sum becomes more positive), and the confidence level for a negatively classified record increases as the overall sum decreases (i.e., as the sum becomes more negative). In some applications, the classification margin is as important as the classification itself. The receiver operating characteristic curve allows one to choose the most appropriate threshold (not necessarily zero) for a specific application, and the area under this curve can be used, along with cross-validation results, as a measure of the overall predictive ability of the algorithm.
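As a rough illustration of these two quantities, the sketch below computes the classification margin as the absolute value of the final sum, and the area under the receiver operating characteristic curve as the proportion of (pregnant, nonpregnant) record pairs in which the pregnant record has the larger final sum. The function names and the example scores are hypothetical; this is not the routine used by the software in this study.

```python
from typing import Sequence


def classification_margin(final_sum: float) -> float:
    """Confidence in a classification: distance of the sum from zero."""
    return abs(final_sum)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = pregnant, 0 = nonpregnant. Ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one record of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical final sums and observed outcomes for four records.
example_scores = [0.9, 0.2, -0.4, -1.5]
example_labels = [1, 0, 1, 0]
example_auc = roc_auc(example_scores, example_labels)
```

An area of 0.5 corresponds to random ordering of records and 1.0 to perfect separation, so the statistic does not depend on any single choice of threshold.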
The Weka package (Witten and Frank, 2000), implemented in Java, was used to develop the alternating decision trees discussed herein. This package provides a flexible framework for comparing algorithms and trees through cross-validation. In the present study, 10-fold cross-validation was used. In each of a series of 10 analyses, 90% of the data constituted the "training set," which was used to build the decision tree, and the remaining 10% constituted the "testing set" (also known as the "prediction set"), which was used to evaluate the predictive ability of the resulting tree.
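The partitioning used in 10-fold cross-validation can be sketched as follows. This is a plain random split written for illustration, with a hypothetical function name and seed; it does not stratify by outcome and is not the package's own implementation.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_records: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training, testing) index lists for each of k analyses."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        testing = folds[i]
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, testing


# With 100 records, each analysis trains on 90 and tests on 10.
example_splits = list(k_fold_indices(100, k=10))
```

Every record appears in exactly one testing set, so the pooled predictions across the 10 analyses cover the whole data set once.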
To aid interpretation of the decision trees developed herein, the records of 2 example cows were classified using the decision tree for first-service conception rate. Furthermore, the predicted probability of pregnancy by 150 DIM was computed by using logistic regression for each continuous variable (LOGIST procedure; SAS Inst. Inc., Cary, NC) or a generalized linear model for each discrete variable (CATMOD procedure; SAS Inst., Inc.) that was included in the corresponding decision tree. The purpose of the logistic regression analysis was to aid interpretation of the effects of variables chosen by the decision tree. Many of these analyses corresponded to decision nodes that were nested within (i.e., were children of) prediction nodes located at higher levels of the tree, and the probabilities were computed using only those records that reached the corresponding decision node (i.e., records for which variables at ancestor nodes satisfied the conditions of the corresponding cut points).
| RESULTS AND DISCUSSION |
|---|
ln[0.252/(1 − 0.252)], was −0.545, indicating a greater chance for failure than conception at first service.
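The reported root value is consistent with one-half of the log-odds of conception, which is the usual form of the root prediction value in the alternating decision tree algorithm. The one-half factor is an assumption here, because the beginning of the sentence above is not available. A short check under that assumption:

```python
import math


def root_prediction_value(positive_fraction: float) -> float:
    """Half the log-odds of the positive class (assumed root formula)."""
    return 0.5 * math.log(positive_fraction / (1.0 - positive_fraction))


# Average first-service conception rate of 25.2%.
root_value = root_prediction_value(0.252)
```

With a conception rate of 25.2%, this gives approximately −0.544, close to the reported −0.545; the small difference presumably reflects rounding of the conception rate.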
As shown in Table 1, a total of 4,048 insemination records were from cows residing in herds for which maintenance hoof trimming occurred less than once annually (i.e., response category 1). At decision node 1, prediction node "a" assigns a value of −0.597 to records pertaining to category 1 (hoof trimming less than once annually), indicating a reduced probability of conception at first service. The path below prediction node "b" for this variable is followed by all other records that do not have missing values for frequency of maintenance hoof trimming. Lameness is a common problem on free-stall operations in North America (Cook, 2003), and it has been shown to delay the onset of ovarian activity (Garbarino et al., 2004) and depress the conception rate (Collick et al., 1989). Regular hoof trimming has been shown to be a critical component of lameness prevention in dairy herds (Manske et al., 2002).
Several farms move their cows out of the close-up dry cow pen later than 1.5 d before calving, as shown by decision node 2 in Table 1 (mean entry and exit times for this pen were 20 and 3 d prepartum, respectively). Cows on these farms had a lesser probability of conception at first service, as indicated by a value of −0.602 at prediction node "b." A possible explanation is the relationship between housing or bedding and exposure to pathogens near parturition. Several studies have emphasized the importance of proper management of peripartum cows, particularly with regard to avoiding postpartum endometritis (Loeffler et al., 1999; LeBlanc et al., 2002; Sheldon et al., 2002). In a very large study, LeBlanc et al. (2002) showed that cows having postpartum endometritis had a lesser mean conception rate at first service (29.8%) than healthy cows (37.9%). The corresponding relative risk of pregnancy in infected vs. healthy cows (risk of 1.00), after correcting for environmental factors, was 0.69. In addition, Sheldon et al. (2002) reported a decline from 35 to 5% in dominant follicle selection on the ipsilateral ovary, with respect to the previously gravid uterine horn, in cows with elevated early postpartum uterine bacterial counts. Bacterial contamination also was shown to inhibit the growth and function of the dominant follicle, leading to a smaller, slower-developing dominant follicle and a corresponding reduction in the production of estradiol. Loeffler et al. (1999) emphasized the importance of maintaining high peripartum feed intake, as well as the importance of avoiding displaced abomasum, cystic ovaries, metritis, mastitis, or lameness within 30 d of an insemination.
Decision node 3, which is based on the weight of milk shipped per cow per day, had a very low cut point (shown in Table 1); only 457 records corresponded to cows on farms that shipped less than 30 kg/cow per d. Therefore, very few records were affected by the prediction node "a" value of 0.813. Decision nodes 9 (number of inseminations until a culling decision is made), 10 (proportion of heifers sent to a custom grower), 12 (difficulty in keeping good farm employees), 16 (kg of milk shipped/cow per d), 20 (DMI for nonpregnant cows), and 22 (bunk space/cow in the breeding pens) each had relatively extreme cut points as well, such that only 505, 245, 488, 273, 569, and 352 records, respectively, reached one of the corresponding prediction nodes. Therefore, despite being important contributors to the predictive ability of this tree, these nodes provide little insight into possible relationships between the corresponding explanatory variables and reproductive performance, and one should not attempt to draw conclusions based on data from 1 or 2 specific farms. This illustrates a common challenge with the interpretation of decision trees: nodes that are reached by only a few highly influential records tend to hamper user interpretation. Preferential expansion of heavy branches in the tree tends to limit the generation of these nodes with extreme cut points to the cases in which very significant gains in predictive ability are obtained.
Decision node 4 corresponds to the type of bedding material in the far-off dry cow pen, which had mean entry and exit times of 53 and 21 d prepartum, respectively. As shown in Table 1, a total of 1,783 cows were located on farms that used sawdust, and as indicated by the value of −0.544 at prediction node "a," these cows had a much smaller probability of conception at first service than cows housed on sand or mattresses. This result most likely reflects a tendency for a greater incidence of disorders such as metritis or mastitis on farms that use organic bedding, because such infections subsequently impair reproductive performance (Barker et al., 1998). A large proportion of clinical mastitis cases become apparent during the first 2 wk of lactation (Goff and Horst, 1997), and the rate of infection during this period is heavily influenced by dry cow management practices. Zdanowicz et al. (2004) reported that sawdust bedding contains more bacteria than sand bedding, and that cows housed on sawdust had twice as many coliform bacteria and 6 times as many Klebsiella bacteria on the teat ends as cows housed on sand.
As indicated by decision node 5, herds in which a veterinarian was responsible for formulating the ration had poorer first-service conception rates, with a prediction node "a" value of −0.517. This prediction node was reached by only 1,276 cows on 5 farms, however, and one should not draw conclusions about the ability of veterinarians to formulate diets based on such limited data. Decision node 6 indicates a smaller probability of conception, as noted by a prediction node "a" value of −0.29, among cows that were bred in cow-handling chutes, compared with those inseminated using other types of restraints, although the explanation for this relationship is not obvious. Decision node 7 indicates that herds with a VWP of <41 d had a smaller probability of conception (prediction node "a" value of −0.304) than herds with a longer VWP. Note that node 5 was reached only by records from herds that do not use sawdust bedding in the far-off dry cow pen, and that nodes 6 and 7 were reached only by the subset of these herds in which diets were formulated by someone other than a veterinarian.
Decision node 8 indicates that herds with fewer than 35 stalls in the far-off dry cow pen have greater conception rates (prediction node "a" value of 0.273). Decision node 11, which is based on the producer's degree of satisfaction with the health of replacement heifers, illustrates the challenge of proper and informative coding of response categories. Prediction node "a," with a corresponding value of 0.06, indicates a slight advantage in conception rate on farms that are reasonably satisfied (score of 4 on a 5-point scale) with the health of their heifers. It is not obvious why the 45 herds that were extremely satisfied (score of 5) did not realize a similar gain in first-service conception rate. In practice, it may be more useful to treat ordinal variables, such as these, in a continuous rather than a categorical manner.
Decision node 13, with a prediction node "a" value of 0.064, seems to indicate that farms that purchased few (<8) or no replacements in 2003 had a greater probability of conception. This could be taken as an indicator that biosecurity and reproductive performance are related, but such a conclusion seems contradictory to the outcome of decision node 15, which assigns a prediction node "b" value of 0.098 to herds that purchased ≥45 replacements in 2001. Numerous authors have speculated about the impact of biosecurity, herd expansion, and herd size on reproductive performance (Bagnato and Oltenacu, 1994; Stahl et al., 1999; Lucy, 2001). Stahl et al. (1999) noted that an increase in the number of cows is not typically followed by a proportional increase in the number of employees. Pursley et al. (1997) noted an increased interest in hormonal synchronization and timed AI programs among larger herds.
The prediction node "a" value at decision node 14 indicates a greater probability of conception at first service among cows that had daily milk yields of <41 kg on the test date closest to insemination. The impact of milk production on fertility has been discussed in numerous studies (Loeffler et al., 1999; Westwood et al., 2002; Windig et al., 2005). In an association study, Washburn et al. (2002) examined the historical relationship between milk yield and conception rate, noting that average 4% FCM yield increased from 6,400 kg/cow per yr to 7,800 kg/cow per yr from the mid-1970s to the mid-1990s, whereas the average conception rate declined from 55 to 35% during the same period. Others (Bagnato and Oltenacu, 1994; Loeffler et al., 1999; Windig et al., 2005) reported that the number of inseminations and the conception rate at first service were antagonistically correlated with milk production. Gröhn and Rajala-Schultz (2000) reported an odds ratio of 0.92 for conception in cows producing >2,541 kg cumulative milk yield during the first 60 DIM, compared with cows with a lesser cumulative yield.
Decision nodes 17 (depth of concrete grooves in the breeding pen), 18 (number of stalls in the close-up dry cow pen), and 19 (number of cows in the close-up dry cow pen) appear in the first level of the tree (i.e., they are not nested within any ancestral nodes) and are related to facilities and housing. Results seem to indicate a greater probability of conception in herds with flooring grooves <2.2 cm deep in the breeding pen, and herds with ≥17 stalls or <9 cows in the close-up dry cow pen.
Last, decision node 21 was based on the maximum temperature at the nearest weather station on the day of AI. The prediction node "a" value of 0.022 indicates greater probability of conception at temperatures <30.8°C. Several authors (Wilson et al., 1998; Ravagnolo and Misztal, 2002) have addressed the impact of heat stress on reproductive performance, and have noted that the temperature-humidity index (THI) on the day of AI had the largest effect on the 45-d nonreturn rate, followed by the THI 2 d before, 5 d before, and 5 d after AI. Ravagnolo and Misztal (2002) also confirmed a steep reduction in the 45-d nonreturn rate (about 0.5% per unit increase in THI) for THI values >68 and noted that primiparous cows were slightly more susceptible to increases in the THI than multiparous cows. Although the THI is readily available from weather stations, these values may not accurately reflect heat stress on individual farms, because the use of fans and sprinklers can greatly reduce the correlation between THI within the barn and THI at the nearest weather station.
Table 2 illustrates the methodology for predicting the outcome of first service for 2 example cows using the alternating decision tree in Figure 1. As noted earlier, the root node is negative (−0.545), because the average first-service conception rate was 25.2%. Both cows receive a contribution of 0.061 to the final sum from node 1, because neither comes from a herd that practices maintenance hoof trimming less than once annually. Based on differences in the time at which cows exit from the close-up dry cow pen on the corresponding farms, cow 1 gets a contribution of 0.147 from node 2, whereas cow 2 gets a contribution of −0.602. Both cows come from farms that ship ≥30 kg of milk/cow per d, so both receive a prediction value of 0.021 from node 3. Likewise, neither farm uses sawdust to bed the far-off dry cow pen or a veterinarian to formulate the diet, so both receive contributions of 0.009 and 0.018 from nodes 4 and 5, respectively. Because cow 1 comes from a farm that uses a cow-handling chute, she receives a prediction value of −0.29 from node 6, but she does not reach decision node 7, because she does not satisfy the necessary condition (category = 3) at the ancestral prediction node 6. Cow 2, on the other hand, gets contributions to the final sum of 0.025 from nodes 6 and 7. This process continues as each record passes through every branch and every node in the decision tree, with the exception of nodes that cannot be reached (e.g., nodes 12, 16, 21, and 22 for cow 2) and nodes corresponding to missing data points (e.g., node 18 for cow 1). Next, the prediction values are summed across nodes for each cow. Because a threshold of zero was used, the final sums of 0.131 and −2.251 lead to classifications (i.e., predicted outcomes) of "pregnant" for cow 1 and "nonpregnant" for cow 2. Note that the absolute value of the final sum is much greater for cow 2 (2.251) than for cow 1 (0.131), indicating a larger classification margin for cow 2 and a greater degree of confidence in her predicted nonpregnant outcome (compared with the predicted pregnant outcome for cow 1).
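The first steps of this worked example can be retraced as a running sum. The sketch below uses only the contributions quoted in the paragraph above, with negative signs assumed wherever the text describes a reduced probability of conception (root node, late exit from the close-up pen, cow-handling chute). Contributions from the remaining nodes are not listed in the text, so only partial sums are computed; the final sums are those reported.

```python
# Contributions quoted in the text (signs are assumptions where the
# text describes a reduced probability of conception).
ROOT = -0.545

cow1_contributions = {
    "node 1 (hoof trimming)": 0.061,
    "node 2 (close-up pen exit)": 0.147,
    "node 3 (milk shipped)": 0.021,
    "node 4 (bedding)": 0.009,
    "node 5 (ration formulation)": 0.018,
    "node 6 (restraint for AI)": -0.29,
}
cow2_contributions = {
    "node 1 (hoof trimming)": 0.061,
    "node 2 (close-up pen exit)": -0.602,
    "node 3 (milk shipped)": 0.021,
    "node 4 (bedding)": 0.009,
    "node 5 (ration formulation)": 0.018,
}

# Partial sums through node 6 (cow 1) and node 5 (cow 2).
cow1_partial = ROOT + sum(cow1_contributions.values())
cow2_partial = ROOT + sum(cow2_contributions.values())


def classify(final_sum: float, threshold: float = 0.0) -> str:
    return "pregnant" if final_sum > threshold else "nonpregnant"


# Final sums as reported after all reachable nodes are added.
reported_final = {"cow 1": 0.131, "cow 2": -2.251}
outcomes = {cow: classify(s) for cow, s in reported_final.items()}
margins = {cow: abs(s) for cow, s in reported_final.items()}
```

The partial sum for cow 1 is still negative after node 6, so her predicted pregnant outcome depends on positive contributions from nodes further along in the tree, which is consistent with her small classification margin.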
36 cm of bunk space per cow), corresponded to semen-handling techniques, specifically the temperature at which straws of frozen semen were thawed before AI. The alternating decision tree algorithm selected a temperature of 34.7°C as the cut point, and among herds that had >36 cm of bunk space per cow, the subset of herds that routinely thawed semen at temperatures greater than 34.7°C had a lesser probability of pregnancy by 150 DIM (prediction node "b" value of −0.957). The relationship between semen thawing temperature and probability of pregnancy decreased curvilinearly (Figure 3
34.7°C) provided information about semen thawing temperature in the herd management survey. Others (Kaproth et al., 2002) have addressed the importance of thawing temperature and technique on subsequent conception rates.
Node 3, which was also a child of prediction node "b" at node 1, corresponded to the percentage of BCS faults in each herd. A fault was assigned for each individual cow whose BCS was below a predetermined threshold for a given number of days postpartum at the time of measurement (Caraviello, 2005). These BCS thresholds were 3.0, 2.5, and 2.75 for cows that were −60 to 30, 31 to 180, and >180 d postpartum, respectively. Among the 63 herds in the Upper Midwest that were evaluated for BCS from May to August, 2004, herds in which ≥54.5% of cows had BCS faults generated a prediction node "b" value of −0.288, indicating a smaller probability of pregnancy by 150 DIM. Figure 3 also shows a negative relationship between the percentage of BCS faults and the probability of pregnancy by 150 DIM, especially when the percentage of BCS faults is large. Although the actual BCS of a cow being inseminated would be more informative than the percentage of BCS faults in the herd in which she resides, the latter can be scored quickly and efficiently in a single (perhaps annual) visit to the farm. Studies (Royal et al., 2002) have evaluated the impact of BCS on fertility, and it is well known that cows with lesser BCS in early lactation or that have larger BCS changes during lactation have impaired reproductive performance. Larger BCS are genetically correlated with improved reproductive performance (Dechow et al., 2001), whereas smaller BCS are associated with increased time from parturition to onset of ovarian activity (Royal et al., 2002).
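The herd-level percentage of BCS faults described above can be sketched as follows. The function names and the example herd are hypothetical, and two details are assumptions: the first stage is read as 60 d prepartum to 30 d postpartum, and cows scored earlier than that are left out of the calculation.

```python
from typing import List, Optional, Tuple


def bcs_threshold(days_postpartum: float) -> Optional[float]:
    """Minimum acceptable BCS for a cow's stage (None if out of range)."""
    if days_postpartum < -60:
        return None          # assumed: not scored for faults
    if days_postpartum <= 30:
        return 3.0
    if days_postpartum <= 180:
        return 2.5
    return 2.75


def herd_fault_percentage(cows: List[Tuple[float, float]]) -> float:
    """cows: (BCS, days postpartum) pairs from a single herd visit."""
    scored = [(bcs, bcs_threshold(d)) for bcs, d in cows]
    scored = [(bcs, t) for bcs, t in scored if t is not None]
    if not scored:
        raise ValueError("no cows within the scored range")
    faults = sum(1 for bcs, t in scored if bcs < t)
    return 100.0 * faults / len(scored)


# Hypothetical herd of 5 cows: (BCS, days postpartum).
example_herd = [(2.75, 10), (3.25, 20), (2.5, 100), (2.25, 150), (2.5, 200)]
example_pct = herd_fault_percentage(example_herd)
reaches_node_b = example_pct >= 54.5   # cut point reported in the text
```

In this made-up herd, 3 of 5 cows fall below their stage-specific threshold, so the herd would follow the branch associated with a smaller probability of pregnancy by 150 DIM.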
Node 4, which reflected the maximum number of cows per maternity pen (mean entry and exit times for maternity pens were 2 d prepartum and 1 d postpartum, respectively), was also a child of prediction node "b" at node 1. The value of prediction node "b" at node 4, −0.841, indicates a smaller probability of pregnancy by 150 DIM for cows in herds that allowed >22.5 cows per maternity pen, but only 2 herds in this branch of the tree (428 total cows) allowed such high stocking of maternity pens. As shown in Figure 3, the probability of pregnancy by 150 DIM appeared to decrease more or less linearly as the maximum number of cows per maternity pen increased from roughly 10 to 25. Weigel et al. (2003) reported an association between involuntary culling rates and the presence or absence of individual maternity pens. Grant and Albright (2001) noted that overcrowding, particularly in the absence of sufficient bunk space, can affect cow behavior and feed intake.
Node 5 identified the number of failed services after which a cow is moved to the bull pen for natural service (clean-up) mating as a potential indicator of reproductive efficiency. The cut point for this variable was 5.5 inseminations, and as shown in Figure 3, the probability of pregnancy by 150 DIM seemed to decrease as the number of failed services increased. It is important to note, however, that this variable was only applicable to herds that routinely used a clean-up bull, and herds that used 100% AI did not reach this decision node (the data included 6 herds with <5.5 services and 11 herds with ≥5.5 services). Also note that both prediction nodes were negative, which indicates poorer reproductive performance in herds that use clean-up bulls, compared with herds using 100% AI. Overton (2005) reported that natural service mating costs approximately $10 more per cow than an AI program. De Vries et al. (2005b) reported slightly greater pregnancy rates among natural service and mixed AI and natural service herds, compared with 100% AI herds, during the summer, although pregnancy rates for all 3 groups were similar during the winter. Furthermore, natural service and mixed herds sacrificed 1,333 and 337 kg of rolling herd average milk, respectively. Smith et al. (2004) noted that herds that used a combination of AI and natural service had longer calving intervals and more days dry than herds that used 100% AI.
Nodes 6, 9, and 15 were all functions of daily milk yield of a specific cow on (or near) the date of first insemination. Because all 3 of these decision nodes were located at the main level of the tree (i.e., they were children of the root node), the nodes were combined into 1 logistic regression (Figure 3). Furthermore, because of the tendency of the decision tree algorithm to isolate individual cows with extreme daily milk yields, this variable was categorized into 6 levels. The cut points for decision nodes 6 and 9 were <30.8 and >51.3 kg/d, respectively, whereas decision node 15 isolated the category of cows with daily milk yield between 35.9 and 40.3 kg/d. As indicated by the negative value for prediction node "a" at decision node 6, cows with daily production <30.8 kg/d seemed to have a smaller probability of pregnancy by 150 DIM than their contemporaries, and the poorer production of these cows might indicate the presence of clinical or subclinical illness. On the other hand, the prediction node "b" value at decision node 9 was also negative, indicating a reduced probability of pregnancy by 150 DIM for cows with daily milk yield >51.3 kg/d. As shown in Figure 3, the greatest probability of pregnancy occurred among cows milking from 35.9 to 40.3 kg/d, and this category was isolated by decision node 15. Nodes 17 and 22 also differentiated cows based on milk yield at first AI, but these decision nodes were located deep within the tree, as children of node 16 and nodes 16 and 21, respectively. The flexibility of the alternating decision tree algorithm is demonstrated by its ability to include a single explanatory variable, such as milk yield at first service, multiple times at different levels of the tree and with different cut points. Westwood et al. (2002) observed that cows producing more than 38 kg/d were 2.6 times more likely to ovulate later in lactation than cows producing less than 29 kg/d. They also observed that cows whose first ovulation occurred after 53 DIM had a 1.6-fold greater risk of an increased interval from calving to first breeding when compared with cows that ovulated before 21 DIM. Similarly, Weigel (2004) showed that primiparous cows with milk production >36 kg/d and multiparous cows with milk production >45 kg/d had 1.8 and 1.6% lower conception rates, respectively, when compared with other cows of the same age.
Node 7, which reflects the type of manure removal system, was also a child of prediction node "b" at decision node 1. As indicated by the negative prediction node value in Table 3, as well as the relationships shown in Figure 4, herds that used a slatted floor for manure removal appeared to have fewer pregnant cows by 150 DIM. It is important to note that very few herds were represented in this category. Bewley et al. (2001) reviewed the costs and benefits of various barn designs, manure removal systems, and related management issues.
Nodes 10 and 12 reflected the type of maternity housing provided for cows. Node 10 was a child of node 1 (bunk space per cow) and node 12 was a child of node 11 (amount of daily milk yield at which nonpregnant cows are destined to be culled). Node 10 indicated that herds with a bedded pack had greater reproductive efficiency, compared with herds that used 2-row head-to-head free stalls or other housing systems. Node 12 further isolated herds in the "other" category (i.e., not a bedded pack or 2-row head-to-head free stalls) as having smaller percentages of pregnant cows by 150 DIM. Relationships corresponding to these nodes are similar, as shown in Figure 4, although the effect of the "other" category was not estimable in the generalized linear model analysis because few herds (and records) reached prediction node "a" of decision node 12.
Decision node 11 was a function of the amount of daily milk production at which each herd manager decides to cull nonpregnant cows. Although the value of 0.672 at prediction node "a" reflected data from only 1 herd, Figure 4 shows that the percentage of cows pregnant by 150 DIM appeared to be greater in herds that had more stringent culling criteria for poorer-producing, nonpregnant cows. This is not surprising, because herds with greater overall reproductive efficiency tend to have shorter average DIM and a larger supply of replacement heifers, so they can afford to have more stringent culling criteria.
Node 13, which was based on the time of day when cows are typically inseminated, indicated a slight advantage in reproductive efficiency among herds (n = 22) that bred cows before 7:30 a.m. compared with herds (n = 32) that bred cows later in the day. As indicated by the flat logistic regression curve (Figure 4), the cut point of 7:30 a.m. assigned by the alternating decision tree algorithm may reflect differences in other management or environmental characteristics that were confounded with breeding time, rather than a direct effect of this variable.
Decision node 14, which reflects the presence or absence of cooling fans, must be interpreted cautiously. The prediction node "a" value of 0.155 seems to imply an advantage in reproductive efficiency among herds that do not have fans. A more likely interpretation is that the percentage of cows pregnant by 150 DIM is greater in herds located in cool climates, which in turn do not need cooling fans.
Node 16 was based on the type of barn used to house close-up dry cows. Herds that used a 2- or 3-row free-stall barn or a bedded pack appeared to have greater reproductive efficiency than herds that provided some other type of housing system, although relatively few herds were represented in the latter category. As shown in Figure 5, the probability of pregnancy by 150 DIM was greatest in herds with a 2-row head-to-head free-stall barn or a bedded pack.
Node 20, which was a function of maximum outside temperature on the day of AI at the nearest weather station, had an extremely low cut point of 1.4°C. Figure 5 indicates a weak relationship between weather station temperatures and the probability of pregnancy by 150 DIM. The temperature or THI at a nearby weather station may be a poor indicator of actual conditions within the barn, because the latter can be influenced quite heavily by sprinklers, cooling fans, and other heat abatement devices. Numerous studies have documented the detrimental effects of heat stress on fertility in lactating dairy cows, including decreased feed intake, reduced physical activity, diminished expression of estrus, compromised conception rates, and increased embryonic deaths (Sartori et al., 2002). Heat-stressed cows have a reduced proestrous rise in estrogen, smaller size of the second-wave dominant follicle, greater numbers of follicular waves per estrous cycle, and longer luteal phases (Wilson et al., 1998).
Ability to keep good employees, which was scored on a 5-point ordinal scale, was selected as node 21. Although only 2 herds offered survey replies of "1," indicating that keeping good employees was very difficult, these herds had significantly impaired reproductive efficiency (prediction node "a" value of 0.782). The generalized linear model solution corresponding to category 1 was not estimable because of confounding and lack of information. Employee training was previously identified by Sanders (2005) as an essential component of the reproductive management program on commercial dairy farms.
Node 23, which was a child of nodes 16 (type of housing for close-up dry cows), 21 (ability to keep good employees), and 22 (milk yield at first insemination), reflected the postpartum waiting period before administration of bST. The prediction node "a" value of 0.36, corresponding to herds that initiated biweekly bST injections before 58.5 DIM, seems to contradict the logistic regression in Figure 6. It is important to note that relatively few cows (346 cows in 2 herds) reached this prediction node, as the cut point of 58.5 d was extremely low. Santos et al. (2004) reported a positive impact of bST on first-service conception rate and reduced early embryonic loss. Silvia et al. (2002) noted that the time of onset of bST did not affect the percentages of multiparous cows pregnant by 150, 200, or 250 DIM, although later supplementation of bST led to slight increases in the percentages of primiparous cows pregnant at 200 and 250 DIM. Judge et al. (1999) noted that administration of bST at 9 wk postpartum had no effect on the percentage of cows remaining nonpregnant at 150 DIM.
98.5 cows per pen had a smaller percentage of cows pregnant by 150 DIM than 53 herds with fewer cows per pen, as indicated by the prediction node "b" value of 0.736. Grant and Albright (2001) noted that large groups (>200 cows) are not a problem per se, but overcrowding of pens can reduce feeding activity, alter resting behavior, and decrease rumination.
Node 25, which was nested within nodes 16 (type of barn used to house close-up dry cows), 21 (ability to keep good employees), and 22 (milk yield at first insemination), was a function of the average number of cows in the close-up dry cow pen. The prediction node "a" value of 0.114 seems to contradict the logistic regression shown in Figure 6, although it is once again important to recognize that this decision node was nested deep within the tree.
Despite the ability of the alternating decision tree algorithm to provide relatively readable trees, some of the interactions modeled by this algorithm are complex and cannot be explained readily. When evaluating individual nodes located at deeper levels within the tree, or nodes that entered the tree at later iterations, it is important to recognize that weights corresponding to the records at these nodes may be different from 1, and that several complex interactions (especially those involving ancestral nodes) may influence the values of specific prediction nodes. In this study, the number of nodes that maximized the predictive ability of the decision trees for first-service conception rate and pregnancy status at 150 DIM was 22 and 25, respectively, and this was determined via 10-fold cross-validation in numerous preliminary analyses (Caraviello, 2005). On the other hand, it is likely that a number of larger or smaller decision trees with similar or only slightly poorer predictive ability could have been constructed. Furthermore, it is likely that alternative, correlated variables (either variables that were not measured in this study, or variables that were included in this study but were not selected for the decision trees) could have served as suitable substitutes for the variables identified in the decision trees presented herein. Therefore, decisions regarding the selection of variables to include in decision trees in practice should consider not only the predictive ability of each variable, but also the cost of recording each variable in an experimental or commercial setting.
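The selection of tree size by 10-fold cross-validation can be sketched as follows. This is a generic Python outline under stated assumptions, not the procedure used in the preliminary analyses: the function names are hypothetical, records are assigned to folds at random without regard to herd, and `fit_and_score` stands in for whatever routine trains a tree of a given size on the training records and returns its predictive ability on the held-out records.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple


def k_fold_indices(n_records: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle record indices and deal them into k roughly equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def choose_tree_size(
    n_records: int,
    candidate_sizes: Sequence[int],
    fit_and_score: Callable[[List[int], List[int], int], float],
    k: int = 10,
    seed: int = 0,
) -> Tuple[int, Dict[int, float]]:
    """Return the candidate size with the best mean held-out score.

    fit_and_score(train_idx, test_idx, size) is a user-supplied
    (hypothetical) callback that fits a tree with `size` decision nodes
    on train_idx and returns a score, larger being better, on test_idx.
    """
    folds = k_fold_indices(n_records, k, seed)
    mean_scores: Dict[int, float] = {}
    for size in candidate_sizes:
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(n_records) if i not in held]
            fold_scores.append(fit_and_score(train, held_out, size))
        mean_scores[size] = mean(fold_scores)
    best_size = max(mean_scores, key=mean_scores.get)
    return best_size, mean_scores
```

Because trees of neighboring sizes often have nearly identical mean scores, inspecting the full `mean_scores` dictionary, rather than only the single best size, is consistent with the observation that larger or smaller trees could have had similar predictive ability.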
| CONCLUSIONS |
|---|
Future studies can use similar machine-learning methods to analyze a smaller, more focused set of explanatory variables than the large field data set examined herein. In addition, when the main goal is to understand interactions between explanatory variables instead of building the best possible predictive model, consideration should be given to coding strategies that can minimize the incidence of extreme cut points that tend to isolate records from a few selected cows or herds. Thresholds different from zero have the potential to meet different needs (e.g., in situations in which a false-positive error is more costly than a false-negative error) and should be considered in future studies through further exploring receiver operating characteristic curves and cost-sensitive classification. Furthermore, classification of records into multiple categories based on confidence level (i.e., the classification margin) could prove interesting in field applications; for example, one might be able to identify cows for which a particular management intervention is warranted.
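The ideas of nonzero thresholds, receiver operating characteristic curves, and margin-based categories can be made concrete with a small Python sketch. It assumes that each record already has a summed alternating decision tree score, with positive values favoring the "pregnant" class; the function names, the margin width of 0.5, and the category labels are illustrative assumptions rather than anything specified in this study.

```python
from typing import List, Sequence, Tuple


def classify(score: float, threshold: float = 0.0) -> bool:
    """Predict 'pregnant' when the summed score exceeds the threshold."""
    return score > threshold


def roc_points(
    scores: Sequence[float],
    labels: Sequence[int],
    thresholds: Sequence[float],
) -> List[Tuple[float, float]]:
    """Return (false-positive rate, true-positive rate) for each threshold.

    labels are 1 for pregnant and 0 for nonpregnant; both classes are
    assumed to be present so that neither rate divides by zero.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def confidence_category(score: float, threshold: float = 0.0,
                        margin: float = 0.5) -> str:
    """Split records into three groups by distance from the threshold."""
    if score > threshold + margin:
        return "likely pregnant"
    if score < threshold - margin:
        return "likely nonpregnant"
    return "uncertain"
```

Raising the threshold above zero trades true positives for fewer false positives, which is the relevant direction when a false-positive error is the more costly one; the "uncertain" group is one possible way to flag cows whose classification margin is small.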
The alternating decision tree algorithm can be implemented using public domain software, and because of its computational efficiency, it is suitable for routine field applications by veterinarians, nutritionists, or reproductive consultants using a standard laptop computer. Overall, it seems that advances in computing power, coupled with the availability of efficient, flexible, and user-friendly software, will greatly foster the application of machine learning algorithms in future analyses of reproductive performance and other aspects of dairy herd management.
| ACKNOWLEDGEMENTS |
|---|
Received for publication August 8, 2005. Accepted for publication June 19, 2006.
| REFERENCES |
|---|