JDS
J. Dairy Sci. 2008. 91:2823-2835. doi:10.3168/jds.2007-0946
© 2008 American Dairy Science Association ®

An erratum to this article has been published.

Modeling Nuisance Variables for Prediction of Service Sire Fertility

M. T. Kuhn¹, J. L. Hutchison, and H. D. Norman

Animal Improvement Programs Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350

¹ Corresponding author: Melvin.Kuhn@ars.usda.gov


    ABSTRACT
The purpose of this research was to determine which (available) nuisance variables should be included in a model for phenotypic evaluation of US service sire conception rate (CR), based on field data. Alternative models were compared by splitting data into records for estimation and set-aside records, computing predictions using the estimation data, and then comparing predictions to bulls’ average CR in the set-aside data. Breedings for estimation were from January 1, 2003, to June 30, 2005, and set-aside records spanned July 1, 2005, to June 30, 2006. Only matings with known outcomes were included in either data set. Correlations and mean differences were the main statistics used to compare models. Nuisance variables considered were management groups based on herd-year-season-parity-registry (HYSPR) classes, year-state-month, cow age, DIM, and various combinations of lactation, service number, and milk yield. Preliminary analyses led to 1) selection of standardized lactational milk yield as the production variable for consideration and 2) modeling quantitative independent variables as categorical factors rather than linear and quadratic covariates. Two general strategies for management groups were tested, one where HYSPR groups were required to have an absolute specified minimum number of records and a second where groups were combined across registry, season, and parity subclasses until a minimum group size was achieved. Combining groups to a target size of 20 and including a herd-year into the evaluation provided it had a minimum of 10 breedings maximized correlation with future year CR and was chosen as the management grouping strategy for implementation. Combining groups implied that some groups had multiple seasons as well as parities, which was the reason for consideration of year-state-month and lactation as additional factors. 
The final nuisance variables selected for inclusion in the model for prediction of service sire CR were, in addition to HYSPR, year-state-month, lactation, service number, milk yield, cow age at breeding, an interval between breedings variable to account for lower CR following short estrus cycles, and the cow effect, partitioned as permanent environment and breeding value. This model maximized correlation with future year CR (55.14%), minimized mean square error (3.255), and had a mean difference of essentially 0.

Key Words: bull fertility • conception rate • prediction


    INTRODUCTION
In May 2006, Animal Improvement Programs Laboratory (AIPL) assumed responsibility for phenotypic evaluation of dairy bull fertility in the United States, based on field data collected through DHI. As an initial step, AIPL implemented the estimated relative conception rate evaluation that had been previously computed by Dairy Records Management Systems (Raleigh, NC). The current objective at AIPL is to ascertain how evaluations might be improved through trait definition and statistical modeling.

Previous research (Kuhn et al., 2004) has shown that use of multiple services and use of an "expanded" service sire term both improve accuracy of evaluation, simply by increasing the amount of information used to evaluate each bull. The expanded service sire term involves fitting factors related to bull fertility separately in the model and then computing a bull’s evaluation as the sum of the solutions for each factor. Factors in the expanded service sire term (i.e., contributory or component factors) included bull age, stud-year of insemination, inbreeding of the mating, and the bull’s own inbreeding coefficient; a residual service sire term, which is unique to the individual bull, is included in the model and in the predictor as well. Use of the expanded service sire term does not change the interpretation of the bull fertility evaluation, relative to existing evaluations such as estimated relative conception rate or the Agri-Tech Analytics Service Sire Fertility Summary (described by Weigel, 2004). It is still a phenotypic prediction (evaluation) of the bull’s conception rate (CR), not a genetic evaluation. This is emphasized by the fact that a) heritability of bull CR in AI has been estimated at 0 and b) the factors included in the expanded service sire term are environmental factors. The only reason for using an expanded service sire term is because it improves efficiency of estimation and thus accuracy of evaluation. Age, for example, affects a bull’s fertility level; if 20 young bulls each have 300 matings, then the age effect is, in effect, estimated 20 separate times using only 300 matings each time when age is not explicitly modeled (i.e., when bull only is included in the model). With the expanded service sire term, all 6,000 breedings contribute to the estimate of the age effect and thus that component is estimated more efficiently, and therefore the evaluation is more accurate when that component is added back to the bull’s evaluation. 
Thus, the expanded service sire term is used only to improve accuracy of evaluation, not to change the interpretation of the evaluation.

Research to date, then, has focused primarily on trait definition and on modeling the service sire component, but numerous other factors also affect CR. These nuisance variables need to be accounted for as well, to the extent possible, to maximize accuracy and reduce or eliminate bias in evaluations. The objective of this research was to determine which factors (nuisance variables) to include and how they can best, or at least adequately, be modeled. Factors considered included 1) management group definition based on herd, year, season, parity, and registry status, 2) milk yield, 3) cow age, 4) DIM at breeding, 5) lactation number, 6) service number, 7) an interval between breedings variable to account for lower CR following short cycles, and 8) cow effects, both genetic and permanent environmental. The effects of these factors on CR are generally well documented. Numerous studies (e.g., Gwazdauskas et al., 1975; Hillers et al., 1984; Ron et al., 1984; Taylor et al., 1985; Reurink et al., 1990; Van Doormaal, 1993; Stålhammar et al., 1994; Al-Katanani et al., 1999; Ravagnolo and Misztal, 2002; García-Ispierto et al., 2007) have shown effects of herd, DIM or service number, parity, season, milk yield, and cow age on CR.


    MATERIALS AND METHODS
Data and Methods for Comparison
Alternative models were compared by splitting available data into records for estimation and set-aside data and then comparing evaluations based on the estimation data to bulls’ mean CR in the set-aside data. Estimation data included breedings from January 1, 2003, to June 30, 2005, and set-aside records included breedings from July 1, 2005, to June 30, 2006. Correlations, mean differences, and square roots of mean square errors (standard deviations of differences) between the evaluation and set-aside average CR were used to compare models.
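The three comparison statistics can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function name `compare_to_future` and the input lists (one predicted value and one set-aside mean CR per bull) are hypothetical, and the standard deviation of differences is computed with an n - 1 divisor, which is an assumption.

```python
import math

def compare_to_future(predicted, future):
    """Return (correlation in %, mean difference, SD of differences).

    predicted: one evaluation per bull from the estimation data.
    future:    the same bulls' mean CR in the set-aside data.
    """
    n = len(predicted)
    if n != len(future) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 bulls")
    mean_p = sum(predicted) / n
    mean_f = sum(future) / n
    # Pearson correlation, expressed as a percentage as in the paper.
    sxy = sum((p - mean_p) * (f - mean_f) for p, f in zip(predicted, future))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((f - mean_f) ** 2 for f in future)
    corr = 100.0 * sxy / math.sqrt(sxx * syy)
    # Differences are predicted minus future; their mean assesses bias.
    diffs = [p - f for p, f in zip(predicted, future)]
    mean_diff = sum(diffs) / n
    # Assumption: sample standard deviation (n - 1 divisor).
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return corr, mean_diff, sd_diff
```

Calling `compare_to_future` once per candidate model, on the same set of bulls, gives the figures needed to rank the models.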

Another alternative for splitting data into estimation and set-aside records would have been to split the data in half by herd, but such an approach complicates calculation of the predictor because the predictor contains variables related to time. If data were split by herd so that each half of the data contained several years of records, many bulls would have breedings in each year in the set-aside data, and thus the bull age to use in the predictor is unclear, as is the appropriate stud-year solution. Furthermore, use of a future year as set-aside data is more consistent with the purpose of the predictor; the goal is not to predict CR that occurred simultaneously in time but rather bulls’ future CR.

Only AI breedings of Holstein cows were included in this research (virgin heifers excluded). In addition to excluding natural services, known sexed semen matings were also excluded. A maximum of 7 services per lactation were utilized, and lactations beyond fifth were also excluded. Another edit was to eliminate matings on the same cow that occurred close in time. When 2 matings occurred less than 10 d apart, only the later breeding was kept. Presumably, repeat matings within short time periods are often the result of misdiagnosed heats on the first insemination or perhaps the animal was bred on a timed AI program and was later observed in heat.
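The edit for matings close in time can be illustrated as follows. This is a sketch under stated assumptions: the function name `drop_close_matings` is hypothetical, the input is the list of breeding dates for a single cow, and a chain of several closely spaced matings is resolved by dropping every mating that is followed by another within 10 d, which the text does not spell out.

```python
from datetime import date

def drop_close_matings(breeding_dates, min_gap_days=10):
    """Keep only the later breeding when two occur less than min_gap_days apart."""
    dates = sorted(breeding_dates)
    kept = []
    for i, d in enumerate(dates):
        is_last = i == len(dates) - 1
        # A mating is dropped when the next mating on the same cow follows
        # it by fewer than min_gap_days (assumed handling of chains).
        if is_last or (dates[i + 1] - d).days >= min_gap_days:
            kept.append(d)
    return kept
```

For a cow bred on January 1, January 6, and February 1, the sketch keeps the January 6 and February 1 breedings.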

A minimum of 4,536 kg for standardized (305-d 2x ME) milk yield was imposed, and DIM at breeding was required to be between 30 and 365 d. Only breedings with known outcomes were included; all available information was used to determine outcomes including subsequent calving dates, subsequent breedings, pregnancy exams, termination codes, and do-not-breed designations reported by the farmer. Conflicting information on the outcome of a mating is rare, but if it does occur, the sources of information are listed basically in order of precedence. For example, if a pregnancy exam reported the cow open but a subsequent calving date indicated the breeding was a success, then the mating was coded as a success. Similarly, if a positive pregnancy exam is followed by subsequent breedings, the mating is considered a failure.
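The order of precedence among information sources can be expressed as a simple lookup. The sketch below is illustrative only: the source labels, the function name `resolve_outcome`, and the convention that each source has already been translated into a verdict (`True` for success, `False` for failure, `None` when unavailable) are all assumptions, as the text does not describe how each source is converted into a verdict.

```python
# Sources in the order listed in the text, highest precedence first.
PRECEDENCE = (
    "calving",           # subsequent calving date
    "rebreeding",        # subsequent breedings
    "preg_exam",         # pregnancy exam
    "termination_code",  # termination code
    "do_not_breed",      # do-not-breed designation
)

def resolve_outcome(evidence):
    """Return 1 (success), 0 (failure), or None (unknown outcome).

    evidence maps a source label to True, False, or None.
    """
    for source in PRECEDENCE:
        verdict = evidence.get(source)
        if verdict is not None:
            return 1 if verdict else 0
    return None  # no source available; the breeding would be excluded
```

With this ordering, an open pregnancy exam followed by a calving date consistent with the breeding resolves to a success, and a positive exam followed by a later breeding resolves to a failure, matching the two examples in the text.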

Herd-year CR was required to be between 10 and 90% to eliminate, among other possible anomalies, herds that report only successful breedings. Herds were also required to have a reported breeding on at least 50% of their milking cows, which was done to eliminate herds with scant, and perhaps erroneous, reporting; some herds, for example, may have a milking cow herd size of 200 or more and have only 2 or 3 breedings reported. After edits, there were 3,613,907 breedings for 1,231,184 cows in 13,416 herds distributed across 45 states in the estimation data. In the set-aside data, there were 2,025,884 records, 965,748 cows, and 10,692 herds. Average CR was 30%. There were 14,084 service sires in the estimation data and 10,288 in the set-aside data. Bulls, however, were required to have a minimum of 50 matings for estimation and 100 matings and 30 herds in the future year to be included for comparisons (correlations, etc.), which resulted in 803 bulls for comparison.
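The herd-year and bull requirements amount to threshold checks, sketched below. The function names and arguments are hypothetical, the limits are treated as inclusive (an assumption), and the reporting requirement is applied here to the same herd-year unit as the CR limits, which the text does not state explicitly.

```python
def herd_year_passes(n_success, n_breedings, cows_bred, milking_cows):
    """Check the CR limits (10 to 90%) and the 50% reporting requirement."""
    if n_breedings == 0 or milking_cows == 0:
        return False
    cr = 100.0 * n_success / n_breedings
    reported_share = cows_bred / milking_cows
    return 10.0 <= cr <= 90.0 and reported_share >= 0.5

def bull_in_comparison(est_matings, future_matings, future_herds):
    """Check the minimums for a bull to enter the model comparisons."""
    return est_matings >= 50 and future_matings >= 100 and future_herds >= 30
```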

Management Groups
The basic definition for management group that was considered was herd-year-season-parity-registry status (HYSPR) of breeding, where seasons were 2-mo groups starting with January, parity was first vs. second and later lactations, and registry status had 2 classes: 100% registered or not registered; this is the same management group definition utilized by AIPL for genetic evaluation of production traits (Wiggans et al., 1988). For distinction between parity groups and individual lactations, the term parity will be used throughout this paper to refer to the parity groups just defined (1st vs. 2nd and later lactations), whereas the term lactation will be used to refer to individual lactation numbers (1, 2,...,5).
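As an illustration of the group definition, a breeding can be mapped to its HYSPR class as follows. The function name `hyspr_key`, its arguments, and the registry labels are hypothetical; only the class boundaries come from the text.

```python
def hyspr_key(herd, year, month, lactation, registered):
    """Return the herd-year-season-parity-registry class of a breeding."""
    # Six 2-mo seasons starting with January: Jan-Feb = 1, ..., Nov-Dec = 6.
    season = (month - 1) // 2 + 1
    # Parity group: first lactation vs. second and later lactations.
    parity = 1 if lactation == 1 else 2
    # Registry status: 100% registered or not registered.
    registry = "registered" if registered else "not_registered"
    return (herd, year, season, parity, registry)
```

A breeding in August of a third-lactation, non-registered cow falls in season 4 (July-August) and parity group 2.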

Use of HYSPR resulted in a large number of small groups (Table 1). There were 122,475 groups, for example, that had only 1 record. Two general approaches were investigated to offset small management group size. One was to simply require that all HYSPR groups have an absolute minimum number of records to be included; minimums tested were 3, 5, 10, and 20. The second approach, which is utilized in the US production evaluations, was to combine groups until a specified minimum within a herd-year was obtained; minimums were 3, 5, 10, and 20. Combining groups has the potential to avoid both the loss in accuracy due to small group sizes and loss due to discarding data.


Table 1. Frequency of herd-year-season-parity-registry (HYSPR) group sizes before combining
 
Combining groups was guided partly by the preliminary results shown in Table 2, which were based on a 10% sample of the data and a fixed effects model containing only the main effects of herd, year, season, parity, and registry class. As indicated by the sums of squares, herd accounted for the most variation in CR. Whereas season accounted for a larger portion of the variation than parity, the means in Table 2 indicate that the primary difference among seasons was the July–August season vs. all others. Because registry status had the smallest effect on CR, groups were first combined across registry classes, then seasons, and finally across parities. The only combining across year that was allowed was to combine the January–February season with the November–December season from the previous year. Records were never combined across herds. Seasons were combined sequentially; for example, if the minimum required was 10 and season 1 (January–February) had 3 breedings, season 2 had 3, and season 3 had 5, then the first 3 seasons were combined to form a group.
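The sequential combining of seasons can be sketched as below. This covers only the season step within one herd-year and parity group, after registry classes have been merged; it does not handle the January–February to previous November–December case or the later combining across parities. The function name `combine_seasons` is hypothetical, and folding a leftover run of seasons into the preceding group is an assumption, since the text does not say how a remainder is treated.

```python
def combine_seasons(counts, minimum):
    """Group consecutive seasons until each group has at least `minimum` breedings.

    counts: number of breedings in seasons 1..6 of one herd-year.
    Returns a list of groups, each a list of season numbers.
    """
    groups = []
    current, total = [], 0
    for season, n in enumerate(counts, start=1):
        current.append(season)
        total += n
        if total >= minimum:
            groups.append(current)
            current, total = [], 0
    if current:
        if groups:
            # Assumption: a remainder below the minimum joins the last group.
            groups[-1].extend(current)
        else:
            # The minimum was never reached within the herd-year.
            groups.append(current)
    return groups
```

With a minimum of 10 and season counts of 3, 3, 5, 12, 4, and 4, the first 3 seasons form one group (11 breedings), as in the example in the text, and seasons 4 to 6 form a second.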


Table 2. Conception rate least-squares means (LSM) and sums of squares¹ for factors included as part of management group definition
 
When combining groups, the specified minimum may not be met even when groups have been combined down to the herd-year level. For example, a herd may have only 19 breedings for a given year in which case those breedings would be discarded when using a 20-record minimum. Another alternative would be to specify a second minimum, which would be the minimum number of breedings for a herd-year and allow the herd-year into the evaluation provided the second minimum was satisfied; in this case, the herd with only 19 breedings would still be utilized in the evaluation. This alternative was tested for the 5, 10, and 20 minimums. For minimum group sizes of 5, a herd-year with at least 2 breedings was allowed into the evaluation; the lower limits for combining to group sizes of 10 and 20 were 5 and 10, respectively.
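The two-minimum rule can be written as a small decision function. The sketch is illustrative; `HERD_YEAR_FLOOR` and `keep_herd_year` are hypothetical names, and only the pairs of target size and herd-year lower limit are taken from the text.

```python
# Target group size -> minimum breedings for a herd-year to be retained.
HERD_YEAR_FLOOR = {5: 2, 10: 5, 20: 10}

def keep_herd_year(n_breedings, target, use_floor=True):
    """Decide whether a herd-year combined down to one group is retained.

    With use_floor=False the target itself is an absolute minimum.
    """
    required = HERD_YEAR_FLOOR[target] if use_floor else target
    return n_breedings >= required
```

For the herd with 19 breedings in a year, `keep_herd_year(19, 20)` retains the herd-year, whereas `keep_herd_year(19, 20, use_floor=False)` discards it.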

In total, 11 options were tested for management group formation: a) 4 options where all groups were truly HYSPR with minimum group sizes of 3, 5, 10, or 20; b) 4 options using combined groups where combined groups were required to have an absolute minimum of 3, 5, 10, or 20 records; and c) 3 combined options where the goal was to combine to a given group size but the herd-year was allowed in with a certain minimum number of breedings, as described previously.

Previous research has shown that linear models perform as well as threshold models for phenotypic evaluation of bull fertility in the United States (Kuhn and Hutchison, 2006). Thus, all analyses in this research utilized linear models only. Predictions to compare management group alternatives were obtained using the following model:


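\[
y = \mathrm{HYSPR} + \beta_1\,\mathrm{milk} + \beta_2\,\mathrm{milk}^2 + \beta_3\,\mathrm{Age}_{cow} + \beta_4\,\mathrm{Age}_{cow}^2 + \beta_5\,\mathrm{DIM}_b + \beta_6\,\mathrm{DIM}_b^2 + \beta_7 F_{bull} + \beta_8 F_{mating} + \mathrm{Age}_{bull} + \text{Stud-Year} + \mathrm{Bull} + A + PE + e \qquad [1]
\]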

where y was the binary outcome of the mating (0 = failure, 1 = success), HYSPR was herd-year-season-parity-registry status as defined above, milk was 305-d 2x ME milk yield, Agecow was the cow’s age at breeding, DIMb was DIM at breeding, Fbull was the bull’s own inbreeding coefficient expressed as deviation from the overall mean, Fmating was the inbreeding of the mating (potential embryo) expressed as deviation from the overall mean, β1 through β8 were regression coefficients, Agebull was a categorical variable for service sire’s age at mating, Stud-Year was the effect of the AI organization of the bull for the year of mating, Bull was the (residual) service sire effect, A was the additive genetic effect of the cow, PE was the permanent environmental effect of the cow (mate), and e was random error. The effects of PE and A were common to all breedings of a cow, whereas (lactational) milk yield was common to all breedings within a lactation; the other nuisance variables (HYSPR, Agecow, DIMb) were specific to individual breedings, although a cow could, of course, have more than one breeding in a single HYSPR. The 3 categorical variables related to the service sire (Agebull, Stud-year, Bull) along with the cow effects (PE, A) were fit as random, and all other terms were fit as fixed effects. There were 12 categories for service sire age, formed by rounding the bull’s age in years to the nearest whole number and including all bulls older than 12 yr at the time of mating in group 12. Previous research (Kuhn et al., 2004) has shown that fitting categorical variables in the expanded service sire term as fixed results in substantial bias, whereas fitting them as random resulted in improved accuracy with no bias; hence, Agebull and Stud-year were fit as random rather than fixed effects.

The variance-covariance matrix for breeding value of the cow (mate) was $\mathbf{A}\sigma_a^2$, where $\mathbf{A}$ was the usual additive relationship matrix for an animal model and $\sigma_a^2$ was the scalar additive genetic variance for cow breeding value for CR. The variance-covariance matrices for the remaining random effects (Agebull, Stud-year, Bull, PE, and error) were all of the form $\mathbf{I}\sigma_i^2$, where $\mathbf{I}$ was an identity matrix and $\sigma_i^2$ was the scalar variance for the corresponding effect. Previous research (Kuhn et al., 2004) estimated the additive genetic variance for the service sire residual to be 0, and therefore, an additive genetic effect for service sires was not included. The scalar variances for random effects, estimated from previous research (Kuhn et al., 2004) using model [1] with herd-year and year-state-month in place of HYSPR, were 0.00014 for stud-year, 0.00011 for bull age, 0.00053 for service sire, 0.00533 for permanent environment, 0.00294 for animal, and 0.197 for error.
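To put the variance components in perspective, the sketch below computes each component's share of the total and the ratio of error variance to each random-effect variance. Ratios of this kind appear on the diagonal blocks of Henderson's mixed model equations; whether the authors' software used them in this form is not stated, so that part is an assumption. The dictionary keys and function names are hypothetical.

```python
# Variance components as reported in the text.
VARIANCES = {
    "stud_year": 0.00014,
    "bull_age": 0.00011,
    "service_sire": 0.00053,
    "permanent_environment": 0.00533,
    "animal": 0.00294,
    "error": 0.197,
}

def variance_shares(variances):
    """Return each component as a fraction of the total variance."""
    total = sum(variances.values())
    return {name: value / total for name, value in variances.items()}

def error_ratios(variances):
    """Return error variance divided by each random-effect variance."""
    sigma_e2 = variances["error"]
    return {
        name: sigma_e2 / value
        for name, value in variances.items()
        if name != "error"
    }
```

The components sum to 0.20605, so the error term accounts for most of the variation in a single binary breeding outcome.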

Predicted CR were calculated as 100 × (b7 × Fbull + b8 × Fmating + Agebull + Stud-year + Bull), where b7 and b8 were solutions for the linear regressions for service sire and mating inbreeding, respectively (equation [1]); Fbull was the bull’s own inbreeding coefficient deviated from the overall mean and Fmating was the bull’s average mating inbreeding coefficient deviated from the overall mean. The Agebull in the predictor was the solution for the bull age group corresponding to his age at the midpoint of the data to be predicted (set-aside data); stud-year was the most recent year solution for the bull’s stud, and Bull was the bull’s own individual (residual) service sire solution. Average CR in the future year were adjusted for herd, month of breeding, lactation, milk yield, DIM, and cow age using a simple fixed effects model (i.e., service sire mean CR were bull solutions from the fixed effects model). Correlations and mean differences between predicted and future year CR were calculated with all bulls combined and by group, based on number of matings for estimation.
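The predictor is a direct sum of the service sire components, as the following sketch shows. The function and argument names are hypothetical, and the numbers in the usage note are invented for illustration rather than taken from the evaluation.

```python
def predicted_cr(b7, b8, f_bull_dev, f_mating_dev,
                 age_bull_sol, stud_year_sol, bull_sol):
    """Return predicted CR (%) as the sum of the service sire components.

    b7, b8:        regression solutions for bull and mating inbreeding.
    f_bull_dev:    bull's inbreeding coefficient, deviated from the mean.
    f_mating_dev:  bull's average mating inbreeding, deviated from the mean.
    age_bull_sol:  solution for the bull's age group at the midpoint
                   of the period to be predicted.
    stud_year_sol: most recent year solution for the bull's stud.
    bull_sol:      the bull's own residual service sire solution.
    """
    return 100.0 * (b7 * f_bull_dev + b8 * f_mating_dev
                    + age_bull_sol + stud_year_sol + bull_sol)
```

For example, with made-up values b7 = -0.1, b8 = -0.2, deviations of 0.01 and -0.005, and solutions of 0.002, 0.001, and 0.01, the predicted CR is 1.3 percentage units.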

When HYSPR groups were combined, the effects of year-state-month and lactation were also added to model equation [1] because these factors have a substantial effect on CR (Table 2) but can remain unaccounted for when combining groups. The year-state-month effect, in contrast to month of breeding alone, allowed month effects to vary across years and regions of the country. Preliminary analyses indicated that including lactation and year-state-month in equation [1], when using combined groups, was preferable to model [1] alone; predictions had slightly higher correlations with future year CR, lower mean differences, or both than when model [1] alone was used.

For comparison with results for HYSPR management groups, predictions were also computed from 3 reduced models, where the management groups were defined less narrowly. The reduced models utilized, in place of HYSPR in equation [1]: 1) herd-year-season + parity + registry status, 2) herd-year + year-state-month + parity + registry status, and 3) herd + year + season + parity + registry status. Herd-years were required to have a minimum of 5 breedings for each reduced model.

Preliminary Analyses
While a substantial number of options were available for management group formation, the multitude of options for the various combinations of other nuisance variables under consideration was larger yet. Thus, preliminary analyses were conducted to reduce the range of alternatives to a manageable number.

Choice of Milk Yield Variable.
One question that arises in regard to adjustment for milk yield in a bull fertility evaluation is the choice of variable to use. Choices are between lactational yield or some (reduced) function of the test-day (TD) yields. Clay and McDaniel (2001) used energy-corrected summit milk yield in their bull fertility evaluation, but their evaluation was based on first service only; summit yield may not be the best choice of milk variable for breedings later in lactation. Weigel (2004) used a categorical milk yield based on the mean TD yield for the first 100 DIM, which would be similar to summit yield. Other alternatives are readily apparent such as the TD yield immediately before the breeding or the average of the TD yields immediately before and after the breeding. Although the TD yields may be intuitively appealing on the grounds that they reflect the amount of stress closest to the time of breeding, it is by no means obvious that their use would be superior to use of the cow’s lactational (305-d 2x ME) yield. The TD records are observations from just a single day and as such would be subject to considerably more random fluctuation than the lactational record. Furthermore, it is by no means certain that the amount of stress early in lactation does not contribute to CR later in lactation. Likewise, production level after breeding may contribute to embryo implantation, survival, or both, and therefore, some measure of early lactation yield may not be best even for breedings early in lactation. Thus, several milk yield measurements were tested for their efficacy as a nuisance variable in the prediction of bull fertility. Milk yield measurements tested included lactational yield, TD yield for the test before insemination, and the average of the TD yields before and after insemination.

A further aspect of considering TD yields is that a cow that produces, say, 23 kg at 60 DIM was probably under less stress than a cow that produced 23 kg at 360 DIM. Results from preliminary analyses supported this contention. Correlation of predicted CR with future CR was higher when the TD yield before breeding was fit as an interaction with DIM, in contrast to fitting the 2 variables separately; the R² in a fixed effects model was, as expected, also higher when DIM was included as an interaction with TD yield rather than fitting the 2 effects separately. Thus, further analyses considered functions of the TD yields only as part of an interaction with DIM.

In the preliminary comparisons, correlations of bulls’ predicted CR with their future year CR were 40.4% (ME yield), 39.1% (average TD yield), and 38.6% (TD yield before breeding). Model R² from fixed effects analyses were 11.2% for lactational yield and 11.1% for both of the TD measurements. Energy-corrected lactational milk yield was also tested, but it did not improve correlation with future year CR, relative to 305-d 2x ME milk yield alone. Thus, 305-d 2x ME lactational milk yield was the milk variable chosen for further consideration of nuisance variables to be included in bull fertility evaluations.

Modeling Quantitative Nuisance Variables [Cow Age, DIM, Lactational (305-d 2x ME) Milk Yield] as Covariates or Categorical Variables.
For quantitative variables, there is a choice between modeling the factors as covariates or categorical variables. Although a function of covariates could generally be found to correctly model almost any relationship of a quantitative independent variable with a dependent variable, that function may be complex and perhaps not even linear in the parameters. Even for a seemingly simple relationship where the dependent variable increases with the independent variable at a decreasing rate and then plateaus, the correct model using covariates is nonlinear in the parameters (Judge et al., 1988). Furthermore, loss in accuracy, relative to more complex models, may often be negligible when quantitative variables are fit as categorical factors rather than covariates, especially if categories are narrowly defined and have adequate subclass sample size for estimation. Thus, cow age, DIM, and milk yield were categorized and then examined to determine if they could be adequately modeled with linear and quadratic regression coefficients, or if modeling with categorical variables was to be preferred.

Cow age groups were formed by rounding the cow’s actual age in years to the nearest whole number. Because lactations were restricted to 5 and earlier, breedings at ages beyond 8 years were infrequent; thus, if age was beyond 8, age was set to 8. For preliminary analysis, 17 DIM classes were formed, with the first 16 being in 20 d increments from 30 to 350; the last class was for breedings between 351 and 365 DIM. Six categories were formed for milk yield based on standard deviation from the mean: < –2σ, –1σ, ..., > 2σ. The limits for each milk class were <7,571, 9,657, 11,743, 13,830, 15,915, and >15,915 kg for classes 1 to 6, respectively. Milk yield was 305-d 2x ME lactational milk yield, not standardized for any other effects included in the model such as herd. Herds certainly vary in mean milk yield level, and therefore, herd will account for some differences in fertility due to milk yield. There is, nonetheless, still considerable variation in yield within herd, and therefore, inclusion of milk yield in the model would account for this additional variation.
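The three categorizations can be sketched as below. The function names are hypothetical. Two details are assumptions: half years are rounded up when forming cow age groups, and a milk yield exactly equal to a class limit is placed in the lower class, since the text does not state how boundary values were assigned. The DIM classes are taken as 30 to 50 d for class 1 and 20-d steps thereafter.

```python
import bisect
import math

# Upper limits (kg) separating milk classes 1 to 6, from the text.
MILK_UPPER_LIMITS_KG = [7571, 9657, 11743, 13830, 15915]

def cow_age_group(age_years):
    """Round age to the nearest year (halves up, assumed) and cap at 8."""
    return min(math.floor(age_years + 0.5), 8)

def dim_class(dim):
    """Return the preliminary DIM class (1 to 17) for a breeding."""
    if not 30 <= dim <= 365:
        raise ValueError("DIM at breeding must be between 30 and 365 d")
    if dim > 350:
        return 17  # breedings between 351 and 365 DIM
    # Class 1 covers 30 to 50 d, class 2 covers 51 to 70 d, and so on.
    return max(1, math.ceil((dim - 30) / 20))

def milk_class(milk_kg):
    """Return the milk yield class (1 to 6) for a 305-d 2x ME yield."""
    # Assumption: a yield equal to a limit falls in the lower class.
    return bisect.bisect_left(MILK_UPPER_LIMITS_KG, milk_kg) + 1
```

Under this layout a breeding at 110 DIM falls in class 4 and one at 230 DIM in class 10.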

Plots of the arithmetic mean CR by subclass are shown for each variable in Figure 1. The relationship with CR was not linear or quadratic for any of the 3 variables. As expected, CR decreased with increasing cow age, but the decline was not constant across ages. There was a total decline of about 2% from age groups 2 to 3, a lower rate of decline (total of 2.3%) across the 3 age groups of 3, 4, and 5, and then a sharp decline in CR starting with age group 7. Conception rate increased with DIM until about 110 d (class 4) and then declined until roughly 230 DIM (class 10), at which point CR basically plateaued. Further analyses, therefore, utilized only 11 DIM classes, where the 11th class was for all breedings beyond 230 d. The decrease in CR with increasing milk yield was largely linear except that peak CR was actually at the second rather than first level of yield. Factors such as poor health or excess body condition may have impeded fertility and milk yield at the lowest levels of production.


Figure 1. Relationships of conception rate (CR) with groups for cow age (1–8), DIM (1–17), and lactational milk yield (1–6).

 
Model R² (in fixed effects analyses) were essentially equal when quantitative variables were fit with linear and quadratic covariates or as categorical variables. Furthermore, correlation of predictions with future year average CR was actually slightly higher when the quantitative factors were fit as categorical effects rather than linear and quadratic covariates. Thus, models for final examination considered cow age, DIM, and milk yield as categorical variables only. Groups were formed for each variable as described previously: 7 cow age groups, 11 DIM groups, and 6 milk yield groups.

Final Models Selected for Comparison
After preliminary analyses, a total of 12 models (Table 3) were compared. The first 2 models were fit to ascertain the benefit, if any, of including year-state-month and lactation in the model, in addition to the management group effect. Model 2 was the same as the model used for comparing management group options except model 2 in Table 3 used categorical variables for DIM, cow age, and milk yield; thus, results from model 2 also allowed quantifying the differences between treating these factors as covariates or categorical variables. Several of the models (3, 4, 5, 11, and 12) dropped out various factors and were fit, not so much as potential candidates for a final model for routine evaluation, but rather to allow assessment of the consequences of exclusion of those factors, or conversely to quantify the benefit of including these factors in the model.


Table 3. Final models for comparison
 
It is well established that CR varies according to lactation, cow age, DIM, milk yield, and service number. It is also well known, however, that inclusion of unnecessary fixed effects in a model can lower accuracy of prediction (e.g., Henderson, 1975), and the various factors under consideration in this research are not all independent; service number and DIM, for example, are correlated as are lactation and cow age. Thus, it is not necessarily true that inclusion of all these various effects would result in improved prediction. Therefore, the remaining models (6 through 10, Table 3) tested various combinations of lactation, service number, and DIM. Arithmetic means varied nonadditively by lactation and service number, and therefore lactation × service number interaction was tested in addition to these 2 main effects separately.

The management group option used in the first 11 models was to combine to a group size of 20 but allow a herd-year into the evaluation with a minimum of 10 breedings. Other model aspects were similar to model [1]. Service sire variables included inbreeding of the service sire, inbreeding of the mating (potential embryo), bull age at the time of mating, stud-year, and a "residual" service sire effect unique to a given bull. The service sire variables, cow effects (permanent environment and breeding value), and error were random effects, whereas all other terms were fit as fixed effects. Variance-covariance matrices and scalar variance components were also the same as for model [1]. Predicted CR was calculated the same as for comparison of management group alternatives, namely as the sum of the service sire variables.

Short Interval Between Breedings
After assessing the factors described previously, one final variable was considered for inclusion as a nuisance effect. M. DeJarnette (Select Sires, Plain City, OH, personal communication) suggested that breedings less than 18 d apart may correspond to somewhat abnormal estrus cycles and, subsequently, result in lower CR. Thus, interval between breedings was also tested for inclusion in the model for bull fertility evaluation. Specifically, a breeding was coded as corresponding to (preceded by) a short estrus period if a previous breeding had occurred 17 or fewer days previously; because all breedings were required to be at least 10 d apart, this corresponded to matings where a previous breeding had occurred 10 to 17 d earlier. This short interval variable was included as a fixed, categorical factor with 2 levels (preceded by a breeding 17 or fewer days earlier, or not) in the model that was chosen as best among the other models considered.
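The coding of the short interval variable can be illustrated as follows. This is a sketch only: the function name `short_interval_flags` is hypothetical, the input is assumed to be one cow's breeding dates after the 10-d edit has been applied, and the interval is measured to the immediately preceding breeding.

```python
from datetime import date

def short_interval_flags(breeding_dates, max_short_days=17):
    """Flag breedings preceded by another breeding 17 or fewer days earlier.

    Returns a list of (date, flag) pairs in chronological order.
    """
    dates = sorted(breeding_dates)
    flagged = []
    for i, d in enumerate(dates):
        # The first breeding has no predecessor and is never flagged.
        is_short = i > 0 and (d - dates[i - 1]).days <= max_short_days
        flagged.append((d, is_short))
    return flagged
```

For breedings on January 1, January 15, and February 10, only the January 15 breeding (14 d after the previous one) is coded as following a short interval.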


    RESULTS AND DISCUSSION
 
Comparison of Management Group Options
Correlations, mean differences, and square roots of mean square errors with all bulls combined are shown in Table 4; for ease of presentation, all correlations were expressed as a percentage (i.e., multiplied by 100). In regard to interpretation of these various statistics, correlations between predicted CR and observed CR in a future year are approximate measures of accuracy of evaluation, similar to reliabilities for PTA; the correlations are actually underestimates of the true accuracy, but when comparing models, the one with the higher correlation has the greater accuracy. Mean differences are an assessment of bias; differences that are far from 0 would indicate that bias may exist in the evaluation and would warrant further investigation. For example, if, on average, the mean difference between predicted and observed future year CR was 5.0%, this could indicate that CR is being overpredicted by the model. Values close to 0 are desirable for mean differences; a –0.1% mean difference is more desirable than a +10.0% difference. Mean square error is simply a composite measure of accuracy and bias. Bias and reduced accuracy both increase the magnitude of mean square error, and thus, smaller mean square errors are preferred.
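The three comparison statistics can be sketched as follows. This is an illustration only: it assumes simple unweighted statistics across bulls and takes the difference as predicted minus observed, consistent with the overprediction example above; whether the published values were weighted (e.g., by number of matings) is not stated here. The function and key names are hypothetical.

```python
from math import sqrt

def compare_to_future(predicted, observed):
    """Unweighted correlation (x100), mean difference, and root MSE
    between predicted CR and observed future-year CR, one value per bull."""
    n = len(predicted)
    diffs = [p - o for p, o in zip(predicted, observed)]
    mean_diff = sum(diffs) / n
    root_mse = sqrt(sum(d * d for d in diffs) / n)

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    corr_pct = 100.0 * cov / sqrt(ss_p * ss_o)

    return {"correlation_pct": corr_pct,
            "mean_difference": mean_diff,
            "root_mse": root_mse}

# Made-up CR values (%) for four bulls, for illustration only.
stats = compare_to_future([30.0, 32.0, 28.0, 35.0], [29.0, 33.0, 27.0, 36.0])
print(stats)
```

Under this definition, mean square error equals the squared mean difference plus the variance of the differences, which is why it acts as a composite of bias and accuracy.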


Table 4. Comparison of management group alternatives across all 803 bulls used in comparisons
 
Except for the case where groups were only required to have a minimum of 3 breedings, combining groups resulted in a higher correlation (the approximate measure of accuracy) with future year CR than did requiring an absolute minimum group size. The amount of benefit from combining HYSPR groups also increased as the required/target group size increased. The reason combining was superior in terms of accuracy was simply because it allowed more data into the evaluation than did exclusion, which is the same reason the amount of benefit increased with increased requirement for group size; as the target group size increased, the amount of data excluded by the absolute requirement increased. For the minimum group size of 3, there was relatively little difference in the amount of data between the 2 options and thus little difference in the correlations.

Allowing herds with fewer than the target number of records was beneficial only when the target group size was 20; with minimum group sizes of 20, including herd-years with at least 10 breedings salvaged considerably more records than when minimum group sizes were 5 or 10. When combining was allowed, the effect of group size per se was largely negligible, although correlations did tend to increase slightly with increasing group size. Combining groups to a size of 20 and allowing herd-years with a minimum of 10 records into the evaluation maximized accuracy (54.91%), but the correlation with future year CR was only 0.8% lower when combining to a minimum group size of 5.

Although small differences among the correlations did occur, the correlations for the various management group alternatives were generally similar. Except for the options of requiring an absolute group size of 10 or 20 for all HYSPR, correlations spanned a range of only 1.1% and were generally within half a percentage point of each other. Even the reduced models had correlations similar to those for the HYSPR groups. Provided that excessive data exclusion is avoided (minimum of 10 or 20 breedings with no combining), formation of management groups will have only minimal impact on accuracy.

There was very little mean difference between predicted and future CR for any of the models tested, including the reduced models. It should be noted, however, that the data for estimation covered a relatively short time span. As additional years are added in future routine evaluations, allowing herd effects to change across time will be more important; reduced model 3 fit only the main effects of each factor and thus herd effects were treated as constant across years. The largest mean difference was for combined groups of size 20, allowing in herd-years with at least 10 breedings, but even this largest difference was less than half a percentage point. Table 5 presents the number of bulls for each number of matings (for estimation) subclass and Table 6 presents the mean differences for each model by number of matings. Table 6 shows that, for bulls with fewer matings, mean differences were near 0 and often closer to 0 for the combined 20, minimum of 10 grouping option. This characteristic was not reflected in the overall mean differences because mean differences for bulls with more matings tended to be positive. This result was also illustrated by the mean square errors in Table 4, which were minimized by the combined 20, minimum of 10 herd-year breedings option.


Table 5. Bulls used for comparisons: average number of matings in future year, number of bulls, and average bull age by number of matings for estimation group
 

Table 6. Mean differences (%)1 for management group alternatives, for each number of matings group
 
Differences among options for formation of management groups were generally small. Given that the option of combined 20 with a minimum of 10 breedings for the herd-year maximized the correlation with future year CR and minimized mean square error, it was chosen as the final alternative. It was expected that additional terms, such as service number, the categorization of milk yield, DIM, and cow age, or both, would eliminate any mean difference (potential bias) that may have existed with the use of this option, an expectation explored in the next section.
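One possible reading of the chosen grouping strategy is sketched below for a single herd-year. The exact combining rules are not spelled out in this section, so the details are assumptions: subclasses are merged across registry, then season, then parity (the order given in the Conclusions), and whenever any subclass under a common parent falls below the target size, all subclasses under that parent are merged. The function name and the tuple layout are hypothetical.

```python
from collections import Counter

def combine_groups(breedings, target=20, herd_year_min=10):
    """Assign a management-group label to each breeding of one herd-year.

    breedings: list of (parity, season, registry) tuples, one per breeding.
    Returns a list of labels (tuples), or None when the herd-year has
    fewer than herd_year_min breedings and is excluded.
    """
    if len(breedings) < herd_year_min:
        return None
    labels = list(breedings)
    # depth 2 drops registry, depth 1 drops season, depth 0 drops parity
    for depth in (2, 1, 0):
        sizes = Counter(labels)
        # parents that contain at least one subclass below the target size
        merge_parents = {k[:depth] for k, n in sizes.items()
                         if len(k) > depth and n < target}
        labels = [k[:depth] if len(k) > depth and k[:depth] in merge_parents
                  else k
                  for k in labels]
    return labels

# A herd-year with mostly grade ("G") cows and a few registered ("R") cows.
herd_year = [(1, 1, "G")] * 25 + [(1, 1, "R")] * 3 + [(2, 1, "G")] * 22
print(Counter(combine_groups(herd_year)))
```

In this example the 3 registered breedings force the parity 1, season 1 subclasses to be combined across registry, giving one group of 28, whereas the parity 2 group of 22 stays uncombined. A herd-year with 10 to 19 breedings collapses to a single group, and one with fewer than 10 is dropped.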

Combining groups to obtain 20 records and allowing herd-years with at least 10 breedings resulted in a loss of only 0.45% of all records. Group size averaged 41.1 breedings/group and ranged from 10 to 1,028; 19% of all records were in groups with more than 100 matings and only 3.5% of all breedings were in groups with 20 or fewer matings. Only 26% of all management groups (accounting for 36% of all records) were uncombined. Although the proportion of records in combined groups was large, it should be noted that with registry class as part of the management group definition, even some large herds had combined groups, in particular large herds (herd-years) that had mostly grade cows but also had a small number of registered cows. Registry class was retained as part of the management group definition, in the event that registered and grade cows were treated differently in herds with both. Powell and Norman (1986) reported environmental differences in production for registered and grade cows, which prompted the original inclusion of registry class for use in management groups for production (Wiggans et al., 1988).

Comparisons of Final Models: Nuisance Variables other than Management Group
Comparisons of the 12 model predictions to future year average CR (mean differences, correlations, and square roots of mean square errors) for all 803 bulls combined are in Table 7; each statistic is sorted from best to worst model. The mean difference for basic model 2 (when DIM, cow age, and milk were categorized) was only –0.019% (Table 7), compared with 0.374% (Table 4) when these variables were fit with linear and quadratic covariates; the correlations were 54.97% (categorical) and 54.91% (covariates, Table 4). Inclusion of lactation and year-state-month in the model also resulted in lower mean differences and higher correlations (basic model 2 vs. basic model in Table 7). Thus, results for the 2 basic models supported categorization of DIM, cow age, and milk yield and inclusion of the terms year-state-month and lactation in the model for prediction.
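The difference between the two ways of fitting a quantitative variable can be illustrated by the design-matrix columns each one produces. The sketch below is illustrative only; the DIM class boundaries are hypothetical and are not the classes used in the evaluation.

```python
def covariate_columns(x):
    """Linear and quadratic covariate columns for one record."""
    return [x, x * x]

def categorical_columns(x, upper_bounds):
    """Indicator (0/1) columns for one record.

    Classes are defined by ascending, inclusive upper bounds; values above
    the last bound fall in a final open-ended class.
    """
    n_classes = len(upper_bounds) + 1
    class_index = sum(1 for bound in upper_bounds if x > bound)
    return [1 if i == class_index else 0 for i in range(n_classes)]

# Hypothetical DIM classes: <=60, 61-90, 91-120, 121-150, >150
dim_bounds = [60, 90, 120, 150]
print(covariate_columns(75))                 # [75, 5625]
print(categorical_columns(75, dim_bounds))   # [0, 1, 0, 0, 0]
```

The covariate form forces a parabolic response across the whole range, whereas the categorical form estimates a separate level for each class and so can follow a response of any shape, at the cost of more parameters.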


Table 7. Mean differences (%)1, correlations, and square roots of mean square errors (MSE) for 12 final models selected for comparison2
 
The model without cow age resulted in the lowest mean difference. Cow age only accounts for age differences within lactation, and therefore the mean difference would not be expected to be large when cow age is excluded; furthermore, service sire usage may be largely random with respect to cow age. Similar results were found when DIM and milk yield were left out of the model. Dropping DIM from the model resulted in essentially no change in mean difference, relative to basic model 2. Although exclusion of milk yield did not cause a large mean difference between predicted and future year CR, the mean difference was higher with milk yield excluded from the model, suggesting that inclusion of milk yield does remove some bias from evaluations.

Perhaps the most notable result was the rather small difference among models, especially for mean differences. When all nuisance variables, including management group, were excluded from the model (labeled as Omit All in Table 7), the mean difference between predicted and future year CR was at an absolute maximum, implying the removal of bias by these terms, but at the same time the mean difference of only –0.257% was small. The reduction in accuracy was more pronounced, however, suggesting that these factors do affect CR, and therefore should be included in models for prediction of service sire CR, but they occur largely at random across service sires. Of course, it is also true that a mean difference (bias) is more likely to occur for bulls with fewer matings than for bulls with many matings because bulls with fewer matings are less likely to be evenly distributed across the fixed effects. Bulls with 50 to 100, 101 to 200, and 201 to 300 matings had mean differences of –1.55, –1.83, and –0.72%, respectively, when all nuisance variables were excluded; for comparison, the corresponding mean differences for basic model 2 were –0.34, –0.62, and –0.06%. Thus, in regard to bias, nuisance variables are included primarily for the sake of bulls with fewer matings. Although bias may not be a large concern, at least for bulls with 1,000 or more matings, it is also important to note that the models without cow effects, cow age, milk yield, or DIM/service number ranked among the lowest on correlation. Thus, these factors should be included in models for prediction of bull fertility.
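Mean differences by number-of-matings subclass, as quoted above, can be computed along the following lines. This is a sketch under assumptions: the three subclasses are those named in the text, the mean within a subclass is unweighted, and the function name and record layout are hypothetical. Bulls outside the listed subclasses are simply skipped.

```python
def mean_difference_by_matings(bulls, bins=((50, 100), (101, 200), (201, 300))):
    """Mean (predicted - observed future-year CR) per matings subclass.

    bulls: iterable of (n_matings, predicted_cr, observed_future_cr).
    Returns {(low, high): mean difference, or None if the subclass is empty}.
    """
    totals = {b: [0.0, 0] for b in bins}
    for n_matings, predicted, observed in bulls:
        for low, high in bins:
            if low <= n_matings <= high:
                totals[(low, high)][0] += predicted - observed
                totals[(low, high)][1] += 1
                break
    return {b: (total / count if count else None)
            for b, (total, count) in totals.items()}

# Made-up values for four bulls, for illustration only.
bulls = [(60, 30.0, 32.0), (80, 31.0, 32.0), (150, 29.0, 30.0), (500, 30.0, 30.0)]
print(mean_difference_by_matings(bulls))
```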

Differences among the remaining models that considered various combinations of lactation, service number, and DIM, were present but were not large. The model that included service number without DIM was considered best among the models tested because it maximized correlation with future year CR, minimized mean square error, and had a mean difference similar to basic model 2. Only the model without cow age had a lower mean difference but this model had a 0.28% lower correlation than the selected model.

Short Interval Variable
Breedings preceded by short intervals had a simple average CR that was 9% lower than the average CR of breedings that were preceded by intervals of at least 18 d. The relative frequency of breedings preceded by short intervals, however, was only 2.5%, and therefore, the overall impact of this variable was not expected to be large. Inclusion of the short interval variable resulted in an overall correlation between predicted and future year average CR of 55.14%, essentially the same as the 55.17% correlation (Table 7) without it. Use of the short interval variable did result in a slight reduction in mean difference: –0.019% when included vs. –0.020% when not included. A larger impact, however, was found for bulls where at least 5% of their breedings were preceded by short intervals. Of the 803 bulls used for comparison, 93.5% had less than 5% short interval breedings. For the 52 bulls that had at least 5% short interval breedings (maximum percentage = 9%), the correlation with future year CR was 0.4% higher when the short interval variable was included. Thus, future bull fertility evaluation models will include a variable to account for this effect. It should be noted that while the majority of short intervals may correspond to abnormal heat cycles, misdiagnosed heats or perhaps even recording errors may also account for some of the short intervals. Whereas the exact cause of the short intervals between breedings may be of interest from a management or physiology perspective, the exact cause is largely irrelevant for bull fertility evaluation; the relevant aspect here is simply whether or not including the term in the model improved evaluation.

Other Variables Affecting CR: AI Technician and Synchronization
Two factors that were not considered in this research were AI technician and synchronization. It has probably been known since the inception of AI, and has certainly been well documented by research (e.g., Ron et al., 1984; Taylor et al., 1985; Reurink et al., 1990; Van Doormaal, 1993), that there are differences among AI technicians in CR. There has also been some research to support an effect of synchronization on CR (Tenhagen et al., 2004), although this effect is not as well documented as that for AI technician.

Currently, AIPL does not receive information on either of these 2 factors. Whereas receipt of these variables may be feasible at some point in the future, it is unlikely in the near future. When these factors are constant within management groups (only 1 technician within a HYSPR, or all cows either are or are not synchronized), they are of no concern because they will be completely accounted for by the management group effect. Even when some variation within management group exists, the management group may at least partly remove these effects; HYSPR groups with some synchronization/timed AI, for example, would have lower means, on average, than those without, and a wide variance among AI technicians within herd would not likely be tolerated for a long period of time.

It is also worthwhile to consider that the overall mean difference with no fixed effects in the model was only –0.257%. It is unlikely that technician and synchronization effects combined have a larger effect than even herd alone. Tenhagen et al. (2004) reported that synchronization lowered CR by 3.7% and season alone had a larger effect than that in this research (Table 2). Both Ron et al. (1984) and Van Doormaal (1993) reported a standard deviation among technician effects of about 4%, which is less than half of that attributable to the cow (approximately 9%). Thus, although it is not impossible that these factors could bias some subgroup of bulls, it is expected that, at least across all bulls, technician and synchronization effects will not bias bull fertility evaluations. However, acquiring this information may benefit accuracy to the extent that these factors vary within HYSPR groups.

Future Research
Routine acquisition of daily climate data, such as temperature and humidity, has been initiated at AIPL, but is still in the development and testing phase. Preliminary results have indicated bull fertility evaluations are, at most, improved only slightly when daily climate data are included in the model. While variation within month certainly occurs, the effects of HYSPR and year-state-month largely account for climate effects. Nonetheless, use of daily climate data can be further explored when routine acquisition of that data is finalized.


    CONCLUSIONS
 
Provided that major effects are accounted for and that excessive data exclusion is avoided, the various choices for formation of management groups have minimal impact on the quality of bull fertility evaluations. The final strategy selected for management group formation was to utilize HYSPR groups, combining groups across registry classes, seasons, and parities (in that order) until a minimum group size of 20 is achieved and allowing herd-years into the evaluation provided they have a minimum of 10 breedings. This strategy maximized the correlation of bulls’ predictions with bulls’ future year average CR.

Standardized (305-d 2x ME) milk yield was chosen to model production in prediction of bull fertility because it maximized the correlation with future year average CR, relative to other functions of TD yields or energy-corrected milk yield. Cow age and milk yield provided better adjustments for prediction of service sire CR when fit as categorical variables than when fit as linear and quadratic covariates.

Breedings that were 17 or fewer days after a previous breeding averaged 9% lower CR, but occurred in a frequency of only 2.5%. Nonetheless, inclusion of a variable to account for short interval between breedings had no negative consequences on the evaluations overall and improved evaluations for bulls that had at least 5% of such breedings.

Combining management groups implies that some groups have a mixture of seasons and parities, which is accounted for by inclusion of additional terms in the model, namely year-state-month and lactation number. Use of service number alone provided better bull fertility evaluations than DIM or a combination of DIM and service number. Thus, the final nuisance variables selected for inclusion in the model for bull fertility evaluation were, in addition to HYSPR, year-state-month, lactation, service number, milk yield, cow age at breeding, a variable to account for the effect of short intervals between breedings, and the cow effect, partitioned as permanent environment and breeding value.

Bulls with fewer than 300 matings may have had a small (<2%) bias in their evaluations when the nuisance variables were excluded from the model for evaluation. Nonetheless, the primary benefit of the nuisance variables was improved accuracy rather than elimination of bias, suggesting that service sires are used largely at random with respect to the factors considered in this research. Information on AI technician and synchronization is not currently available at AIPL, but these effects are not expected to cause an overall bias in bull fertility evaluations; acquisition of this information would be desirable, however, because it may improve accuracy of evaluation. Future research could examine use of daily climatological data once routine acquisition is finalized.


    ACKNOWLEDGEMENTS
 
Ignacy Misztal (University of Georgia) is gratefully acknowledged for the use of his BLUPF90 series of programs for solving mixed linear models. Dairy Records Management Systems, AgSource, and Agri-Tech Analytics are acknowledged for their supply of breeding data, as are the American dairy farmers, who pay for data collection in the United States. Appreciation is also extended to Tony Seykora and George Shook for providing comments and suggestions before submission.

Received for publication December 12, 2007. Accepted for publication March 18, 2008.


    REFERENCES
 


Al-Katanani, Y. M., D. W. Webb, and P. J. Hansen. 1999. Factors affecting seasonal variation in 90-day nonreturn rate to first service in lactating Holstein cows in a hot climate. J. Dairy Sci. 82:2611–2616.

Clay, J. S., and B. T. McDaniel. 2001. Computing mating bull fertility from DHI nonreturn data. J. Dairy Sci. 84:1238–1245.

García-Ispierto, I., F. López-Gatius, G. Bech-Sabat, P. Santolaria, J. L. Yániz, C. Nogareda, F. De Rensis, and M. López-Béjar. 2007. Climate factors affecting conception rate of high producing dairy cows in northeastern Spain. Theriogenology 67:1379–1385.

Gwazdauskas, F. C., C. J. Wilcox, and W. W. Thatcher. 1975. Environmental and managemental factors affecting conception rate in a subtropical climate. J. Dairy Sci. 58:88–92.

Henderson, C. R. 1975. Comparison of alternative sire evaluation methods. J. Anim. Sci. 41:760–770.

Hillers, J. K., P. L. Senger, R. L. Darlington, and W. N. Fleming. 1984. Effects of production, season, age of cow, days dry, and days in milk on conception to first service in large commercial dairy herds. J. Dairy Sci. 67:861–867.

Judge, G. G., R. C. Hill, W. E. Griffiths, H. Lutkepohl, and T. Lee. 1988. Introduction to the Theory and Practice of Econometrics, 2nd ed. John Wiley and Sons, New York, NY.

Kuhn, M. T., and J. L. Hutchison. 2006. Methodology for prediction of bull fertility from field data. J. Dairy Sci. 89(Suppl. 1):15. (Abstr.)

Kuhn, M. T., J. L. Hutchison, and J. S. Clay. 2004. Prediction of service sire fertility. J. Dairy Sci. 87(Suppl. 1):412. (Abstr.)

Powell, R. L., and H. D. Norman. 1986. Genetic and environmental differences between registered and grade Holstein cows. J. Dairy Sci. 69:2897–2907.

Ravagnolo, O., and I. Misztal. 2002. Effect of heat stress on nonreturn rate in Holsteins: Fixed-model analyses. J. Dairy Sci. 85:3101–3106.

Reurink, A., J. H. G. Den Daas, and J. B. M. Wilmink. 1990. Effects of AI sires and technicians on non-return rates in the Netherlands. Livest. Prod. Sci. 26:107–118.

Ron, M., R. Bar-Anan, and G. R. Wiggans. 1984. Factors affecting conception rate of Israeli Holstein cattle. J. Dairy Sci. 67:854–860.

Stålhammar, E., L. Janson, and J. Philipsson. 1994. Genetic studies on fertility in AI bulls. II. Environmental and genetic effects on non-return rates of young bulls. Anim. Reprod. Sci. 34:193–207.

Taylor, J. F., R. W. Everett, and B. Bean. 1985. Systematic environmental, direct, and service sire effects on conception rate in artificially inseminated Holstein cows. J. Dairy Sci. 68:3004–3022.

Tenhagen, B.-A., M. Drillich, R. Surholt, and W. Heuwieser. 2004. Comparison of timed AI after synchronized ovulation to AI at estrus: Reproductive and economic considerations. J. Dairy Sci. 87:85–94.

Van Doormaal, B. J. 1993. Linear model evaluations of non-return rates for dairy and beef bulls in Canadian AI. Can. J. Anim. Sci. 73:795–804.

Weigel, K. A. 2004. Improving the reproductive efficiency of dairy cattle through genetic selection. J. Dairy Sci. 87(E Suppl.):E86–E92.

Wiggans, G. R., I. Misztal, and L. D. Van Vleck. 1988. Implementation of an animal model for genetic evaluation of dairy cattle in the United States. J. Dairy Sci. 71(Suppl. 2):54–69.


