J. Dairy Sci. 90:1044-1057
© American Dairy Science Association, 2007.

Exploring the Use of Random Regression Models with Legendre Polynomials to Analyze Measures of Volume of Ejaculate in Holstein Bulls

M. J. Carabaño*,1, C. Díaz*, C. Ugarte† and M. Serrano*

* Departamento de Mejora Genética Animal, INIA, 28040 Madrid, Spain
† Departamento Técnico, ABEREKIN S.A., Parque Tecnológico, 48160 Derio (Bizkaia), Spain

1 Corresponding author: mjc@inia.es


    ABSTRACT
Artificial insemination centers routinely collect records of quantity and quality of semen of bulls throughout the animals’ productive period. The goal of this paper was to explore the use of random regression models with orthogonal polynomials to analyze repeated measures of semen production of Spanish Holstein bulls. A total of 8,773 records of volume of first ejaculate (VFE) collected between 12 and 30 mo of age from 213 Spanish Holstein bulls were analyzed under alternative random regression models. Legendre polynomial functions of increasing order (0 to 6) were fitted to the average trajectory, additive genetic and permanent environmental effects. Age at collection and days in production were used as time variables. Heterogeneous and homogeneous residual variances were alternatively assumed. Analyses were carried out within a Bayesian framework. The logarithm of the marginal density and the cross-validation predictive ability of the data were used as model comparison criteria. Based on both criteria, age at collection as a time variable and heterogeneous residuals models are recommended to analyze changes of VFE over time. Both criteria indicated that fitting random curves for genetic and permanent environmental components as well as for the average trajectory improved the quality of models. Furthermore, models with a higher order polynomial for the permanent environmental (5 to 6) than for the genetic components (4 to 5) and the average trajectory (2 to 3) tended to perform best. High-order polynomials were needed to accommodate the highly oscillating nature of the phenotypic values. Heritability and repeatability estimates, disregarding the extremes of the studied period, ranged from 0.15 to 0.35 and from 0.20 to 0.50, respectively, indicating that selection for VFE may be effective at any stage. Small differences among models were observed.
Apart from the extremes, estimated correlations between ages decreased steadily from 0.9 and 0.4 for measures 1 mo apart to 0.4 and 0.2 for most distant measures for additive genetic and phenotypic components, respectively. Further investigation to account for environmental factors that may be responsible for the oscillating observations of VFE is needed.

Key Words: repeated measures • random regression • volume of ejaculate


    INTRODUCTION
Functional traits have become an important part of the breeding objectives in dairy cattle. Although functionality has always been viewed as a female characteristic, an increasing interest in considering male traits exists at AI centers. In particular, improving semen production traits is deemed important to reduce costs.

Artificial insemination centers routinely keep data on the quantity and quality of collected semen. Many records per bull collected throughout the animal’s productive period are available. Repeatability models have been previously used to analyze the repeated records on semen characteristics in cattle (Mathevon et al., 1998). However, the underlying assumptions in these models of equal variances over time and genetic correlation of unity among measures taken at any time might be considered very restrictive. A multitrait model (MTM) that treats measures obtained at different time periods as different traits would be another possible approach. Disadvantages of this type of analysis are the arbitrary separation of traits leading to a discontinuous treatment of time and the large number of parameters to be estimated. Random regression models (RRM; Henderson, 1982; Schaeffer and Dekkers, 1994) have been widely used in animal breeding in recent years to treat longitudinal traits. This type of model provides a continuous treatment of time and is able to incorporate heterogeneous variances and covariances among measures along time with a potentially reduced number of parameters compared with the multiple trait approach. Moreover, a smooth prediction of the individual genetic deviation along time can be obtained, which provides additional information for selection. The use of polynomials has been advocated as a sensible approach to model smoothed covariance functions in longitudinal data because their flexibility guarantees that any particular form is possible (Kirkpatrick et al., 1990). In addition, orthogonal polynomials provide lower correlations among the random regression coefficients and yield estimates of the covariance matrices that tend to be more robust over different data sets and have computational advantages in terms of faster convergence (Schaeffer, 2004).

The goal of this paper was to explore the use of random regression models with orthogonal polynomials to analyze repeated measures of semen production of Spanish Holstein bulls. The volume of the first ejaculate (VFE), a trait of continuous nature, was analyzed.


    MATERIALS AND METHODS
Data
Records of sperm characteristics of Spanish Holstein bulls collected from 1988 to 2002 in a Spanish AI center were considered for this study. Two sperm ejaculates per day of collection are obtained at weekly intervals from 12 to 30 mo of age within the routine semen collection scheme of the AI center. Between 30 and 54 mo of age, sperm collection is stopped until genetic evaluations of bulls are available. Sperm collection then resumes from 54 to 96 mo of age only in those bulls with high genetic merit for production and cow functional traits. In this study, only first ejaculates collected over the production period (from 12 to 30 mo of age) were considered.

Data were edited to include only ejaculates from bulls with at least 8 collections beginning from 12 to 19 mo of age. Records of volume <0.5 and >15 mL were discarded. Moreover, data pertaining to levels of month-year of collection (acting as contemporary group in all models) with fewer than 6 observations were not included in the analyses. This editing process eliminated 10% of animals in the original data set, but was considered necessary to ensure the accuracy of the estimated parameters and should not influence the comparison of the proposed models.
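The edits described above can be sketched in a few lines of Python. The record layout (dictionaries with the hypothetical keys `bull`, `volume_ml`, and `ym`) and the order in which the filters are applied are assumptions made for illustration; the text specifies neither, and the restriction on age at first collection (12 to 19 mo) is left out for brevity.

```python
from collections import Counter

def edit_records(records, min_per_bull=8, min_per_group=6,
                 min_volume=0.5, max_volume=15.0):
    """Apply the three data edits in one assumed order (illustrative only)."""
    # 1. Discard records with volume < 0.5 mL or > 15 mL.
    kept = [r for r in records if min_volume <= r["volume_ml"] <= max_volume]
    # 2. Drop contemporary groups (year-month) with fewer than 6 records.
    group_sizes = Counter(r["ym"] for r in kept)
    kept = [r for r in kept if group_sizes[r["ym"]] >= min_per_group]
    # 3. Keep only bulls that still have at least 8 collections.
    bull_sizes = Counter(r["bull"] for r in kept)
    return [r for r in kept if bull_sizes[r["bull"]] >= min_per_bull]
```

Because each filter can change the counts used by the next one, a different ordering could retain a slightly different set of records.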

Finally, after edits, data used in the statistical analyses comprised 8,773 records of VFE pertaining to 213 bulls. Table 1 shows data and pedigree characteristics as well as information about the distribution of data in levels of systematic factors used in the subsequent analyses. Average VFE was 5.63 mL; this value is within the range of average volume of ejaculate reported for Holstein bulls by Mathevon et al. (1998). The minimum, maximum, and average number of ejaculates per bull were 8, 92, and 57, respectively. The average period of semen production was 146.8 d. The pedigree file consisted of 1,201 animals; the pedigree included sire and dam information. Pedigree information was traced back to grandparents of the animals with records.


Table 1. Summary of data characteristics, pedigree structure for total number of animals in the pedigree file (Total), and for animals with records (Data), and distribution of records in levels of environmental factors used in the analysis
 
Models
The general equation for all models analyzed was as follows:


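$$y_{ijkl} = YM_i + YF_j + \sum_{m} \beta_{jm}\, PX_m(t) + \sum_{m} \alpha_{km}\, PZa_m(t) + \sum_{m} \omega_{km}\, PZp_m(t) + e_{ijkl} \qquad [1]$$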

where yijkl is the lth VFE of bull k, YMi (i = 1,...,177) is the effect of the year-month of collection, referred to as contemporary group, and YFj (j = 1,...,15) is year of first collection. Parameters βjm, used to fit the average trajectory, are linear regression coefficients nested within the jth level of the year of first collection effect, and αkm and ωkm are random genetic and permanent environmental regression coefficients for animal k, respectively. The terms PXm(t), PZam(t), and PZpm(t) are the corresponding mth Legendre polynomials evaluated at standardized time t. Two time variables, age at collection (AC) or days in production (DP), were considered. Finally, eijkl is the temporary measurement error.

Legendre polynomials were generated using the recursion formula (see, for example, Weisstein, 1999):


$$P_{j+1}(t) = \frac{(2j+1)\, t\, P_j(t) - j\, P_{j-1}(t)}{j+1}$$

with P0(t) = 1 and P1(t) = t.

Pj(t) was the polynomial of order j, and t was the standardized time variable in the interval [–1, 1]. The 2 time variables used were standardized to lie between –1 and +1 before evaluating the polynomials, where –1 represents d 365 for AC and d 1 for DP, and +1 represents d 1,000 for AC and d 648 for DP.
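As a minimal sketch, the recursion and the standardization of the time variable could be coded as follows. It uses the standard three-term recursion for unnormalized Legendre polynomials; whether the authors applied an additional normalization constant, as is common in animal breeding applications, is not stated in the text, so that choice is an assumption. The function names are hypothetical.

```python
def legendre(order, t):
    """Return [P_0(t), ..., P_order(t)] via the three-term recursion."""
    values = [1.0, t]
    for j in range(1, order):
        # (j + 1) P_{j+1}(t) = (2j + 1) t P_j(t) - j P_{j-1}(t)
        values.append(((2 * j + 1) * t * values[j] - j * values[j - 1]) / (j + 1))
    return values[: order + 1]

def standardize(x, x_min, x_max):
    """Map x in [x_min, x_max] linearly onto [-1, 1]."""
    return -1.0 + 2.0 * (x - x_min) / (x_max - x_min)

# Age at collection (AC) runs from d 365 to d 1,000 in the text.
t = standardize(682.5, 365, 1000)   # midpoint of the AC range maps to 0.0
covariates = legendre(3, t)         # P_0..P_3 evaluated at that age
```

The list returned by `legendre` is the row of covariates that multiplies the regression coefficients of one record in equation [1].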

Several models were fitted which differed in the definition of the time variable (AC and DP), the order of the polynomials, and the residual variance structure. Polynomials up to sixth order for systematic and random factors were used. Models were named as L(i,j,k) to indicate the order of the polynomial fitted for systematic (i), additive genetic (j), and permanent environmental (k) effects. Under this notation, repeatability models, which assume a constant genetic and permanent environmental value for repeated measures along time, were noted as L(·,0,0). Alternatively, homogeneous (HOM) or heterogeneous (HET) residual variances were considered in each model.

Analyzing these data with all models that can be generated from equation [1] by combining all possible polynomial orders (0 to 6) for the systematic, additive genetic, and permanent environmental factors, the 2 time variables (AC vs. DP), and the 2 residual variance patterns (HOM vs. HET) was not feasible, given the large number of possible cases (7 × 7 × 7 × 2 × 2 = 1,372). Therefore, a reduced number of models was first chosen within the context of repeatability models, based on the model comparison criteria. Repeatability models were viewed as the reference or baseline point because they were the simplest models to be fitted and they have been used to analyze semen characteristics in most of the previous studies. The comparison among repeatability models aimed at narrowing the choices for the polynomial degree to be fitted to the average production curves, the time variable to be considered, and the pattern of residual variances to be applied, before comparing models that fit random curves for genetic and permanent environmental components. Based on the results obtained from the analysis of repeatability models, a series of models including all possible combinations of the polynomial degrees fitted to the additive genetic and permanent environmental components was studied.
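The size of the full model space, and the way the number of dispersion parameters grows with polynomial order, can be checked with a short sketch. The helper name `n_covariance_parameters` is hypothetical; the count it returns is simply the number of distinct elements of a symmetric matrix of dimension order + 1.

```python
from itertools import product

orders = range(7)                      # polynomial orders 0..6
time_variables = ("AC", "DP")
residual_patterns = ("HOM", "HET")

# Every combination L(i,j,k) x time variable x residual pattern.
models = list(product(orders, orders, orders, time_variables, residual_patterns))
print(len(models))                     # 1372

def n_covariance_parameters(order):
    """Distinct (co)variances among the order + 1 regression coefficients."""
    k = order + 1
    return k * (k + 1) // 2

print(n_covariance_parameters(6))      # 28
```

A sixth-order random component therefore carries 28 (co)variance parameters, against a single variance in a repeatability model.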

Statistical Analyses
Inference on the parameters of interest (position and dispersion parameters) and model comparisons were carried out within a Bayesian context. The sampling distribution was


$$\mathbf{y} \mid \mathbf{b}, \boldsymbol{\alpha}, \boldsymbol{\omega}, \mathbf{R} \sim \mathrm{MVN}(\mathbf{X}\mathbf{b} + \mathbf{Z}_a\boldsymbol{\alpha} + \mathbf{Z}_p\boldsymbol{\omega},\ \mathbf{R})$$

where b, α, and ω are vectors including all systematic environmental factors, genetic, and permanent environmental regression coefficients, respectively; X, Za, and Zp are the corresponding incidence and covariate matrices; and R is the residual (co)variance matrix, assumed to be diagonal with homogeneous or heterogeneous residual variances, σ²ei, for i = 1,...,δ, with δ being the number of intervals across which the residual variance is allowed to differ. After several trials with larger numbers of intervals, δ was set to 6. The intervals of residual variances were finally defined by the series of points [365, 430, 520, 580, 640, 700, 1,000] for AC and [1, 125, 185, 305, 365, 485, 648] for DP.
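Assigning each record to one of the 6 residual-variance intervals is a simple lookup, sketched below with the cut points given in the text. The convention that each interval includes its upper end point is an assumption (the text does not state which end is closed), and the function name is hypothetical.

```python
from bisect import bisect_left

AC_POINTS = [365, 430, 520, 580, 640, 700, 1000]   # age at collection, d
DP_POINTS = [1, 125, 185, 305, 365, 485, 648]      # days in production

def residual_class(day, points):
    """Return the 0-based index of the residual-variance interval for `day`.

    Assumes each interval is closed at its upper end point.
    """
    if not points[0] <= day <= points[-1]:
        raise ValueError("day outside the studied period")
    # points[1:] holds the upper end point of each interval.
    return bisect_left(points[1:], day)

print(residual_class(431, AC_POINTS))   # 1
```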

Conjugate priors were used for systematic and dispersion parameters. For the systematic parameters, multivariate normal (MVN) distributions were assumed:


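$$\mathbf{b} \sim \mathrm{MVN}(\mathbf{0},\ \mathbf{I}\xi), \qquad \boldsymbol{\alpha} \mid \mathbf{A}, \boldsymbol{\Sigma}_a \sim \mathrm{MVN}(\mathbf{0},\ \mathbf{A} \otimes \boldsymbol{\Sigma}_a), \qquad \boldsymbol{\omega} \mid \boldsymbol{\Sigma}_p \sim \mathrm{MVN}(\mathbf{0},\ \mathbf{I} \otimes \boldsymbol{\Sigma}_p)$$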

where ξ is a positive scalar hyperparameter, large enough to give small weight to prior information (more precisely, ξ = 10^6); A is the relationship matrix; and Σa and Σp are the matrices that contain the (co)variances among the additive genetic and permanent environmental random regression coefficients, respectively.

For the dispersion parameters, scaled inverse χ² and scaled inverse Wishart (IW) distributions were used:


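$$\sigma^2_{e_i} \mid \nu_{e_i}, S^2_{e_i} \sim \nu_{e_i} S^2_{e_i}\, \chi^{-2}_{\nu_{e_i}}, \qquad \boldsymbol{\Sigma}_a \mid \nu_a, \mathbf{S}^2_a \sim \mathrm{IW}(\nu_a, \mathbf{S}^2_a), \qquad \boldsymbol{\Sigma}_p \mid \nu_p, \mathbf{S}^2_p \sim \mathrm{IW}(\nu_p, \mathbf{S}^2_p)$$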

A value of 4 was assigned to the degrees of belief, νei, for the χ^–2 distributions. Degrees of belief in the IW distributions, νa and νp, were 12, being always larger than the respective matrix dimensions to avoid degenerate forms (Gelman et al., 1995). Values for the scalars S²ei and the matrices S²a and S²p were obtained from previous REML estimates.

Marginal inferences on parameters of interest were drawn from their corresponding conditional posterior distributions through a Gibbs sampling scheme. The burn-in period was determined by convergence of coupled chains (Johnson, 1996; García-Cortés et al., 1997) and set to 10,000 iterations. Number of iterations run after the burn-in was 110,000, which yielded effective chain sizes of several thousands for the components of variance.

The additive genetic deviation for individual animals at time t and the additive and permanent environmental variances at time t were computed from the estimated solutions for genetic regression coefficients and the estimated (co)variance matrices as in Jamrozik and Schaeffer (1997). Genetic parameters (heritability and correlations) at different time points were obtained from the estimated (co)variances.
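The step from coefficient (co)variances to variances and genetic parameters at a given time can be sketched as a quadratic form in the Legendre covariates, in the spirit of the approach cited above. The matrices and residual variance below are hypothetical illustration values, not estimates from this study, and unnormalized Legendre polynomials are assumed.

```python
import math

def legendre(order, t):
    """Return [P_0(t), ..., P_order(t)] (unnormalized Legendre polynomials)."""
    values = [1.0, t]
    for j in range(1, order):
        values.append(((2 * j + 1) * t * values[j] - j * values[j - 1]) / (j + 1))
    return values[: order + 1]

def quadratic_form(u, K, v):
    """Compute u' K v for plain Python lists."""
    return sum(u[i] * K[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def variance_at(t, K):
    """Variance at standardized time t implied by coefficient (co)variances K."""
    phi = legendre(len(K) - 1, t)
    return quadratic_form(phi, K, phi)

def correlation_between(t1, t2, K):
    """Correlation between times t1 and t2 implied by K."""
    order = len(K) - 1
    cov = quadratic_form(legendre(order, t1), K, legendre(order, t2))
    return cov / math.sqrt(variance_at(t1, K) * variance_at(t2, K))

def heritability_and_repeatability(t, K_a, K_p, var_e):
    """Return (h2, repeatability) at standardized time t."""
    var_a = variance_at(t, K_a)
    var_p = variance_at(t, K_p)
    total = var_a + var_p + var_e
    return var_a / total, (var_a + var_p) / total

# Hypothetical first-order (co)variance matrices and residual variance.
K_a = [[0.4, 0.05], [0.05, 0.1]]
K_p = [[0.6, 0.0], [0.0, 0.2]]
h2, rep = heritability_and_repeatability(0.0, K_a, K_p, var_e=1.5)
print(round(h2, 2), round(rep, 2))   # 0.16 0.4
```

Under heterogeneous residual variances, `var_e` would be the residual variance of the interval that contains the time point.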

The log of the marginal density of the data (LMD) and the cross-validation predictive densities of the data (Gelfand et al., 1992) were computed to assess the performance of the alternative models. These criteria were chosen to have measures of different model characteristics, goodness of fit in the case of LMD, and predictive ability of missing observations in the case of the predictive density of the data (D).

The computation of the LMD was based on the estimator suggested by Newton and Raftery (1994). In the cross-validation, the checking function used to measure the discrepancy between the observed and predicted data was the difference between an observed value, yr, and the prediction of this value, Yr, obtained through the predictive density of the data excluding the observation of interest, y(r). The best model was identified as the one with the minimum $D = \frac{1}{n}\sum_{r=1}^{n} d_r^2$, with $d_r = E_{Y_r \mid \mathbf{y}_{(r)}}[y_r - Y_r]$, where n is the number of observations. An importance sampling scheme suggested by Gelfand et al. (1992) was implemented to evaluate dr. The joint posterior density of the parameters was used as the importance distribution. More details on the computing strategy followed can be found in López-Romero et al. (2003). The required computer program was developed in Fortran 90 and tested with simulated data.
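Both criteria can be sketched compactly once the Gibbs output is available. Two assumptions are made here: that the Newton and Raftery estimator meant is the harmonic mean of the sampled likelihoods (the text does not say which of their variants was used), and that D is the average over the n observations of the squared discrepancies. The cross-validation predictions are taken as given; the importance sampling step is not shown. Function names are hypothetical.

```python
import math

def log_marginal_density_harmonic(log_likelihoods):
    """Harmonic-mean estimate of log p(y) from per-draw log-likelihoods."""
    # log p(y) is approximated by -log(mean_s exp(-loglik_s)),
    # evaluated with the log-sum-exp trick to avoid overflow.
    neg = [-ll for ll in log_likelihoods]
    m = max(neg)
    log_mean = m + math.log(sum(math.exp(v - m) for v in neg) / len(neg))
    return -log_mean

def predictive_ability(observed, predicted):
    """D = (1/n) * sum of squared discrepancies between y_r and its prediction."""
    n = len(observed)
    return sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n
```

Larger LMD values indicate better fit, whereas smaller D values indicate better predictive ability.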


    RESULTS AND DISCUSSION
Figures 1a and 1b show the sample average and variance, respectively, of VFE records grouped by month of age and by months in production at collection. The average VFE showed a trend to increase with time measured as AC or as DP. The initial slope was steeper for AC due to a lower VFE for animals in the first month of age than for animals in the first month of production. This difference can be explained by the variability in ages of bulls in the beginning of the collection period. For both time variables, a curvilinear trend for the average VFE was observed. Sample variances also showed a trend to increase with time for AC, with a waving pattern at the end of the period, probably associated with the lower amount of data available. For DP, smaller changes in variability were observed with a trend to increase and to decrease afterwards, also presenting a waving pattern in the last period.


Figure 1. (a) Average and (b) phenotypic variance of volume of ejaculate grouped by month of age (□) and by month from first collection (■). Numbers of observations per month are shown in panel a.

 
For raw observations of individual animals, a very oscillating trajectory was found, showing no clear common trend along time. This is illustrated in Figure 2, in which data from 3 bulls are presented. These bulls were chosen among animals having information over almost the entire period studied. Animals representing different levels of production (low, intermediate, and high) that allowed for a clearer visualization of the individual patterns of production were chosen. Similar waving patterns were observed for other animals chosen at random (not shown). The highly oscillating pattern of raw measures for individual animals has been observed for other longitudinal traits such as milk production along lactation (see, for example, Olori et al., 1999).


Figure 2. Phenotypic records from 3 bulls with low (•), medium (▲), and high (■) values for volume of first ejaculate (VFE, mL).

 
Figures 3a and 3b show the values for the model comparison criteria, LMD and D, respectively, for repeatability models using AC and DP as time variables and under heterogeneous and homogeneous residual variances. Both criteria ranked models fitting no average trend [L(0,0,0)] or a linear trend [L(1,0,0)] as the worst ones. The goodness-of-fit criterion, LMD, showed little or no improvement when the polynomial degree increased over third order. On the other hand, the predictive ability criterion, D, showed increasingly worse results when the polynomial degree was larger than 3. For models with the same polynomial degree and treatment of residual variance, both criteria favored the use of AC as time variable over the use of DP. This was particularly so under heterogeneity of variances in the case of the LMD criterion. Treating residual variance as heterogeneous was clearly advantageous in terms of LMD but had little influence on the predictive ability, D.


Figure 3. a) Log of the marginal density (LMD) and b) cross-validation predictive ability (D) for repeatability models of increasing order of polynomial for the average trajectory under heterogeneous (solid) and homogeneous (dashed) residual variances, considering age at collection (■) and days in production (▲) as time variables. L(i,0,0) indicates the polynomial degrees fitted to the average trajectory (i = 0,...,6), additive genetic (0), and permanent environmental effects (0).

 
Estimates of residual variances followed the pattern of phenotypic variances shown in Figure 1. Residual variance estimates increased steadily with time for AC-HET models (from 1.17 to 2.57). Estimates for DP-HET models were larger than estimates for AC-HET models at the beginning of the period (1.53), peaked in the interval [305, 365] days (2.96), decreased to a minimum value in the next interval [366, 485] days (2.26) and increased in the last period (up to 2.47). The larger differences in residual variance estimates for AC vs. DP models were in agreement with the larger differences in performance of models AC-HOM vs. AC-HET compared with the observed differences between DP-HOM and DP-HET for the LMD criterion (see Figure 3a). Estimates of residual variances were larger for L(0,0,0) and L(1,0,0) models and very close for the rest of the models. Additive genetic and permanent environmental variance estimates for model L(0,0,0) (results not shown) tended to be around 20 and 40% larger, respectively, than estimates for models fitting an average trend. Thus, ignoring the average time trend not only resulted in the poorest fitting and worst predictive ability (as shown in Figure 3), but also seemed to result in an overestimation of the components of variances, particularly of the permanent environmental variance. For AC-HOM and DP-HOM models, heritability and repeatability estimates were 0.15 and 0.35 (AC) and 0.16 and 0.36 (DP), respectively. For AC-HET and DP-HET models, estimates of heritability ranged from 0.12 to 0.20 and 0.12 to 0.27, respectively, and estimates of repeatability ranged from 0.30 to 0.48 and 0.27 to 0.42, respectively. These figures are in agreement with estimates found in the literature under repeatability models and homogeneous residual variance. Mathevon et al. (1998) found heritabilities of 0.24 and 0.44 for VFE in young (up to 30 mo old) and mature (between 4 and 6 yr old) AI Holstein bulls, respectively.
These same authors reported a value of 0.45 for the repeatability of VFE for young bulls and provided range of estimates in the literature from 0.23 to 0.71 from repeatability models. Ducrocq and Humblot (1995) found a large heritability value of 0.65, but in this case the mean of measures of VFE was used as the dependent variable.

The results obtained for the repeatability models suggested that models using polynomials of order 2 or 3 for the average production curve, AC as the time variable, and heterogeneous residual variances should be used to further investigate the use of RRM for the additive genetic and permanent environmental components. Then, a series of AC-HET models including all possible combinations of the polynomial degrees fitted to the additive genetic and permanent environmental components were analyzed, fitting a second- and a third-order polynomial for the average trend. Results of the series fitting a second- or a third-order polynomial for the average trajectory were very similar. Figures 4a and 4b show results for the model comparison criteria for models fitting a third-order polynomial to the average trend. Series represent changes in the comparison criteria when the order for the permanent environmental polynomial increased for a given order of the additive genetic polynomial. All of the models presented in these figures showed improved goodness of fit (LMD) and predictive ability (D) over the repeatability models shown in Figures 3a and 3b. Values of LMD and D criteria for model L(3,0,0) were –15,325 and 2.07, respectively, vs. values of –15,173 and 2.03 obtained for the worst random regression model, L(3,2,2), shown in Figures 4a and 4b. This indicates that fitting random curves to the genetic and permanent environmental effects will improve the quality of the model, and therefore the quality of the prediction of the unknowns of interest.


Figure 4. a) Log of the marginal density (LMD) and b) cross-validation predictive ability (D) for series of models with a third-order polynomial fitted to the average trajectory and polynomials of increasing order for the additive genetic and permanent environmental effects, under heterogeneous residual variance and considering age at collection as time variable. L(3,j,k) indicates the polynomial degrees fitted to the average trajectory (3), additive genetic effect [j = 2(+), 3(▲), 4(■), 5(♦), 6(•)] and permanent environmental component (k = 2,...,6).

 
Both criteria, LMD and D, pointed at models fitting a low degree (2 or 3) for both random components as the worst models. For LMD, increasing the order of the polynomial for one component resulted in large improvements when the order of the polynomial for the other component was fixed to a low value (2 or 3). However, the improvement of increasing the order of one component was small when the other component was fixed to a high value. For example, the difference in LMD for models L(3,2,2) vs. L(3,2,6) was 101 units, whereas this difference decreased to 20 units for models L(3,6,2) vs. L(3,6,6). This would indicate that higher degree polynomials, which provide greater flexibility, are needed to obtain a good fit. Overall, for LMD, models with a fifth or sixth order for permanent environmental effect and fourth, fifth, or sixth order for the additive genetic effect provided the best fit. For the predictive ability criterion, D, models fitting a fifth or sixth degree polynomial for one component and a second or third degree for the other component were the best ones. However, models fitting polynomials of order 4 or 5 for one component and 5 or 6 for the other component showed a predictive ability quite close to the best models for the D criterion and were also among the best for LMD. Among all possible combinations of models shown in Figures 4a and 4b, those fitting a larger order for the permanent environmental effect than for the additive effect seemed to have a better performance for both criteria than models reversing the order of the polynomials [e.g., model L(3,4,6) showed more favorable LMD and D values than model L(3,6,4)].

Polynomials of the same order for both components are fitted in nearly all applications of RRM to longitudinal traits in animal breeding. Fitting the same order to all components allows for equal flexibility of both curves and avoids counterbalance effects. However, some studies (e.g., Pool and Meuwissen, 2000; López-Romero et al., 2003) dealing with milk production measured along lactation have found evidence of better performance of models fitting a lower order for the genetic component than for the permanent environmental effect. In fact, a lower order of polynomial for the genetic component would be biologically more sensible, given that larger orders of polynomials are likely to generate waving patterns, which might be unexpected shapes for the effect of the genotype over time. On the other hand, given that phenotypic measures for longitudinal data measured along time in animals are usually quite oscillatory, environmental effects linked to individual animals could be expected to waver and might require larger order polynomials. Larger order polynomials are more flexible and can accommodate a wider variety of shapes. In this study, models with low order for one component and large order for the other component provided similar values for the model comparison criteria. This was true regardless of which component bore the high or the low order. Counterbalance effects might then be present in the sense that the component with the larger polynomial order is absorbing extra variation along time not accounted for by the other component. On the other hand, as mentioned above, some evidence of better performance of models with larger orders for the permanent environmental component has been found.
From all these considerations, models fitting polynomials of order 4, 5, or 6 for both components and models L(3,4,5) and L(3,4,6), which showed close to optimal values for both comparison criteria, were further compared in terms of genetic value prediction and variance components estimation. These 2 models were also fitted using DP as the time variable and both homogeneous and heterogeneous residual variance, to check the consistency of results in terms of models of choice between the repeatability and RRM models (results not shown). A pattern very similar to the one shown in Figures 3a and 3b was observed in that case. Using AC instead of DP improved both criteria. Considering heterogeneity of residual variance also improved both criteria for the AC models.

Figure 5 shows the prediction of genetic merit at different ages of the 3 bulls shown in Figure 2. The curves shown in Figure 5 were obtained under the models selected from the results for the comparison criteria, considering AC as time variable and heterogeneous residual variance. Predicted genetic merits along time showed a reasonably smooth trajectory for models including polynomials of order 4 or 5 for the additive genetic merit. However, a largely oscillatory pattern for polynomials of sixth degree was found. Rapid and frequent changes in the genetic merit of quantitative traits are not biologically understandable; therefore, models fitting a sixth order polynomial would not be recommended. The 3 individuals showed different patterns of genetic merit along time, in terms of the average level of VFE and of the ability to maintain the production level. The bulls showing high and intermediate levels of VFE at the beginning of the period seemed to be less persistent than the bull that showed the lowest level of initial VFE. A variety of patterns was observed for other sampled animals (not shown). In this sense, AI centers are interested in bulls that produce the largest VFE in the shortest period of time to reduce the collection period and the associated costs. Therefore, the goal would be to select bulls with a high starting level of VFE and good persistency over time.


Figure 5. Predicted genetic merit (PGM) of volume of ejaculate along age for the 3 bulls in Figure 2 with low (below), medium (intermediate), and high (above) level of production under models with heterogeneous residual variance and polynomials L(3,4,4) (–■–), L(3,4,5) (---♦---), L(3,5,5) (–♦–), L(3,4,6) (---•---), and L(3,6,6) (–•–).

 
Table 2 shows the posterior means of the (co)variance components for the random regression coefficients and for the residual variances for the AC-HET L(3,6,6) model. Estimates for lower order polynomials (not shown) tended to yield (co)variance matrices quite similar to the corresponding submatrices of the one shown in this table. Estimates of the residual variance increased with time (from 1.04 up to 2.33), showing smaller values than the ones observed for the repeatability models.


Table 2. Posterior means of the (co)variances for additive genetic (AG) (above diagonal) and permanent environmental (PE) (below diagonal) random regression coefficients and of the residual variances (σ²e) under age at collection time variable (t) for a model fitting polynomials of orders 3, 6, and 6 for systematic, additive genetic, and permanent environmental effects, respectively
 
Eigenvalues and eigenvectors of the estimated matrices of variance components for the random regression coefficients in Table 2 were calculated to investigate the existence of dominant "patterns of inheritance" for the genetic component (Kirkpatrick et al., 1990) and the possibility of using reduced rank for both (co)variance matrices. This would be the case if only a few eigenvalues explained the majority of the total variability. Moreover, if the new variables obtained from the corresponding eigenfunctions had a biological interpretation, they could be used as uncorrelated selection criteria. Several studies have found this situation for milk production along lactation, in which only 2 eigenvalues, associated with a constant level of production and with a measure of persistency, are found to be significantly different from zero (e.g., Van der Werf et al., 1998). However, in this study, the percentage of total variability explained by each eigenvector ranged from 9 to 23 for both components in model L(3,6,6), indicating that all eigenvalues represent a substantial portion of the total variability. Moreover, the corresponding eigenfunctions did not seem to have any biological meaning. This means that reducing the number of variables to be considered or using the new functions as independent selection criteria does not seem feasible for these data.
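The share of total variability attached to each eigenvalue can be computed as sketched below. A cyclic Jacobi rotation scheme is used here only to keep the example free of external libraries; it is not claimed to be the method the authors used, and the 2 × 2 matrix is a hypothetical illustration rather than an estimate from Table 2.

```python
import math

def jacobi_eigenvalues(matrix, max_sweeps=50, tol=1e-22):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):          # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):          # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def percent_explained(matrix):
    """Percentage of total variance associated with each eigenvalue."""
    eigenvalues = jacobi_eigenvalues(matrix)
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]

# Hypothetical symmetric (co)variance matrix with eigenvalues 3 and 1.
shares = percent_explained([[2.0, 1.0], [1.0, 2.0]])
```

When a few leading shares add up to nearly 100%, a reduced-rank fit is plausible; shares spread as evenly as the 9 to 23% reported above argue against it.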

Figures 6a and 6b show estimates of genetic and permanent environmental variances obtained from the corresponding posterior means of the (co)variances among the random regression coefficients. Results for the AC-HET models previously chosen are shown. Estimates of variances at the extremes of the studied period were abnormally high for both components. The scale of the ordinate axis in Figures 6a and 6b was truncated to allow for a better visualization of the pattern of the variances along time for nonextreme periods. Extreme values not shown in the figures went up to 2.8 for the genetic component in model L(3,6,6) and up to 2.9 for the permanent environmental component for models L(3,4,6) and L(3,6,6). "End-of-range" problems (Meyer, 2005) have been found in many applications of RRM using high-order polynomials. Apart from the extreme values, estimates of the additive genetic variances tended to be slightly smaller than the estimated permanent environmental variances. In addition, the trajectories for the additive genetic variances were also flatter and smoother than the corresponding trajectories for the permanent environmental components. This would be expected given that the variability in the genetic merit is not likely to undergo abrupt changes with time, whereas the permanent environment variability is more likely to follow the changes observed in the phenotypic variability, which is not accounted for by the residual component of variance. Curves fitting polynomials of order 6 tended to peak at around d 800, which is in accordance with the trend of phenotypic variances to peak in an oscillatory period from d 755 to 935, approximately.
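The variance at a given age follows from the (co)variance matrix K of the random regression coefficients as φ(t)'Kφ(t), where φ(t) holds the Legendre covariates at age t. The Python sketch below illustrates this under stated assumptions: normalized Legendre polynomials, age standardized linearly to [-1, 1], and a hypothetical 2-coefficient identity matrix in place of the estimates in Table 2. The function names and the age limits (360 and 900 d) are illustrative only, and whether "order" in the text counts coefficients or polynomial degree is not assumed here.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(age, age_min, age_max, n_coef):
    """Normalized Legendre covariates for ages standardized to [-1, 1]."""
    age = np.asarray(age, dtype=float)
    x = -1.0 + 2.0 * (age - age_min) / (age_max - age_min)
    P = legendre.legvander(x, n_coef - 1)            # columns P_0 .. P_{n_coef-1}
    norm = np.sqrt((2.0 * np.arange(n_coef) + 1.0) / 2.0)
    return P * norm                                  # phi_k = sqrt((2k+1)/2) * P_k

def variance_trajectory(K, Phi):
    """Variance at each age: phi(t)' K phi(t) for every row phi(t) of Phi."""
    return np.einsum("ij,jk,ik->i", Phi, np.asarray(K, dtype=float), Phi)

# Hypothetical inputs, for illustration only.
ages = np.array([360.0, 630.0, 900.0])               # start, midpoint, end
Phi = legendre_basis(ages, age_min=360.0, age_max=900.0, n_coef=2)
K_demo = np.eye(2)                                   # NOT an estimate from Table 2
var_demo = variance_trajectory(K_demo, Phi)
print(var_demo)                                      # [2.0, 0.5, 2.0]
```

Even with this trivial matrix, the variance is largest at both ends of the age range, because the higher-order covariates reach their largest absolute values at the boundaries. This is one ingredient of the end-of-range behavior discussed above.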



 
Figure 6. a) Estimated genetic variance (EGV) and b) permanent environmental variance (EPV) along age at collection, under models with heterogeneous residual variance and polynomials L(3,4,4) (–■–), L(3,4,5) (---♦---), L(3,5,5) (–♦–), L(3,4,6) (---•---), and L(3,6,6) (–•–).

 
Heritability and repeatability estimates under the AC-HET selected models are shown in Figure 7. Border effects appear again for these parameters, as a consequence of the extreme values of the estimated additive genetic and permanent environmental variances. Abrupt changes observed for the parameter estimates presented in this figure correspond to changes in residual variance interval. Apart from the border effects, both parameter estimates tended to decrease with time, due to the increase in the estimated residual variability, which was not compensated by the increase of the genetic and permanent environmental components along time. Heritability and repeatability estimates, disregarding the extremes of the studied period, ranged from 0.15 to 0.35 and 0.20 to 0.50, respectively. Small differences among models were observed.
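The dependence of both parameters on the residual variance can be sketched as follows, assuming the usual definitions h² = σ²a / (σ²a + σ²pe + σ²e) and repeatability = (σ²a + σ²pe) / (σ²a + σ²pe + σ²e) evaluated age by age. The function name and the variance values are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from this study.

```python
import numpy as np

def h2_and_rep(var_a, var_pe, var_e):
    """Heritability and repeatability at each age from variance trajectories."""
    var_a = np.asarray(var_a, dtype=float)
    var_pe = np.asarray(var_pe, dtype=float)
    var_e = np.asarray(var_e, dtype=float)
    total = var_a + var_pe + var_e                   # phenotypic variance
    return var_a / total, (var_a + var_pe) / total

# Hypothetical values: constant genetic and permanent environmental variances,
# with a residual variance that steps up from one interval to the next.
var_a = [0.5, 0.5]
var_pe = [0.5, 0.5]
var_e = [1.0, 3.0]

h2, rep = h2_and_rep(var_a, var_pe, var_e)
print(h2)    # [0.25, 0.125]
print(rep)   # [0.5, 0.25]
```

Because the heterogeneous residual variance is constant within intervals, a step in it produces a step in both ratios, which is consistent with the abrupt changes noted for Figure 7.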



 
Figure 7. Estimated heritability (h², solid) and repeatability (rep, dashed) considering age at collection as time variable under models with heterogeneous residual variance and polynomials L(3,4,4) (■), L(3,4,5) (♦), L(3,5,5) (◇), L(3,4,6) (•), and L(3,6,6) (○).

 
Estimates of genetic and phenotypic correlations between selected days of AC for model L(3,4,6) are presented in Table 3. Estimated correlations decreased as the distance between measures increased. Within this pattern, estimated correlations involving ages at the extremes of the studied period were substantially lower and oscillated more than other correlations. Estimates of genetic correlations were larger than estimates of phenotypic correlations. Apart from the extremes, estimated additive genetic and phenotypic correlations between ages decreased steadily from 0.90 and 0.40 for measures 1 mo apart to 0.40 and 0.20 for the most distant measures, respectively. Fitting higher order polynomials for the additive genetic component [L(3,5,5), L(3,6,6), not shown] resulted in reduced correlations among measures and more oscillating values for the correlations involving extreme data.
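Correlations between ages can be derived from the same covariance functions. The Python sketch below assumes normalized Legendre covariates on age standardized to [-1, 1], independent residuals, and hypothetical 2-coefficient (co)variance matrices; none of the numbers come from Table 2 or Table 3, and all function names are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(x, n_coef):
    """Normalized Legendre covariates at standardized ages x in [-1, 1]."""
    P = legendre.legvander(np.asarray(x, dtype=float), n_coef - 1)
    return P * np.sqrt((2.0 * np.arange(n_coef) + 1.0) / 2.0)

def cov_to_corr(C):
    """Scale a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(C))
    return C / np.outer(sd, sd)

def genetic_and_phenotypic_corr(Phi_a, K_a, Phi_pe, K_pe, var_e):
    """Correlations between ages implied by the covariance functions."""
    G = Phi_a @ K_a @ Phi_a.T                        # genetic covariances
    PE = Phi_pe @ K_pe @ Phi_pe.T                    # permanent environmental
    P = G + PE + np.diag(var_e)                      # residuals on the diagonal only
    return cov_to_corr(G), cov_to_corr(P)

# Hypothetical inputs, for illustration only.
x = np.array([-0.5, 0.0, 0.5])                       # three standardized ages
Phi = legendre_basis(x, n_coef=2)
K_a = np.diag([1.0, 0.2])
K_pe = np.diag([1.0, 0.5])
var_e = np.array([1.0, 1.0, 1.0])

r_g, r_p = genetic_and_phenotypic_corr(Phi, K_a, Phi, K_pe, var_e)
print(np.round(r_g, 2))
print(np.round(r_p, 2))
```

In this toy setting the correlations fall as the ages move apart, and the phenotypic correlations are lower than the genetic ones because the residual variance enters only the diagonal of the phenotypic covariance matrix.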



 
Table 3. Phenotypic (below diagonal) and genetic (above diagonal) correlations between volume of ejaculate at selected days of age, estimated from a model fitting polynomials of orders 3, 4, and 6 for systematic, additive genetic, and permanent environmental effects, respectively
 
In summary, the use of Legendre polynomials of high order seems to be required to obtain a good fit of these data. However, as in many other studies dealing with RRM with polynomial functions, odd behaviors of the estimated variances and genetic merits at the edges and "wiggly" patterns have been observed when polynomials of high order were used. Polynomials have built-in constraints (a fixed maximum number of intercepts, extreme and inflection points determined by the order of the polynomial, and increases or decreases without bound at the ends of the data domain), which may be responsible for the observed results. Moreover, the distribution of observations along time was not uniform, showing a smaller proportion of observations at the extremes, particularly for the last part of the studied period (only 10% of the observations were collected in the last quarter of the period). This might also be responsible for the odd behavior of the polynomial functions in this study.

Apart from the problems associated with the use of polynomial functions, the highly oscillating nature of the phenotypic observations (see Figure 2) might result in predicted patterns for genetic and environmental components trying to depict the same kind of pattern. This may be the case when the models considered are not removing the environmental effects causing the abrupt changes in the response variable along time. An oscillatory genetic and permanent environmental pattern would also explain the low estimated correlations along time, but it is not biologically plausible. Nobre et al. (2003) recommend checking the consistency between the estimates under RRM and MTM. In our case, this comparison is also intended to help determine whether the oscillatory behavior of the estimated trajectories under the RRM is due only to the polynomial effect or to an incomplete adjustment of environmental effects. To do so, 10 traits including VFE in periods of 60 d were arbitrarily defined for AC and analyzed under a repeatability MTM. The variance component package VCE4.0 (Groeneveld and García-Cortés, 1998) was used for this analysis. Results from the MTM analysis are summarized in Table 4. Estimated heritabilities tended to be larger for the MTM than for the RRM. Heritability estimates ranged from 0.23 for measures collected between 545 and 725 d of age to 0.36 for measures collected in the fourth and eighth periods. On the other hand, repeatability estimates were within the range of RRM estimates but showed a very oscillating pattern. Residual variance estimates agreed with the estimates obtained in the RRM. Moreover, residual correlation estimates (not shown) were below 0.02, indicating that assuming independent residuals as in the proposed RRM might be realistic. Genetic and phenotypic correlation estimates were larger but described a less smooth trajectory than the estimated values under the RRM. Estimates of the genetic correlations were particularly high, ranging from 0.65 to 0.99. Large genetic correlations among measures in the studied period might be expected given that animals are measured within a relatively short period of time compared with the total lifetime. However, the permanent environmental correlations (not shown) were lower than the ones obtained in the RRM. Edge effects were not present in the MTM estimates. The relatively more sensible results of the MTM (in terms of no edge effects and larger genetic correlations) might be due to the fact that several measures are joined in the same trait, which would be somewhat equivalent to "averaging" observations, accommodating the large oscillations observed between adjacent measures. However, the unsmooth patterns observed for the MTM indicate that the environmental correction might not be sufficient to adjust for the effects that cause the oscillations in phenotypic measures. Nevertheless, the nature of the environmental source of variation may be difficult to identify if it is related to specific sensitivity of individuals to stress factors that occur during the collection period.
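The definition of the MTM traits as consecutive 60-d age classes can be sketched as below. This is a hypothetical illustration only: the first day of the first period (`start_day`) is not stated in the text, so the value used in the example (365 d) is an assumption, as are the function name and the sample ages.

```python
import numpy as np

def assign_period(age_days, start_day, period_length=60, n_periods=10):
    """Map age at collection (d) to a trait number 1..n_periods.

    Ages outside the covered range are coded 0 (record not used).
    """
    age = np.asarray(age_days, dtype=float)
    idx = np.floor((age - start_day) / period_length).astype(int)
    valid = (idx >= 0) & (idx < n_periods)
    return np.where(valid, idx + 1, 0)

# Hypothetical ages; start_day=365 is an assumption, not a value from the paper.
ages = [365, 424, 425, 964, 965, 300]
traits = assign_period(ages, start_day=365)
print(traits)   # [1, 1, 2, 10, 0, 0]
```

Records falling in the same class are treated as repeated measures of one trait, which is the "averaging" effect mentioned above.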



 
Table 4. Estimated heritabilities (h²), repeatabilities (Rep), residual variances (ERV), and phenotypic (below diagonal) and genetic (above diagonal) correlations for volume of ejaculate at selected days of age, estimated from a multitrait repeatability animal model
 

    CONCLUSIONS
 
Random regression models with Legendre polynomials are useful for analyzing semen characteristics because they provide information on both the level of production and its changes along time in a continuous manner. The genetic potential to produce semen changed with time, exhibiting a variety of patterns among individuals. Volume of first ejaculate showed a moderate heritability (around 0.20, ignoring edges), indicating that selection for this trait may be effective at any stage. Age at collection seems to be more adequate than days in production to describe changes in VFE along time. Legendre polynomials of order 4 or 5 for the additive genetic component and 5 or 6 for the permanent environmental effect yielded the best fit and predictive ability, and smooth trajectories for the predicted genetic values. However, care has to be taken when interpreting results at the extremes of the period. Further investigation of the environmental factors that might be causing the largely oscillating observations is needed.


    ACKNOWLEDGEMENTS
 
The authors thank the artificial insemination centers ABEREKIN S.A. (Derio, Spain) and CONAFE (Valdemoro, Spain) for providing performance data of Holstein bulls and pedigree records, respectively.

Received for publication February 15, 2006. Accepted for publication August 27, 2006.


    REFERENCES
 


Ducrocq, V., and P. Humblot. 1995. Genetic characteristics and evolution of semen production of young Normande bulls. Livest. Prod. Sci. 41:1–10.

García-Cortés, A., M. Rico, and E. Groeneveld. 1997. Using coupling with the Gibbs Sampler to assess convergence in animal models. J. Anim. Sci. 76:441–447.

Gelfand, A. E., D. K. Dey, and H. Chang. 1992. Model determination using predictive distributions with implementation via sampling-based methods. Pages 147–167 in Bayesian Statistics 4. J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, ed. Oxford University Press, London, UK.

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 1995. Bayesian data analysis. Chapman & Hall, London, UK.

Groeneveld, E., and A. García-Cortés. 1998. VCE4.0, A (co)variance component package for frequentists and Bayesians. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, Australia. 27:455–456.

Henderson, C. R., Jr. 1982. Analysis of covariance in the mixed model: Higher-level non-homogeneous, and random regressions. Biometrics 38:623–640.

Jamrozik, J., and L. R. Schaeffer. 1997. Estimation of genetic parameters for a test day model with random regressions for yield traits of first lactation Holsteins. J. Dairy Sci. 80:762–770.

Johnson, V. E. 1996. Studying convergence of Markov chain Monte Carlo algorithms using coupled sample paths. J. Am. Stat. Assoc. 91:154–166.

Kirkpatrick, M., D. Lofsvold, and M. Bulmer. 1990. Analysis of the inheritance, selection, and evolution of growth trajectories. Genetics 124:979–993.

López-Romero, P., R. Rekaya, and M. J. Carabaño. 2003. Assessment of homogeneity vs. heterogeneity of residual variance in random regression test-day models in a Bayesian analysis. J. Dairy Sci. 86:3374–3385.

Mathevon, M., M. M. Buhr, and J. M. C. Dekkers. 1998. Environmental, management, and genetic factors affecting semen production in Holstein bulls. J. Dairy Sci. 81:3321–3330.

Meyer, K. 2005. Random regression analyses using B-splines to model growth of Australian Angus cattle. Genet. Sel. Evol. 37:473–500.

Newton, M. A., and A. E. Raftery. 1994. Approximate Bayesian inference with the weighted likelihood bootstrap. J. R. Stat. Soc. Bull. 56:3–48.

Nobre, P. R. C., I. Misztal, S. Tsuruta, J. K. Bertrand, L. O. C. Silva, and P. S. Lopes. 2003. Analyses of growth curves of Nellore cattle by multiple-trait and random regression models. J. Anim. Sci. 81:918–926.

Olori, V. E., S. Brotherstone, W. G. Hill, and B. J. McGuirk. 1999. Fit of standard models of the lactation curve to weekly records of milk production of cows in a single herd. Livest. Prod. Sci. 58:55–63.

Pool, M. J., and T. H. E. Meuwissen. 2000. Reduction of the number of parameters needed for a polynomial random regression test day model. Livest. Prod. Sci. 64:133–145.

Schaeffer, L. R. 2004. Application of random regression models in animal breeding. Livest. Prod. Sci. 86:35–45.

Schaeffer, L. R., and J. C. M. Dekkers. 1994. Random regressions in animal models for test-day production in dairy cattle. Proc. 5th World Congr. Genet. Appl. Livest. Prod. 18:443–446.

Van der Werf, J. H. J., M. E. Goddard, and K. Meyer. 1998. The use of covariance functions and random regressions for genetic evaluation of milk production based on test day records. J. Dairy Sci. 81:3300–3308.

Weisstein, E. W. 1999. Legendre Polynomial. From Math World–A Wolfram Web Resource. http://mathworld.wolfram.com/LegendrePolynomial.html Accessed Nov. 2, 2006.


