Department of Animal Sciences, The Ohio State University, Columbus 43210
3 E-mail:st-pierre.8{at}osu.edu
| ABSTRACT |
|---|
Key Words: pen studies, experimental unit, randomization, causal inference
| INTRODUCTION |
|---|
Simultaneously, there appear to be greater opportunities to conduct research in large commercial herds where cows are invariably grouped in medium to large size pens. The grouping of cows during the conduct of an experiment has a significant effect on how data should be analyzed to reach valid scientific conclusions (Gill, 1987, 1989). The debate regarding proper statistical models and methods of data analyses to be used with pen studies has been marred by a lack of understanding of a few fundamental statistical concepts such as experimental units, degrees of freedom, and randomization. The frequent but erroneous notion that the smallest unit upon which a measurement is made can serve as the identifier of the experimental unit has infiltrated the discussion, although there is no theoretical basis in the statistical literature to support this position. In this paper, the statistical concepts underlying pen studies are first introduced using an intuitive approach (i.e., using examples), followed by a discussion on the fundamental statistical issues related to the analyses of such studies, including causal inference range. A few useful statistical designs that can be used with pen studies are presented, including in each case the programming statements to be used with SAS software (SAS Institute, 2004), which is the predominant software used for data analyses in papers published by the Journal of Dairy Science.
| INTUITIVE APPROACH |
|---|
Experiment 1.
The objective of the first experiment is to determine the efficacy of a new antibiotic in dairy cattle. The treatment structure is a simple one-way structure with 2 levels: a placebo and an antibiotic injection. Forty cows in 4 separate pens of 10 cows each are used to test the hypothesis. Cows are first assigned at random to each of the 4 pens. Within each pen, 5 cows are assigned at random to the placebo injection, and 5 cows are assigned to the antibiotic injection. The general structure of this experiment is shown in Figure 1. Cows within the same pen have something more in common than cows across different pens. For example, they share the same microclimate, they are milked at the same time, they are simultaneously restrained for pregnancy check, and so on. Theoretically, there is a near-infinite list of known and unknown factors that contribute to cows within pens having more in common than cows across pens. This commonality must be accounted for during the statistical analysis. One should recognize that the experimental design used is a randomized complete block design (RCBD) with subsampling, with pens acting as blocking factors (Damon and Harvey, 1987). The statistical model for observed responses is
$$y_{ijk} = \mu + \alpha_i + p_j + \alpha p_{ij} + \varepsilon_{ijk} \qquad [1]$$

where $y_{ijk}$ are the observed values, $\mu$ is the overall mean, $\alpha_i$ denotes the fixed effect of the ith injection treatment, $p_j$ is the random block effect associated with the jth pen, where the block effects are assumed to be independently and identically distributed (iid) $N(0, \sigma^2_p)$, $\alpha p_{ij}$ is the random interaction effect associated with the jth pen and the ith injection treatment, assumed iid $N(0, \sigma^2_{\alpha p})$, and $\varepsilon_{ijk}$ denotes the random error, which is assumed to be iid $N(0, \sigma^2_{\varepsilon})$.
Certainly, the interest is in making inference beyond the narrow range of the 4 pens used in the experiment. Pens are thus considered as a random sample from a large population of pens (i.e., levels of pen effects come from a probability distribution). Even though the variance components in this model would generally be estimated using (restricted) maximum likelihood, inference on treatment effects is generally based on classical ANOVA. It is thus useful to sketch an ANOVA table, calculate the degrees of freedom (df), and identify the correct error terms for each effect to be tested (Table 1). In [1], the usual notation for the residual error was used. In all statistical analyses, this residual error term truly has an identity. In this first experiment, the residual error is really the nested effect of Cow (Pen x Treat) (Table 1). Each column of cows in Figure 1 represents a level of this nested factor. There are 5 cows in each column; hence, each column has 4 df. There are 8 subclasses of Pen x Treatment. Thus, there are 8 x 4 = 32 df for Cow (Pen x Treat), a value that corresponds to the df for the error reported by statistical software (e.g., PROC MIXED; SAS Institute, 2004). The error term for Treatment, however, is not the residual error; rather, it is the interaction of Pen x Treatment (St-Pierre and Jones, 1999). The Pen x Treatment effect is not part of the design structure but is an essential part of the error structure.
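The df bookkeeping described above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `rcbd_subsampling_df` and its argument names are hypothetical, and it assumes a balanced layout (the same number of cows in every Pen x Treatment subclass), as in experiment 1.

```python
def rcbd_subsampling_df(n_pens, n_treatments, cows_per_cell):
    """Degrees of freedom for a balanced RCBD with subsampling (pens as blocks)."""
    df = {
        "Pen": n_pens - 1,
        "Treatment": n_treatments - 1,
        # Error term for Treatment
        "Pen x Treatment": (n_pens - 1) * (n_treatments - 1),
        # Residual: cows nested within each Pen x Treatment subclass
        "Cow(Pen x Treatment)": n_pens * n_treatments * (cows_per_cell - 1),
    }
    df["Total"] = n_pens * n_treatments * cows_per_cell - 1
    return df


# Experiment 1: 4 pens, 2 injection treatments, 5 cows per Pen x Treatment subclass
exp1 = rcbd_subsampling_df(n_pens=4, n_treatments=2, cows_per_cell=5)
for source, value in exp1.items():
    print(f"{source:22s} {value:3d}")
```

The residual line reproduces the 8 x 4 = 32 df discussed in the text, whereas the Pen x Treatment term used to test Treatment carries only 3 df.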
$$y_{ijkl} = \mu + \gamma_k + p_{j:k} + \alpha_i + \alpha\gamma_{ik} + \alpha p_{ij:k} + \varepsilon_{ijkl} \qquad [2]$$

where $\gamma_k$ denotes the fixed effect of the kth cooling treatment, $p_{j:k}$ is the random effect associated with the jth pen nested within the kth cooling treatment, assumed iid $N(0, \sigma^2_p)$, $\alpha_i$ denotes the fixed effect of the ith injection treatment, $\alpha\gamma_{ik}$ is the fixed effect of the interaction between the kth cooling treatment and the ith injection treatment, and $\alpha p_{ij:k}$ is the random interaction effect associated with the jth pen within the kth cooling treatment and the ith injection treatment, assumed iid $N(0, \sigma^2_{\alpha p})$.
The corresponding ANOVA is presented in Table 2. In essence, a split plot is a repeated measurement design in space (the pens are repeatedly measured in space) with a compound symmetry covariance of errors to account for the fact that cows within a pen have more in common than cows across pens (i.e., they are correlated). The split plot is the result of a merging of 2 experiments, each with its own design. In this instance, the main-plot experiment consists of 4 experimental units (the pens) assigned at random to 1 of 2 cooling treatments. The statistical design for the main plot is thus a completely randomized design. The subplot design, as explained previously, is an RCBD with subsampling, with cows nested within pens as the experimental units.

As shown in Table 2, the error term for testing the cooling treatment is Pen (Cooling). The error term for testing the injection treatment and its interaction with the cooling treatment is Injection x Pen (Cooling). The terms Pen (Cooling) and Injection x Pen (Cooling) represent the interactions between elements of the design structure and the treatment structure, which is why they are the correct error terms to be used. They form what Milliken and Johnson (1992) called the error structure of the experimental design. Some statisticians would argue that elements of the error structure can be pooled with the residual error. Yandell (1997) has explained the problems with pooling of effects. In essence, either there is too much risk in pooling (i.e., the chance of a type II error is large) or there is no benefit to it (i.e., it has little effect on the tests). Milliken and Johnson (2002) pointed out that blocking imposes a restriction on the randomization, which is something that cannot be ignored during the analysis. Because randomization was done between and within pens, the term Pen (Cooling) is an explicit component of the design structure. Thus, Pen (Cooling) should never be pooled with the residual error term. Most statisticians would also object to the pooling of Injection x Pen (Cooling) with the error term because of the potential for large type II errors in the pooling decision, and inflated (i.e., incorrect) type I error when testing the effect of Injection x Cooling.
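The tests implied by this error structure can be written out explicitly. The following is a sketch that assumes the balanced layout described in the text (4 pens, 2 per cooling treatment; 2 injection treatments; 5 cows per pen and injection treatment); the df are derived from that layout rather than copied from Table 2, and MS denotes a mean square.

```latex
\begin{aligned}
F_{\text{Cooling}} &= \frac{MS_{\text{Cooling}}}{MS_{\text{Pen(Cooling)}}} \sim F_{1,\,2} \\[4pt]
F_{\text{Injection}} &= \frac{MS_{\text{Injection}}}{MS_{\text{Injection} \times \text{Pen(Cooling)}}} \sim F_{1,\,2} \\[4pt]
F_{\text{Injection} \times \text{Cooling}} &= \frac{MS_{\text{Injection} \times \text{Cooling}}}{MS_{\text{Injection} \times \text{Pen(Cooling)}}} \sim F_{1,\,2}
\end{aligned}
```

Each distribution holds under the corresponding null hypothesis. Under these assumptions the residual, Cow (Injection x Pen (Cooling)), has 8 x 4 = 32 df, so pooling the error structure into the residual would replace 2 denominator df with a much larger number.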
Experiment 3.
This experiment is identical to experiment 2, and it is performed as such. At the conclusion of the experiment, though, the researchers realize that all the syringes contained the placebo; none contained the new antibiotic. The treatment design is thus a 2 x 1 factorial, with 2 levels of cooling and 1 level of injection. The experimental design is still a split plot (the randomization process has not been modified from experiment 2) with 2 levels of cooling as the first main factor applied to the main plot and 1 level of injection as the second factor applied to the subplot. In this situation, the design is better known as a nested design in the statistical literature. The statistical model is thus
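$$y_{ijkl} = \mu + \gamma_k + p_{j:k} + \alpha_i + \alpha\gamma_{ik} + \alpha p_{ij:k} + \varepsilon_{ijkl} \qquad [3]$$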
This model is identical to [2] except for the levels in the subscripts. The structure of the experiment is shown in Figure 3. Because there is only 1 level of the injection treatment applied to the subplots, there are now 10 cows nested in each of the pens as opposed to the 5 cows nested within each pen and injection treatment in experiment 2. The resulting ANOVA table is presented in Table 3. Because there is only 1 level of the injection treatment, there are zero df associated with Injection and all interaction effects of which it is a component. Thus, the terms Injection, Injection x Cooling, and Injection x Pen (Cooling) are removed from the model because they are not identifiable (i.e., 0 df). The Pen (Cooling) is still the correct error term for testing the effect of Cooling. The significance of Pen (Cooling) can be assessed using the residual error, which is really Cow (Cooling x Pen). However, the Pen (Cooling) term cannot be pooled with the residual error because it is an explicit element of the design structure. The analysis must follow the randomization. Intuitively, it should be apparent that the test for the cooling effect (the main plot factor) should not be changed because the subplots contained only 1 level of injection as opposed to 2. Randomization has not changed.
| ISSUES WITH PEN STUDIES |
|---|
Randomization.
Randomization plays a critical role in controlled experiments. As stated by Fisher (1935), randomization forms the "reasoned basis for inference" in experiments, and the same inference would not be justified from identical data obtained in nonrandomized studies. Randomization is a concept that often has been misunderstood and, thus, deserves further clarification.
A process is random if it is without definite method or purpose; if it is unsystematic (van Belle, 2002). Randomization is important for 3 reasons (van Belle, 2002). First, randomization turns uncontrolled systematic effects into errors. Second, there is an expected balance (expected is taken in its mathematical sense) in the assignment of known and unknown factors that might influence the outcome. Perhaps more importantly, randomization provides the foundational basis for statistical procedures such as tests of significance. For example, in the following GLM
$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad [4]$$

where $\mathbf{Y}$ is a vector of observations, $\mathbf{X}$ is the design matrix, $\boldsymbol{\beta}$ are population parameters to be estimated, and $\boldsymbol{\varepsilon}$ is a vector of errors, standard tests on the $\boldsymbol{\beta}$ estimates require either that the elements of $\boldsymbol{\varepsilon}$ are independently distributed, or that their dependency be accounted for in the error structure of the model through modeling of the covariance matrix of errors. In either case, randomization is essential for having accurate probability assessments (van Belle, 2002; Rubin, 2005).

A simple example illustrating the critical importance of either independence or knowledge about the dependency can be easily constructed using 2 dice. The probability of getting a five on a single roll of a fair die is 1/6. If a second independent die is rolled, the probability of getting a five on both dice is simply 1/6 x 1/6 = 1/36. This calculation rests entirely on the assumption that the 2 dice are independent. If, however, the 2 dice are glued together, the probability of getting 2 fives is no longer 1/36, but depends on how the dice were glued. If 1 of the 4 rectangular faces made by the 2 glued dice has 2 fives, the probability of getting 2 fives on a roll is 1/4. However, this probability is zero if none of the 4 faces has a double five. Thus, the structure of the dependence must be known to calculate the correct probability. It must also be noted that the probabilities under the lack of independence (P = 0.25 or P = 0.0) are both considerably different from the probability under independence (P = 0.028). Using the latter would be erroneous regardless of the true dependence structure.
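The dice example can be verified by enumeration. The sketch below assumes that the glued block always comes to rest on 1 of its 4 rectangular faces with equal probability; the two face layouts (`fives_aligned`, `fives_offset`) are hypothetical gluings chosen for illustration, and the function name is not from the original.

```python
from fractions import Fraction
from itertools import product

# Two independent fair dice: enumerate all 36 equally likely outcomes.
rolls = list(product(range(1, 7), repeat=2))
p_independent = Fraction(sum(1 for roll in rolls if roll == (5, 5)), len(rolls))


def p_double_five_glued(rectangular_faces):
    """Probability of showing two fives when the glued pair lands on one of
    its rectangular faces, each assumed equally likely."""
    hits = sum(1 for face in rectangular_faces if face == (5, 5))
    return Fraction(hits, len(rectangular_faces))


# Hypothetical gluings: (value on die A, value on die B) for each rectangular face.
fives_aligned = [(1, 1), (2, 2), (6, 6), (5, 5)]  # the two fives share a face
fives_offset = [(1, 2), (2, 6), (6, 5), (5, 1)]   # the two fives never share a face

p_aligned = p_double_five_glued(fives_aligned)
p_offset = p_double_five_glued(fives_offset)
print(p_independent, p_aligned, p_offset)
```

Under these assumptions the three probabilities are 1/36, 1/4, and 0, matching P = 0.028, P = 0.25, and P = 0.0 in the text.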
Randomization requires considerably more than simply assigning units to treatments in a random fashion. This is a necessary condition to randomization, but it is not sufficient. All uncontrolled factors must be randomized. For example, simply assigning 20 cows to a control and 20 cows to a treatment does not ensure independent errors unless all other factors are randomized across all 40 cows. This would not be the case if the control cows were housed in one state and the treatment cows in another state, or if the control cows were all housed on the south side of the barn whereas the treatment cows were all housed on the north side, or if the control cows were all in the same pens whereas the treatment cows were all in other pens. The importance of randomization was well worded by Sir Ronald A. Fisher when he stated that "Designing an experiment is like gambling with the devil: only a random strategy can defeat all his gambling systems" (Box et al., 2005).
Inaccurate Probability of Treatment Effects
This problem is essentially one of inflated type I error, a consequence of pseudoreplication and incorrect df for the test statistic. The term pseudoreplication, first used by Hurlbert (1984), is defined as a lack of statistical independence among measurements. It results in an assignment of treatment effects with an error term inappropriate to the hypothesis being tested (van Belle, 2002). Pseudoreplication can be conceptualized intuitively. For example, one can easily understand that measuring hourly ozone levels for 24 h is not the same as measuring ozone levels 1 h per day for 24 d, or measuring 1-h ozone levels at 24 locations.
The concept of df is central to statistical inference theory, yet it is seldom defined in statistical textbooks. Standard statistical tests estimate the probability of all outcomes more extreme than that actually observed in an experiment to occur under the assumption that the null hypothesis of no treatment effect is true. Because many parameters are simultaneously estimated from a given set of data, the df represent the number of independent pieces of information available for a given estimate, or a given test. It expresses the ability of the system to wiggle in a multidimensional hyperspace. In a planar, 2-dimensional world, a 2-legged stool has no df because it can always be perfectly set on any line in the plane, regardless of the shape of the line. Likewise, a 3-legged stool has no df in a 3-dimensional world; the 3 legs can perfectly rest on any 3-dimensional surface. Expanding the analogy, counting df consists of counting the number of dimensions in the data and the number of legs in the model. Using cows as the experimental units in a pen study overestimates the number of dimensions, or the ability of the system to wiggle in the hyperspace. In such instances, the df of the error are grossly overestimated, with the consequence that the reported P-values severely understate the true type I error rate.
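The inflation of the type I error can be illustrated with a small simulation. The sketch below is not from the original article: it assumes 2 treatments with no true difference, 2 pens per treatment, 10 cows per pen, and equal pen and cow standard deviations (all hypothetical values), and it compares a t-test that treats cows as experimental units (38 df) with one that uses pen means (2 df). The hard-coded critical values are the two-sided 5% points of Student's t for exactly those df.

```python
import random
import statistics

PENS_PER_TRT = 2
COWS_PER_PEN = 10
T_CRIT_COW = 2.024  # two-sided 5% point of t with 2 * 20 - 2 = 38 df
T_CRIT_PEN = 4.303  # two-sided 5% point of t with 2 * 2 - 2 = 2 df


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


def simulate_type_i_error(n_sims=2000, sd_pen=1.0, sd_cow=1.0, seed=1):
    """Rejection rates under the null for cow-level and pen-level analyses."""
    rng = random.Random(seed)
    reject_cow = reject_pen = 0
    for _ in range(n_sims):
        groups = []
        for _trt in range(2):
            pens = []
            for _pen in range(PENS_PER_TRT):
                pen_effect = rng.gauss(0.0, sd_pen)  # shared by all cows in the pen
                pens.append([pen_effect + rng.gauss(0.0, sd_cow)
                             for _cow in range(COWS_PER_PEN)])
            groups.append(pens)
        # Pseudoreplicated analysis: every cow counted as an independent unit.
        cows = [[y for pen in pens for y in pen] for pens in groups]
        reject_cow += abs(pooled_t(cows[0], cows[1])) > T_CRIT_COW
        # Correct analysis: pen means are the experimental units.
        means = [[statistics.fmean(pen) for pen in pens] for pens in groups]
        reject_pen += abs(pooled_t(means[0], means[1])) > T_CRIT_PEN
    return reject_cow / n_sims, reject_pen / n_sims


cow_rate, pen_rate = simulate_type_i_error()
print(f"cow-level rejection rate: {cow_rate:.3f}")
print(f"pen-level rejection rate: {pen_rate:.3f}")
```

Under these assumptions the pen-level test should reject close to the nominal 5% of the time, whereas the cow-level test rejects far more often even though no treatment effect exists.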
Problems of Causal Inferences
It is one thing to say that a group of cows had performance that differed from that of another group; however, it is an entirely different matter to assign a cause to that difference. The issue of causality has long been a controversial issue in statistics, leading at times to heated exchanges such as those among Fisher, Pearson, and Neyman (Rubin, 2006). Recently, causality was brought into a coherent theory for both experimental and observational studies (Cochran, 1968; Rubin, 1974; Holland, 1986; Reiter, 2000). The seminal work of Donald Rubin at Harvard University is particularly noteworthy and will be emphasized in this exposé.
The Rubin causal model (RCM) perspective for statistical inference for causal effects is founded on 5 primitives: unit, treatment, potential outcomes, causal effects, and fundamental problem of inference (Rubin, 2005). A unit is defined as a person, place, or thing upon which a treatment operates at a particular time. A treatment is defined as an intervention, the effects of which, on some particular measurements of the units, the investigator wishes to assess relative to no intervention (i.e., the "control"). For simplicity, our discussion will focus entirely on the simple case of a treatment vs. control situation, understanding that the theory and method do extend to cases of multiple treatments. Potential outcomes are then defined as the values of a unit's measurement after a) application of the treatment and b) nonapplication of the treatment (i.e., under control). The causal effect is then simply for each unit the comparison of the potential outcome under treatment and the potential outcome under control. This development leads to the fundamental problem of inference: we can observe at most one of the potential outcomes for each unit. Resolving this fundamental problem in a statistical sense requires a) replications, b) an assumption regarding the unit-treatment value, and c) an assignment mechanism. Replication implies that at least 1 unit receives the treatment and at least 1 unit receives the control (Rubin, 2005). The stable unit-treatment value assumption (SUTVA) has 2 parts: a) there is only 1 form of the treatment and 1 form of the control, and b) there is no interference among units. The assignment mechanism is simply defined as the process for deciding which units receive the treatment and which receive the control. Under SUTVA and known assignment mechanisms, the unit-level causal effect can never be observed, but it can be estimated because we have replication. The assignment mechanism determines which potential outcome we will observe for each unit.
The assignment mechanism is critical even if SUTVA holds. It is essential to know or be able to infer a rule for how each unit received either the treatment or the control. A stochastic unconfounded assignment mechanism is one in which the assignment of treatment or control for all units is independent of all potential outcomes, observed and unobserved, and one in which the assignment is probabilistic. In essence, the assignment mechanism allows us to use the observations from the units assigned to the control as proxies for the unobservable potential outcomes under nonapplication of the treatment of the units assigned to the treatment. This is statistically tenable if the propensity score, defined as the probability of a unit to be assigned to treatment, is the same for the units in the treatment as well as those in the control. In a completely randomized design, randomization ensures that the propensity score is the same for all units. In a randomized block design, randomization ensures that the propensity score is the same for all units in the same block. It is the process of randomization; that is, the absence of a systematic allocation of units to treatment, that allows causal inference.
We are now in a position to understand the fundamental problem of causal inference in pen studies. We start with the simple example of 2 pens each of n cows, with the first pen assigned to the control and the second pen assigned to the treatment. The propensity score for all cows in the first pen is zero, whereas the propensity score of all cows in the second pen is 1. That is, if we know the pen, then we know the propensity score of the cows within that pen. In this situation, it is impossible for any of the cows in the second pen (the treatment) to be matched with a cow in the first pen with the same propensity score. Thus, we have no way of estimating the potential outcome under the nonapplication of the treatment of the cows under the application of the treatment. In such instances, causal inference is impossible. All that can be said is that this set of cows outproduced this other set of cows. One cannot assess the role that random chance could have played in this outcome, nor identify the cause for such difference.
We can apply the causal inference principles to a pen study with 4 pens, each of n animals. The pens are randomly assigned to the control and treatment. For illustration purposes, assume that one outcome of this randomization is to assign pens 1 and 3 to control and pens 2 and 4 to treatment. The propensity score of all cows in pens 1 and 3 is zero, whereas the propensity score of all cows in pens 2 and 4 is 1. It is impossible to match a cow in pen 2 (treatment) with a cow with the same propensity score in the control from either pen 1 or 3. Thus, we cannot make causal inference using cows as units. However, the propensity score for the 4 pens is the same, 0.5, because of the randomized assignment of pens to each of the 2 treatments. Thus, one can use either the measurements on pen 1 or pen 3 as a control match to pen 2 or pen 4. Pens are legitimate units for causal inference of the treatment. In fact, some nonparametric statistical tests are based on the computation of all pen permutations under the null hypothesis.
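A permutation test on pen means makes the last point concrete. In the sketch below, the pen means are hypothetical values invented for illustration, and the function name `pen_permutation_test` is not from the original; the test enumerates every way of assigning 2 of the 4 pens to treatment and compares the observed difference with that reference set.

```python
from itertools import combinations
from statistics import fmean


def pen_permutation_test(pen_means, treated_pens):
    """Two-sided permutation test using pens as the units of randomization."""
    pens = sorted(pen_means)

    def difference(treated):
        trt = [pen_means[p] for p in treated]
        ctl = [pen_means[p] for p in pens if p not in treated]
        return fmean(trt) - fmean(ctl)

    observed = difference(treated_pens)
    # Every assignment the randomization could have produced.
    assignments = list(combinations(pens, len(treated_pens)))
    extreme = sum(1 for a in assignments
                  if abs(difference(a)) >= abs(observed) - 1e-12)
    return observed, extreme / len(assignments), len(assignments)


# Hypothetical pen means; pens 2 and 4 received the treatment.
pen_means = {1: 30.1, 2: 33.4, 3: 29.6, 4: 34.0}
observed, p_value, n_assignments = pen_permutation_test(pen_means, treated_pens=(2, 4))
print(f"difference = {observed:.2f}, P = {p_value:.3f} over {n_assignments} assignments")
```

With only 4 pens there are 6 possible assignments, so even the most extreme outcome cannot give a two-sided P-value below 2/6; the number of cows per pen does not change this.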
Optimal Pen Size
The issue of determining the optimal pen size is identical to that of determining an optimal plot size in agronomy. Plants within a plot are sampling units just like cows within a pen are also sampling units. Thus, not all cows in a pen need to be measured, although the relative cost of sampling (i.e., measuring the dependent variable on each cow) is generally so much lower than that of a cow, that the economically optimal design involves the sampling of all sampling units (i.e., all cows).
The optimum pen size is dependent, among other factors, on the size of intraexperimental and interexperimental unit competition (Federer, 1955). Intraexperimental unit competition exists when cows within the same pen affect each other either advantageously or detrimentally, something that we know happens based on animal behavior studies. This is a fundamental characteristic of cows housed in the same pen. Suppose that we have h cows per pen, and that we measure (sample) k cows per pen. In the absence of intraexperimental unit competition, the variance of the mean of the k sampling units is simply Vp + (Vs/k), where Vp represents the variation among pens treated similarly, and Vs is the variance among sampling units (the cows) within the pen. But in the presence of competition between cows within pens, the variance due to this competition (Vc) must be accounted for, and the variance among experimental units then becomes (Federer, 1955)
$$V'_p + \frac{V'_s}{k} \qquad [5]$$

where $V'_p = V_p - V_c/h$ and $V'_s = V_s + V_c$.
If all the variation within the pen is due to competition, then [5] reduces to V'p + Vc/k. With no competition or if all cows in the pens are sampled and the sum of the competition effects, ci, adds to zero within pen (a weak assumption), then [5] reduces to Vp + (Vs/k). Thus, as long as all cows in a pen are participating in the experiment and are being sampled, the variance due to competition is not a factor in determining the optimal pen size. In this instance, Vc is totally confounded within Vs, so the variance of cows within pen includes all of the variance due to competition. The values of Vp and Vs associated with a particular experiment are completely situation-dependent. Increased uniformity of pens decreases Vp, and increased uniformity of cows within pens decreases Vs. Prior or estimated values of these variances in combination with the cost per cow and the cost per pen can then be used to estimate an optimal pen size, using, for example, Fairfield Smith's variance law (Federer, 1955).
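The limiting cases of [5] can be checked numerically. The sketch below assumes that V'p = Vp - Vc/h, which is the form implied by the statement that [5] reduces to Vp + (Vs/k) when all h cows are sampled; that form, the function name, and the numeric values are assumptions made for illustration and should be checked against Federer (1955).

```python
def pen_mean_variance(v_pen, v_cow, v_comp, cows_per_pen, cows_sampled):
    """Variance among experimental units (pens) under Eq. [5]: V'p + V's / k.

    Assumes V'p = Vp - Vc / h (inferred from the limiting cases in the text)
    and V's = Vs + Vc.
    """
    if not 1 <= cows_sampled <= cows_per_pen:
        raise ValueError("cows_sampled must be between 1 and cows_per_pen")
    v_pen_adj = v_pen - v_comp / cows_per_pen  # V'p (assumed form)
    v_cow_adj = v_cow + v_comp                 # V's
    return v_pen_adj + v_cow_adj / cows_sampled


# Hypothetical variance components
V_PEN, V_COW, V_COMP = 1.0, 4.0, 2.0

all_sampled = pen_mean_variance(V_PEN, V_COW, V_COMP, cows_per_pen=10, cows_sampled=10)
half_sampled = pen_mean_variance(V_PEN, V_COW, V_COMP, cows_per_pen=10, cows_sampled=5)
no_competition = pen_mean_variance(V_PEN, V_COW, 0.0, cows_per_pen=10, cows_sampled=5)

print(all_sampled, half_sampled, no_competition)
```

When every cow in the pen is sampled, the competition variance cancels and the result equals Vp + Vs/k; when only part of the pen is sampled, competition adds to the variance among experimental units.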
From Eq. [5], it should also be apparent that, unless the variation between pens is relatively large compared with the variation between cows or if the variance due to competition is relatively large, the number of replicates to achieve a given power with pen studies can be substantially less than the number of replicates required when cows are the experimental units. The number of experimental units (pens) required for a given power is a function of the variation among experimental units. In Eq. [5], Vp represents the variance due to pen after accounting for the variance due to cows forming the pens. The term "pen" implies a grouping of cows but does not necessarily imply a conventional pen found on farms. In fact, "pens" could actually be "farms". That is, one could use farms as experimental units and randomly assign them to the various treatments. In this instance, Vp would likely be large, and a relatively large number of replicates would be required for a given power. Likewise, pens on commercial farms may not be very uniform (i.e., animals are penned according to parity, production levels, pregnancy status, health status, etc.), resulting in large variance between pens. In such instances, a greater number of replications than with uniform pens are required, unless some form of switchback design is used, as explained in the next section.
| STATISTICAL DESIGNS FOR PEN STUDIES |
|---|
Completely Randomized Design
This is the simplest and least powerful design, but it also has the fewest assumptions. Cows are randomized across pens, and pens are randomized across treatments. The completely randomized design (CRD) requires a minimum of 1 pen per treatment, plus 1 additional pen for 1 of the treatments, a situation that results in very low power because the error has only 1 df. The number of pens required for a desired power is dependent on the significance level of the tests (i.e., the desired size of type I error), the size of the expected treatment differences, and the variance among pens. With this design, the researcher should form pens that are as much alike as possible. The variation among cows within each pen is not very relevant to the analysis unless one suspects an interaction between the initial level of production and the response to treatments. Initial (covariate) measurements on the cows as well as repeated measures can easily be incorporated in the model, as explained later. For now, we shall only consider the case of a single measurement (or a mean of measurements) for each cow.
The statistical model underlying the analysis is
$$y_{ijk} = \mu + \tau_i + p_{j:i} + \varepsilon_{ijk} \qquad [6]$$

where $y_{ijk}$ are the observed values, $\mu$ is the overall population mean, $\tau_i$ is the effect of the ith treatment, $p_{j:i}$ is the random effect of the jth pen within the ith treatment, and $\varepsilon_{ijk}$ is the random error, assumed iid $N(0, \sigma^2_{\varepsilon})$.

In [6], the $\varepsilon_{ijk}$ is actually the effect of cow nested within treatment x pen. The schematic ANOVA table for this model is shown in Table 4. A data record consists of 1 observation for each cow used in the experiment. This model is easily fitted with the MIXED procedure of SAS (SAS Institute, 2004) using the following statements:
|
![]() | [7] |
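For balanced data, the test of treatment implied by model [6] can also be computed by hand, which makes the choice of error term explicit. The following Python sketch is not a substitute for the SAS statements; it assumes equal numbers of pens per treatment and cows per pen, and the function name and the data values are hypothetical.

```python
from statistics import fmean


def crd_pen_anova(data):
    """F test for treatment in a balanced CRD with pens as experimental units.

    data maps each treatment to a list of pens; each pen is a list of cow records.
    """
    pen_means = {trt: [fmean(pen) for pen in pens] for trt, pens in data.items()}
    trt_means = {trt: fmean(means) for trt, means in pen_means.items()}
    grand_mean = fmean(list(trt_means.values()))

    n_trt = len(data)
    first_pens = next(iter(data.values()))
    pens_per_trt = len(first_pens)
    cows_per_pen = len(first_pens[0])

    ss_trt = pens_per_trt * cows_per_pen * sum(
        (mean - grand_mean) ** 2 for mean in trt_means.values())
    ss_pen = cows_per_pen * sum(
        (pen_mean - trt_means[trt]) ** 2
        for trt, means in pen_means.items() for pen_mean in means)

    df_trt = n_trt - 1
    df_pen = n_trt * (pens_per_trt - 1)  # Pen(Treatment): error term for Treatment
    f_value = (ss_trt / df_trt) / (ss_pen / df_pen)
    return {"df_trt": df_trt, "df_pen": df_pen, "F": f_value}


# Hypothetical records: 2 treatments, 2 pens per treatment, 2 cows per pen
records = {
    "control": [[30.0, 32.0], [34.0, 36.0]],
    "treated": [[36.0, 38.0], [40.0, 42.0]],
}
result = crd_pen_anova(records)
print(result)
```

The denominator df depend only on the number of pens (here 2), not on the number of cows, which is why adding cows to existing pens does little for the power of the treatment test.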
RCBD
Frequently, the number of pens on a given farm is insufficient to reach a satisfactory power using a CRD. An easy solution in such instances is to use a multisite design (i.e., conduct the experiment across many farms) in what is known as an RCBD. In such a case, farms act as blocking factors whose effects can be considered either fixed or random, depending on the process used for selecting the farms, the inference range, and the number of farms used in the experiment. Theoretical considerations and guidelines for selecting the type of effects for blocks are found in McCulloch and Searle (2001). If the effect of farm is considered fixed, the statistical model that underlies the analysis is
$$y_{ijkl} = \mu + \phi_i + \tau_j + \phi\tau_{ij} + p_{k:ij} + \varepsilon_{ijkl} \qquad [8]$$

where $y_{ijkl}$ are the observed values, $\phi_i$ is the fixed effect of the ith farm, $\tau_j$ is the fixed effect of the jth treatment, $\phi\tau_{ij}$ is the interaction effect between the ith farm and the jth treatment, $p_{k:ij}$ is the random effect of the kth pen within the ith farm and jth treatment, assumed iid $N(0, \sigma^2_p)$, and $\varepsilon_{ijkl}$ is the random error, assumed iid $N(0, \sigma^2_{\varepsilon})$.

In [8], the $\varepsilon_{ijkl}$ is actually the effect of cow nested within farm, treatment, and pen. The schematic ANOVA for this model is shown in Table 5. In this instance, the Pen (Farm x Treatment) is the correct error term for testing the treatment effect. Model [8] can be fitted using the MIXED procedure of SAS with the following statements:
|
![]() | [9] |
If the effect of farm is considered random, the model that underlies the analysis is
$$y_{ijkl} = \mu + f_i + \tau_j + f\tau_{ij} + p_{k:ij} + \varepsilon_{ijkl} \qquad [10]$$

where $y_{ijkl}$ are the observed values, $f_i$ is the random effect of the ith farm, iid $N(0, \sigma^2_f)$, $\tau_j$ is the fixed effect of the jth treatment, $f\tau_{ij}$ is the random interaction effect between the ith farm and the jth treatment, iid $N(0, \sigma^2_{f\tau})$, $p_{k:ij}$ is the random effect of the kth pen within the ith farm and jth treatment, iid $N(0, \sigma^2_p)$, and $\varepsilon_{ijkl}$ is the random error, assumed iid $N(0, \sigma^2_{\varepsilon})$.
Model [10] is solved using the following statements with PROC MIXED of SAS:
![]() | [11] |
In [11] the proper error term for testing the effect of treatment is Farm x Treatment (Table 5). In this design as well as in the previous one, it is in the researcher's interest to minimize the variance across pens. Thus, one would want to have pens within farms as uniform as possible to maximize the power of the test.
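The choice between fixed and random farm effects changes the error term for Treatment and therefore the denominator df. The sketch below counts those df for a balanced layout using standard ANOVA bookkeeping; the function name and the example dimensions are hypothetical, and the values are derived rather than copied from Table 5.

```python
def rcbd_treatment_error_df(n_farms, n_treatments, pens_per_cell):
    """Denominator df for the Treatment test in a balanced multisite RCBD."""
    return {
        # Farm fixed: Treatment is tested against Pen(Farm x Treatment).
        "farm_fixed": n_farms * n_treatments * (pens_per_cell - 1),
        # Farm random: Treatment is tested against Farm x Treatment.
        "farm_random": (n_farms - 1) * (n_treatments - 1),
    }


# Hypothetical study: 3 farms, 2 treatments, 2 pens per farm and treatment
error_df = rcbd_treatment_error_df(n_farms=3, n_treatments=2, pens_per_cell=2)
print(error_df)
```

With farms treated as random, adding pens within farms does not add denominator df for the treatment test; only adding farms does, which reflects the wider inference range.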
Switchback Designs
In instances where it is difficult to assemble a large number of uniform pens, designs in which the pen acts as its own control, such as Latin squares and switchback designs, can prove to be considerably more powerful than the first 2 classes of designs previously outlined. This gain in power is achieved through additional assumptions, such as the implicit pooling of interaction terms with the error (Cochran and Cox, 1957), which can be a possible drawback.
Figure 4 shows an example of an experiment involving 10 pens, 2 treatments, and 3 periods in a switchback design. Notice that the order in which the 2 treatments are applied can follow 2 distinctive sequences. The statistical model underlying the analysis is

$$y_{ijklm} = \mu + \delta_i + p_{j:i} + \pi_k + \tau_l + \delta\tau_{il} + \pi\tau_{kl} + p\pi\tau_{jkl:i} + \varepsilon_{ijklm} \qquad [12]$$

where $y_{ijklm}$ are the observed values, $\delta_i$ is the fixed effect of the ith sequence, $p_{j:i}$ is the random effect of the jth pen within the ith sequence, iid $N(0, \sigma^2_p)$, $\pi_k$ is the fixed effect of the kth period, $\tau_l$ is the fixed effect of the lth treatment, $\delta\tau_{il}$ is the fixed interaction effect of the ith sequence with the lth treatment, $\pi\tau_{kl}$ is the fixed interaction effect of the kth period with the lth treatment, $p\pi\tau_{jkl:i}$ is the random interaction effect between the jth pen with the kth period and the lth treatment within the ith sequence, iid $N(0, \sigma^2_{p\pi\tau})$, and $\varepsilon_{ijklm}$ is the random error, assumed iid $N(0, \sigma^2_{\varepsilon})$.

In [12], the $\varepsilon_{ijklm}$ are in fact the Period x Cow (Pen Sequence). This model would be solved using PROC MIXED of SAS with the following statements:
![]() | [13] |
In [13], the error term used for testing the effect of treatment is period*treat*pen(sequence) and not the pen(sequence). Consequently, the experimental unit for the treatment is no longer simply a pen, but a pen-period. Each pen serves as its own control. This is important because this implies that having uniformity across pens is not an attribute important for testing the effect of treatments, a feature that was important in the CRD and RCBD.
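The randomization step of a switchback design can be sketched as follows. The code assumes 2 treatments labeled A and B, 3 periods, and an equal split of the 10 pens between the 2 sequences; the equal split, the labels, the seed, and the function name are assumptions made for illustration.

```python
import random

# The 2 possible treatment sequences over 3 periods
SEQUENCES = {1: ("A", "B", "A"), 2: ("B", "A", "B")}


def assign_switchback(pens, seed=None):
    """Randomly assign pens to the 2 switchback sequences, half to each."""
    rng = random.Random(seed)
    shuffled = list(pens)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    layout = {}
    for index, pen in enumerate(shuffled):
        sequence = 1 if index < half else 2
        layout[pen] = SEQUENCES[sequence]
    return layout


layout = assign_switchback(range(1, 11), seed=2007)
for pen in sorted(layout):
    print(f"pen {pen:2d}: {' -> '.join(layout[pen])}")
```

Every pen receives both treatments, so treatment comparisons are made within pens, which is what makes each pen its own control.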
Covariate and Repeated Measures
The designs and their associated SAS statements that were previously described were based on data records for individual cows. If pens are approximately all of the same size, and if there is no change in animal numbers during the study (i.e., no cow leaves or is brought into an experimental pen), the observations could be first averaged within pens and these averages used as records for statistical analyses. In that case, the pen is explicitly the experimental unit. In general, however, it is preferable to use individual cow records because of the ease of incorporation of covariates in the analysis, and the correct use of partial records. Likewise, repeated measurements on the cows are easily incorporated in the analysis.
To understand how covariates and repeated measures are incorporated in the statistical analysis of pen studies, we reuse the example from our virtual experiment 2, but this time we make use of production measurements done immediately before the assignment of cows to pens (i.e., a covariate measurement). In addition, we are now told that production was measured weekly for 4 wk. The model used for the statistical analysis is
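$$Y_{ijklm} = \mu + \kappa_k + p_{j:k} + \alpha_i + (\alpha\kappa)_{ik} + (\alpha p)_{j:ik} + \beta X_{ijkl} + c_{l:ijk} + \omega_m + (\omega\kappa)_{mk} + (\omega\alpha)_{mi} + (\omega\alpha\kappa)_{mki} + (\omega p)_{mj:k} + (\omega\alpha p)_{mij:k} + \varepsilon_{ijklm}$$ | [14] |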
where
$\kappa_k$ denotes the fixed effect of the kth cooling treatment; $p_{j:k}$ is the random effect associated with the jth pen nested within the kth cooling treatment, assumed iid $N(0, \sigma^2_p)$; $\alpha_i$ denotes the fixed effect of the ith injection treatment; $(\alpha\kappa)_{ik}$ is the fixed effect of the interaction between the kth cooling treatment and the ith injection treatment; $(\alpha p)_{j:ik}$ is the random interaction effect associated with the jth pen within the kth cooling treatment and the ith injection treatment, assumed iid $N(0, \sigma^2_{\alpha p})$; $\beta X_{ijkl}$ is the covariate adjustment for each cow; $c_{l:ijk}$ is the random effect of the lth cow within the kth cooling treatment, ith injection treatment, and jth pen, assumed iid $N(0, \sigma^2_c)$; $\omega_m$ is the fixed effect of the mth week; $(\omega\kappa)_{mk}$, $(\omega\alpha)_{mi}$, and $(\omega\alpha\kappa)_{mki}$ are the fixed effects of interaction terms; $(\omega p)_{mj:k}$ is the random interaction effect associated with the mth week and the jth pen within the kth cooling treatment, assumed iid $N(0, \sigma^2_{\omega p})$; $(\omega\alpha p)_{mij:k}$ is the random interaction effect associated with the mth week, the ith injection, and the jth pen within the kth cooling treatment, assumed iid $N(0, \sigma^2_{\omega\alpha p})$; and $\varepsilon_{ijklm}$ denotes the random errors, which are no longer assumed independent because of the repeated measurements, but are assumed identically distributed $N(0, \sigma^2_{\varepsilon})$.
In essence, model [14] is a split-split plot in time, with the second split taken loosely because, unlike a true split-plot where the subplots are fully randomized, the subplots of a repeated measures design cannot be randomized (i.e., wk 3 always follows wk 2). The ANOVA table and the corresponding df are shown in Table 6, along with the 3 levels in the design structure (plots) and their associated experimental units (error terms).
![]() | [15] |
where all the terms are defined as in [14] with obvious changes in the subscripts. This model can be fitted using the following SAS statements:
![]() | [16] |
Note that the term cow(cooling*pen) appears in both the RANDOM and the REPEATED statements. Whether both can be retained depends on the type of covariance structure chosen. With some structures (e.g., unstructured, UN, or compound symmetry, CS), the 2 are completely redundant, and PROC MIXED fails to identify a solution. This is not a problem per se, because it simply reflects a set of statements that over-parameterize the model (Littell et al., 2006). Although it is theoretically possible to have the term cow(cooling*pen) in both the RANDOM and REPEATED statements with an autoregressive structure [AR(1)], the algorithm frequently experiences convergence problems. The term must then be removed from the RANDOM statement; this does not change the tests of the fixed effects of interest, but it changes the interpretation of the estimate of the cow(cooling*pen) component of variance. Additionally, it should be noted that none of the sets of SAS statements presented in this article includes an option to correct the df for the uncertainty in estimating the G and R matrices of the mixed model equations (SAS Institute, 2004). Corrections to the df are generally very small for data sets that are relatively well balanced. In cases of markedly unbalanced data (i.e., different numbers of cows per pen or differing numbers of pens per treatment), it is generally preferable to compute inflation factors along with Satterthwaite-based degrees of freedom (SAS Institute, 2004). In such instances, the MODEL statement in [16] would include the following option:
/ ddfm=kenwardroger | [17] |
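Returning to the redundancy between the RANDOM and REPEATED statements noted above, the reason can be seen from the covariance matrix of the repeated records of a single cow. The Python sketch below is an illustration under stated assumptions: the variance values are made up, the function names are hypothetical, and the compound symmetry matrix is written with a common covariance on every element plus a residual variance on the diagonal. A random cow effect adds the same constant to every element of that matrix, so under CS only the sum of the cow variance and the CS covariance is identifiable, whereas under AR(1) the covariance decays with the lag toward the cow variance.

```python
def cs_matrix(n, cov_cs, sigma2):
    """Compound symmetry: cov_cs everywhere, plus sigma2 on the diagonal."""
    return [[cov_cs + (sigma2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def ar1_matrix(n, sigma2, rho):
    """First-order autoregressive: sigma2 * rho**|i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]

def marginal_cov(sigma2_cow, r):
    """Covariance of one cow's records: random cow effect plus within-cow R."""
    return [[sigma2_cow + r_ij for r_ij in row] for row in r]

weeks = 4  # weekly measurements for 4 wk, as in the example

# Two different (cow variance, CS covariance) pairs with the same sum.
v1 = marginal_cov(2.0, cs_matrix(weeks, 1.0, 3.0))
v2 = marginal_cov(0.5, cs_matrix(weeks, 2.5, 3.0))
print("CS, identical marginal covariance:", v1 == v2)

# With AR(1), the covariance depends on the lag between weeks.
v3 = marginal_cov(2.0, ar1_matrix(weeks, 3.0, 0.5))
print("AR(1) covariances at lags 0 to 3:", v3[0])
```

Because the two CS parameter sets produce exactly the same matrix, no data can separate them, which is consistent with the over-parameterization described in the text.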
The advantages of using individual records for each cow in the analysis should now be apparent. As long as cows do not leave the experiment for reasons related to the treatments, removing a cow from, or moving a cow into, an experimental pen during the experiment only leads to imbalance in the data (i.e., missing observations on some cows), without any consequence for the tests of interest. Additionally, covariate adjustments can be applied directly at the cow level. Other correction factors, such as parity, can be introduced in the subplot as long as cows of different parities were not completely separated into different pens. These corrections to the cow records reduce between-pen variance, which is desirable in longitudinal designs in which the pen does not serve as its own control.
| CONCLUSIONS |
|---|
|
|
|---|
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
| FOOTNOTES |
|---|
2 Salaries and research support were provided by state and federal funds appropriated to the Ohio Agricultural Research and Development Center, The Ohio State University. Manuscript No. 02-07AS.
Received for publication September 19, 2006. Accepted for publication April 3, 2007.
| REFERENCES |
|---|
|
|
|---|