|
|
||||||||

* Institut National de la Recherche Agronomique, UR631 Station dAmélioration Génétique des Animaux, BP 52627, 32326 Castanet-Tolosan, France
Department of Animal and Dairy Science, University of Georgia, Athens 30602-2771
1 Corresponding author: andres.legarra{at}toulouse.inra.fr
| ABSTRACT |
|---|
|
|
|---|
Key Words: genome-wide selection genomic selection genetic evaluation marker-assisted selection
| INTRODUCTION |
|---|
|
|
|---|
Traditional animal models have a large number of unknowns, up to tens of millions, but the number of effects is small and the system of equations is sparse. With genome-wide selection, the number of unknowns is much smaller, in the order of the number of SNP, but the number of effects is similar to the number of SNP, and the system of equations is dense. The purpose of this study was to evaluate a number of computing options useful for genome-wide selection with SNP or combinations of SNP. This study also develops the idea of Janss and de Jong (1999) to use model residuals in iterative methods and test this in genome-wide evaluation.
| MATERIALS AND METHODS |
|---|
|
|
|---|
to allele 1 and the value
to allele 2. This follows a classical parameterization in which aj is half the difference between the 2 homozygotes (e.g., Lynch and Walsh 1998). Therefore, the effects of the different genotypes are +a for 11, 0 for 12, and –a for 22. The effects of the SNP at the n loci would sum to form the genetic effect; the model for the phenotype (ignoring other effects for the sake of clarity) is
![]() |
where yi is the phenotype of the ith animal (i = 1, m), xi,j is an indicator covariate for the ith animal and the jth SNP locus (j = 1, n), and aj is twice the effect of allele 1 at the jth SNP. In matrix algebra, assuming no other effect,
![]() |
Mixed-model equations in the single trait case are
![]() |
where D is a normalized diagonal matrix,
a2 is an average variance of SNP effects,
2e is residual variance, and
. Meuwissen et al. (2001), for the case of multiple alleles in each locus (e.g., microsatellites), considered 3 models: a fixed model for SNP effects (
= 0), a model with equal variance for each SNP effect (D = I), as above, and a model with variable variance per SNP. Their fixed model showed severe problems of collinearity that were alleviated in the random models. The same models were applied by Solberg et al. (2006) to SNP. For the following, we will assume D = I.
Sparsity Pattern.
The model above has n effects. The design vector for each observation contains zeros for heterozygotes, and nonzeros for homozygotes. In the best case (equal allele frequencies), 50% of the elements of the design vector are nonzero. Moreover, unless there is complete linkage disequilibrium among 2 loci, the corresponding out-of-diagonal elements in the X'X matrix will be nonzero. Subsequently, the left-hand side (LHS) of the mixed-model equations would be nearly full. This is in contrast with a polygenic model, where the design vector contains only a few nonzeros and LHS would usually have fewer than 100 nonzeros per row, even for millions of equations. Other models for genome-wide selection (e.g., combinations of SNP) will show similar problems (as discussed in this paper) because of the high number of effects.
Computing Options
A computing methodology to solve the equations would require 2 types of general choices. The first one regards the type of storage. Options include the storage of the LHS, or the use of matrix-free methods, otherwise known as iteration on data. The second choice is regarding an algorithm used for solving.
Assuming that LHS is dense, the amount of half-stored memory is approximately n2/2 elements. In the case of single-precision storage, this would require 2n2 bytes. With 10,000 SNP, this is 200 megabytes. Such storage is available on current computers. The setting up of LHS would require at least mn2/2 operations. With 10,000 observations and 10,000 SNP, and assuming a computer with 1,000 MFLOPS (millions of floating point operations per second), this would require approximately 8 min of computing. The storage would increase quadratically with the number of SNP but is unaffected by the number of observations.
In matrix-free methods, one would store only X and then compute functions of LHS by using operations on X. The number of elements in X, assuming dense storage, is nm elements. Each element has 3 values, –1, 0, or 1, and can be stored in 2 bits. Thus, storage of X would require nm/4 bytes if stored efficiently or nm bytes if stored 1 element per byte, which is more convenient computationally because 1-byte integers are standard in computer languages, whereas 2-bit ones are not. The storage for the test case would be 25 megabytes with the most efficient storage and 100 megabytes with 1 byte of storage. In matrix-free methods, the storage increases linearly with the number of SNP and the number of records.
With regard to computing algorithms, 3 choices can be considered. The first one is a finite solving, for example, based on Cholesky decomposition or QR factorization. Cholesky decomposition requires approximately n3/3 operations, or 6 min for the case above. A special concern would be stability, because the system of equations arising from many SNP is likely to be almost singular, and automatic detection of redundant equations through "numeric zeros" could be imperfect. The second algorithm is iterative by preconditioned conjugate gradients (PCG), which is currently a method of choice in animal breeding applications. This algorithm can be implemented very efficiently in matrix-free algorithms (Lidauer et al., 1999; Tsuruta et al., 2001). A disadvantage of this algorithm is the accumulation of errors, which may cause stability problems and require periodic restarts. The last algorithm is the Gauss-Seidel iteration (GS), which, after a small modification, becomes successive overrelaxation. Although, in general, it is not as fast as PCG, it has good numerical properties. An additional benefit of GS is that it can easily be converted to a Gibbs sampler. The implementation of GS with LHS stored is straightforward. A matrix-free implementation requires that for each effect in the model, the right-hand sides be adjusted for all the remaining effects. Possible strategies were described by Schaeffer and Kennedy (1986) and Misztal and Gianola (1987). For a very high number of effects, as is the case, both strategies are very expensive. A matrix-free GS was used by Meuwissen et al. (2001), who pointed out that, in this approach, "computer time increases quadratically with the number of effects fitted." This is because of the necessity of adjusting the right-hand sides for all other effects. Janss and de Jong (1999) used a modification to matrix-free GS based on adjustment of the residuals, where the cost of GS was less dependent on the number of effects. We call this approach the Gauss-Seidel with residual update (GSRU).
Strategies to Solve Mixed-Model Equations
Cholesky Decomposition.
The Cholesky decomposition of the LHS matrix is LL' = [X'X + D
]. This is followed by twice solving the triangular systems of equations Lb = X'y and L'â = b.
GS.
Two regular GS algorithms were used in the comparison of methods. The first one (LHS-GS) stored the LHS in memory by using a half-stored matrix. The second one (matrix-free GS) used iteration on data. Let xj denote the jth column of X. In the latter algorithm, the l + 1 iteration for the jth element in a is as follows:
![]() |
For each unknown, there are approximately mn products in computing Xa, m sums in y – Xa, and m products in x'j (y – ···).
GSRU.
Janss and de Jong (1999) noticed that y corrected for all effects, except the jth effect, is equal to the current vector of residuals, el+1,j, plus the current estimate of the jth effect, as follows:
![]() |
Then
![]() | [1] |
Updating of e is necessary after computation of each effect. Updating after the l + 1 computation of the jth element is done as follows:
![]() | [2] |
Equations [1] and [2] fully describe one iteration of the GSRU algorithm for equation j.
Extension for the multivariate case is straightforward. Because e is formed by successive computations, there may be some accumulation of numerical errors. Therefore, e may have to be recomputed after a certain number of iterations.
Cross-products x'jxj are constants and can be precomputed. Then the only computationally intensive operations are the vector products x'jel, xiâil, and updating the residuals. Therefore, for each of the n unknowns, there are m multiplications in equation [1] (vector-vector multiplication), m multiplications in equation [2] (vector-scalar multiplication), and m sums in equation [2] (sum of vectors). Pseudocode for the GSRU is shown in the Appendix.
PCG.
This strategy has been described extensively elsewhere (Strandén and Lidauer, 1999; Tsuruta et al., 2001). In this strategy, the only computation involving the LHS is a matrix vector product: [X'X + D
]q, where q is a vector. Computing D
q is simple for diagonal D, but computing X'Xq is computationally demanding. Let x'i be the ith row of X (i.e., a row vector). Assume that
![]() |
X'Xq can be computed straightforwardly as
![]() | [3] |
at a cost of nz2 operations per record, where nz is the number of nonzeroes in xi, or optimized with a different order,
![]() | [4] |
at a cost of only 3nz operations (2 for the products and 1 for the cumulation in X'Xq; Strandén and Lidauer, 1999).
Comparison Among Solving Strategies
Number of Operations.
Solving by Cholesky decomposition involves n3/3 operations. For iterative methods, one round of iteration involves n2 operations for LHS-GS and 2mn + mn2 for matrix-free GS. With the matrix-free GSRU algorithm, there are approximately 3nm operations per round. For PCG, the number of operations per round is approximately mn + n2 by using equation [3], and reduces to approximately 3nm by using equation [4]. Note that some of the operations involve sums but others involve multiplication, with different computing times.
Variance Component Estimation.
The extension of GSRU or GS to variance component estimation by Markov chain-Monte Carlo is simple. Gauss-Seidel and Gibbs sampling are closely connected (Galli and Gao, 2001; Sorensen and Gianola, 2002). A sample ãj of the posterior distribution of aj, in the l + 1 iteration, is drawn from a normal distribution as follows:
![]() |
The posterior distribution of the variance components can then be sampled by regular Gibbs sampling (e.g., Sorensen and Gianola, 2002) by using the cross-products ã 'ã . Solving by PCG does not lead by itself to simple variance component estimation unless multivariate sampling is used (e.g., García-Cortés and Sorensen, 1996).
Performance.
All algorithms presented here (Cholesky decomposition, LHS-GS, matrix-free GS, GSRU, and PCG) were tested with real data. The data set comprised 10,946 SNP and 1,928 weight records from mice (Valdar et al., 2006; Legarra et al., 2007). The model fit included a general mean and gender as fixed effects, a random additive effect a for each SNP locus, and a random dominant effect d for each SNP locus attributable to genotype 12. An equal variance
a2 was assumed for all additive effects, and an equal variance
d2 was assumed for all dominant effects (this corresponds to the BLUP strategy of Meuwissen et al., 2001). The number of unknowns was 21,895. Mixed-model equations were not full rank. The convergence criterion was based on differences between consecutive solutions (Lidauer et al., 1999). Two different convergence criteria were tried: 10–10 and 10–14, because algorithms might behave differently for different convergence criteria.
Programs were written in Fortran 95 and were run on a workstation with a 2.5 GHz processor running Linux. Cholesky decomposition and LHS-GS were implemented by using a single-precision half-stored matrix for the LHS. Implementations of GS, GSRU, and PCG were matrix-free, but matrices X and y were stored in memory. Preconditioned conjugate gradients used a diagonal preconditioner, and computing of products X'Xq used equation [4]. The vector of residuals e was recomputed every 100 iterations in GSRU and PCG. No such update was needed in the other algorithms.
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
|
|
Memory requirements were similar for all matrix-free methods, which required 200 megabytes, whereas Cholesky decomposition and LHS-GS required 1,114 megabytes (with LHS half-stored in single precision; storage in double precision was not possible).
Different data sets may slightly change the relative performance of the methods. Data structure (condition number and eigenvalues of the LHS matrix) affect convergence of both the GS and PCG methods. Time to solve is proportional to the number of records in the matrix-free methods but is constant in Cholesky decomposition or LHS-GS. In addition, the memory requirements increase linearly with the number of SNP in matrix-free methods, but the increase is quadratic with Cholesky decomposition and LHS-GS. When the number of records is large and the number of SNP reduced (e.g., by preselection), Cholesky decomposition or LHS-GS may become a viable option.
| APPENDIX |
|---|
|
|
|---|
Double precision:: xpx(neq), y(ndata), e(ndata), X(ndata, neq), & sol(neq), lambda, lhs, rhs, val
do i=1, neq
xpx(i)=dot_product(X(:, i), X(:, i)) !form diagonal of X'X
enddo
lambda=vare/vara
e=y
do until convergence
do i=1, neq
!form lhs
lhs=xpx(i)+lambda
! form rhs with y corrected by other effects (formula 1)
rhs=dot_product(X(:, i), e) +xpx(i) *sol(i)
! do Gauss Seidel
val=rhs/lhs
! MCMC sample solution from its conditional (commented out here)
! val=normal(rhs/lhs,1d0/lhs)
! update e with current estimate (formula 2)
e=e – X(:, i)*(val–sol(i))
!update sol
sol(i)=val
enddo
enddo
| ACKNOWLEDGEMENTS |
|---|
|
|
|---|
Received for publication June 1, 2007. Accepted for publication September 5, 2007.
| REFERENCES |
|---|
|
|
|---|
This article has been cited by other articles:
![]() |
Z. Zhang, E. S. Buckler, T. M. Casstevens, and P. J. Bradbury Software engineering the mixed model for genome-wide association studies on large samples Brief Bioinform, November 1, 2009; 10(6): 664 - 675. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. Misztal, A. Legarra, and I. Aguilar Computing procedures for genetic evaluation including phenotypic, full pedigree, and genomic information J Dairy Sci, September 1, 2009; 92(9): 4648 - 4655. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. J. Hayes, P. J. Bowman, A. J. Chamberlain, and M. E. Goddard Invited review: Genomic selection in dairy cattle: Progress and challenges J Dairy Sci, February 1, 2009; 92(2): 433 - 443. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. M. VanRaden Efficient Methods to Compute Genomic Predictions J Dairy Sci, November 1, 2008; 91(11): 4414 - 4423. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Legarra, C. Robert-Granie, E. Manfredi, and J.-M. Elsen Performance of Genomic Selection in Mice Genetics, September 1, 2008; 180(1): 611 - 618. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Tsuruta and I. Misztal Technical note: Computing options for genetic evaluation with a large number of genetic markers J Anim Sci, July 1, 2008; 86(7): 1514 - 1518. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |