J. Dairy Sci. 2008. 91:2361-2369. doi:10.3168/jds.2008-0985
© 2008 American Dairy Science Association ®


Hot Topic: Application of Support Vector Machine Method in Prediction of Alfalfa Protein Fractions by Near Infrared Reflectance Spectroscopy

Z. Nie*, J. Han*,1, T. Liu† and X. Liu‡

* Department of Grassland Science, College of Animal Science and Technology, China Agricultural University, Beijing 10094, China
† Beijing Petrochemical Design Institute, Beijing 100101, China
‡ Department of Chemistry, College of Sciences, Shanghai University, Shanghai 200444, China

1 Corresponding author: jianguohan2058@126.com


    ABSTRACT
 
The objective of this study was to explore the potential of the support vector machine (SVM) to improve the precision of predicting protein fractions by near infrared reflectance spectroscopy (NIRS). Generally, most protein fractions determined in the Cornell Net Carbohydrate and Protein System (CNCPS), especially the neutral detergent insoluble protein (NDFCP) and acid detergent insoluble protein (ADFCP), cannot be accurately predicted by the commonly used partial least squares (PLS) method. A recently developed chemometric method, SVM, was applied in NIRS prediction of alfalfa protein fractions in this study. Two hundred thirty alfalfa samples were scanned on a near infrared reflectance spectrophotometer, and analyzed for crude protein (CP), true protein precipitated in tungstic acid (TCP), borate-phosphate buffer–insoluble protein (BICP), NDFCP, and ADFCP. These 5 laboratory proteins and the CNCPS protein fractions A, B1, B2, B3, and C were predicted by NIRS using the PLS and SVM methods. According to PLS-NIRS regression, CP, TCP, BICP, A, and B2 obtained coefficients of determination of prediction (R²) of 0.96, 0.91, 0.94, 0.94, and 0.93, and ratios of the standard deviation of the prediction samples to the standard error of prediction (RPD) of 5.07, 3.31, 3.98, 3.96, and 3.91, respectively. Neutral detergent insoluble protein, ADFCP (fraction C), B1, and B3 were predicted with R² of 0.75, 0.83, 0.30, and 0.62, and RPD values of 1.98, 2.42, 1.20, and 1.62, respectively. Calibrated by the SVM-NIRS method, R² values of CP, TCP, BICP, NDFCP, ADFCP (C), A, and B2 reached 0.99, 0.97, 0.97, 0.90, 0.93, 0.97, and 0.97, respectively. The RPD values of those fractions were 8.68, 8.26, 6.11, 3.08, 3.69, 5.97, and 5.81, respectively. The R² and RPD values of fractions B1 and B3 directly predicted by the SVM-NIRS model were 0.83 and 2.67 (B1) and 0.73 and 2.51 (B3).
In this study, the chemical analysis results of B1 and B3 were also correlated with results calculated as TCP − BICP and NDFCP − ADFCP from values predicted by the SVM-NIRS models. By this indirect calculation, the B1 protein fraction achieved R² and RPD values of 0.87 and 3.61, whereas values for B3 were 0.75 and 2.00. The data suggest that use of SVM methods in NIRS technology can improve the accuracy of predicting protein fractions. This study showed the potential of increasing the NIRS prediction accuracy to a level of practical use for all protein fractions except B3.

Key Words: support vector machine • protein fraction • alfalfa • near infrared reflectance spectroscopy


    INTRODUCTION
 
Forage protein nutritive value is determined by the content of total CP and protein degradability (Blaxter, 1956). Furthermore, ruminal degradation of feed protein depends upon the rate of degradation of the individual protein fractions. Many studies (Knowlton et al., 1992; Fox et al., 1995) have proposed further subdivision of protein fractions. The Cornell Net Carbohydrate and Protein System (CNCPS) developed new models to separate the feed protein fractions based on differences in the rate and extent of ruminal protein degradability. In the CNCPS, Sniffen et al. (1992) described 5 protein fractions: NPN is denoted as fraction A; true protein is broken down into fractions B1, B2, and B3, based on decreasing rate of ruminal degradation (rapid, intermediate, and slow, respectively); and fraction C represents the unavailable true protein. The 5 fractions are dependent upon the estimation of insoluble protein, true protein, and the protein residual in NDF and ADF. However, determination of the protein fractions in CNCPS is a tedious process because the feed samples need to be treated with tungstic acid (or trichloroacetic acid), borate-phosphate buffer, and neutral detergent or acid detergent solutions, and then subjected to 5 Kjeldahl procedures (Krishnamoorthy et al., 1982; Licitra et al., 1996). The analysis takes 2 d or more. Compared with these conventional laboratory procedures, near infrared reflectance spectroscopy (NIRS) offers advantages of high efficiency and low cost. It has been used widely and successfully in evaluation of forage and feedstuff nutrition (Norris et al., 1976; Shenk et al., 1981; Abrams et al., 1987; Givens et al., 1997).

The analysis of protein fractions with NIRS relies on both the ability of laboratory procedures to precisely measure the protein fractions of interest and a significant correlation between particular amide bonds of the protein fractions and the sample spectra. Some protein fractions, however, cannot be accurately predicted by NIRS technology because their detection is affected by interactions between N and cell wall constituents or carbohydrates. Therefore, it is necessary to apply chemometric methods to analyze and extract the relevant information. Classical chemometric methods such as partial least squares (PLS) are widely used in classification and regression analysis. There has been an increasing awareness of the potential of using a more recent technique, support vector machine (SVM), in NIRS analysis. Support vector machine, proposed by Vapnik (1998), is based on the statistical learning theory of structural risk minimization in Vapnik-Chervonenkis dimension. Support vector machines map input vectors into a higher dimensional space where maximal separating hyperplanes are constructed; the generalization error decreases as the margin, or distance between the separating hyperplanes, increases. The method offers advantages over conventional statistical learning algorithms: 1) high generalization performance even with high dimension feature vectors; 2) the ability to use kernel functions that map input data to a higher dimensional space without increasing computational complexity; and 3) better performance in dealing with nonlinear data. Compared with artificial neural networks, which use the empirical risk minimization principle, SVM is superior in avoiding overfitting and in handling high-dimensional problems, especially with a small sample set. According to the literature, SVM has been successfully applied in many areas such as drug design and combinatorial chemistry, but there are few reports in the animal and feed science area. Fernández Pierna et al. (2004) studied the SVM-NIRS measurement of meat and bone meal in compound feeds, and Wu et al. (2007) reported on the successful prediction of the content in milk powder by infrared spectroscopy.

Estimation of CNCPS protein fractions of forages and TMR by NIRS has been studied previously (Hoffman et al., 1999; Mentink et al., 2006; Valdés et al., 2006). However, most of these studies could predict only some of these fractions with acceptable accuracy. As far as we know, no research has shown all of the protein fractions predicted accurately in the same study, and none has described the accuracy of NIRS prediction of protein fractions by the SVM method.

To explore the potential of the SVM regression (SVR) method to improve the precision of predicting protein fractions by NIRS technology, SVM was applied in NIRS prediction of protein fractions in this study. In addition, PLS models were calibrated for comparison with the SVM-NIRS models.


    MATERIALS AND METHODS
 
Sample Preparation and Laboratory Analyses
Alfalfa samples (n = 230) were collected from 5 sites of experimental and commercial farms in Gansu Province (northwest China, 39.81°N, 97.78°E to 39.14°N, 99.84°E) during 2005 and 2006. The detailed information of sample origins is shown in Table 1. Samples included 4 varieties, multiple maturity stages, and 3 harvests. Approximately 2 kg of fresh alfalfa sample was collected at 50 mm above the soil level and then oven-dried at 65°C for 48 h. Dry samples were ground in a cyclone mill (Cyclotec 1093, Foss, Hillerød, Denmark) fitted with a 1-mm screen for chemical analyses and NIRS scan.


Table 1. Description of the origin of alfalfa samples used in the present study
 
According to Licitra et al. (1996), 4 steps are used to separate CP. First, the sample is treated with tungstic acid to precipitate true protein (TCP). Fraction A is NPN × 6.25 and is calculated by subtracting TCP from total CP. Second, the sample is treated with borate-phosphate buffer solution. Fraction B1 is buffer-soluble true protein and is calculated as TCP minus buffer-insoluble protein (BICP). Third, the sample is refluxed in neutral detergent solution; B2 is the part dissolved in neutral detergent solution and is calculated as BICP minus neutral detergent-insoluble protein (NDFCP). In the last step, the sample is treated with acid detergent solution; fraction B3 is the acid detergent-soluble part of NDFCP, calculated as NDFCP minus acid detergent-insoluble protein (ADFCP), and fraction C is defined as ADFCP.
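
The subtraction scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name `cncps_fractions` and the input values are hypothetical and are not data from this study. All quantities are assumed to be in the same units (for example, % of DM).

```python
def cncps_fractions(cp, tcp, bicp, ndfcp, adfcp):
    """Compute CNCPS protein fractions from the 5 laboratory values.

    Follows the differences stated in the text:
    A = CP - TCP, B1 = TCP - BICP, B2 = BICP - NDFCP,
    B3 = NDFCP - ADFCP, C = ADFCP.
    """
    return {
        "A": cp - tcp,
        "B1": tcp - bicp,
        "B2": bicp - ndfcp,
        "B3": ndfcp - adfcp,
        "C": adfcp,
    }


# Illustrative values only (hypothetical sample, % of DM).
fractions = cncps_fractions(cp=18.0, tcp=14.0, bicp=12.5, ndfcp=2.5, adfcp=1.5)
total = sum(fractions.values())  # the 5 fractions add back up to CP
```

Because each fraction is a difference between consecutive laboratory values, the 5 fractions always sum to total CP, and any error in one laboratory value propagates into the two fractions that use it.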

Samples were analyzed for DM at 105°C for 3 h. Total CP and the protein of the other 4 precipitated parts (TCP, BICP, NDFCP, and ADFCP) were determined by the Kjeldahl method described by AOAC (2000). The NDF and ADF concentrations were determined according to Van Soest et al. (1991). All of the wet chemistry determinations were made in 3 replications.

NIRS Analysis.
Alfalfa samples were scanned on a Nicolet Antaris FT-Near Infrared Analyzer (Thermo Electron Corp., Madison, WI) equipped with an integrating sphere. Samples were put in a quartz cup (48-mm inner diameter) and subsequently scanned 64 times with 3 replications over 1,000 to 2,500 nm at a resolution of 8 cm⁻¹. All spectral data were recorded as log (1/R) and a mean spectrum was calculated for each sample (Nie et al., 2007). Mathematical processing was performed using TQ Analyst software (Thermo Nicolet Corp., Madison, WI). All 230 samples were randomly divided into 2 groups for calibration (n = 150) and prediction (n = 80).
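
As a rough sketch of the two data-handling steps just described (averaging the replicate scans and randomly splitting the samples 150/80), the following Python uses only the standard library. The function names and the fixed random seed are assumptions made for illustration; the original work used TQ Analyst, and the actual randomization procedure is not described.

```python
import random


def mean_spectrum(replicates):
    """Average replicate spectra point by point.

    `replicates` is a list of equally long lists of log(1/R) values.
    """
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


def split_samples(sample_ids, n_calibration, seed=0):
    """Randomly split sample ids into calibration and prediction sets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_calibration], shuffled[n_calibration:]


# 230 samples split into 150 for calibration and 80 for prediction.
calibration_ids, prediction_ids = split_samples(range(230), 150)
```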

PLS Regression.
The PLS regression was performed using TQ Analyst software. Using the Diagnostics module of the software, the mathematical processing parameters were sought through trials to optimize the models. The main parameters were the data format, with the options of original spectrum, first derivative, and second derivative spectrum, followed by the filter used to smooth the derivative data (Nie et al., 2007). The Norris derivative (ND) filter was applied to smooth the sample spectra in this study; its segment length and gap between segments were the main parameters selected.

All parameters for a new model were tested using leave-one-out cross-validation, and the best model was selected as the one with the minimum root mean square error of cross-validation (RMSECV). In addition, the ratio of the standard deviation of the reference data to the standard error of prediction (RPD), calculated as the standard deviation of the prediction set (SD) divided by the standard error of prediction (SEP), was considered a crucial criterion of the NIRS equations (Williams and Norris, 2001). Root mean square error of prediction (RMSEP) was calculated from (Naes et al., 2002):

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - y_i)^2}{N}}$$

Standard error of prediction was calculated by the following formula (Naes et al., 2002):

$$\mathrm{SEP} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - y_i - \mathrm{bias})^2}{N - 1}}$$

where $x_i - y_i$ = difference between results obtained by the routine method ($x_i$) and the reference method ($y_i$) on sample $i$, and bias is given by

$$\mathrm{bias} = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)$$

where N = total number of samples in the test. The root mean square error of calibration (RMSEC) and RMSECV were obtained from calibration and cross-validation data sample sets, respectively, and calculated in the same equation as RMSEP (Naes et al., 2002). The repeatability standard deviation (Sr) of the reference methods (laboratory procedures) was calculated to provide the laboratory determination error in reference methods. A lower Sr value indicates a more precise laboratory measurement. The range (maximum minus minimum) to Sr ratio (RSrR) value was another useful indicator of the probability of successful NIRS calibration and performance, according to Williams (1987).
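
A minimal sketch of these error statistics in Python follows, assuming the standard definitions attributed to Naes et al. (2002) and an SD computed with an N − 1 denominator on the reference values of the prediction set. The function names and the 5 paired values are hypothetical and serve only to show the arithmetic.

```python
import math


def bias(predicted, reference):
    """Mean difference between routine (NIRS) and reference values."""
    return sum(x - y for x, y in zip(predicted, reference)) / len(predicted)


def rmsep(predicted, reference):
    """Root mean square error of prediction."""
    n = len(predicted)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(predicted, reference)) / n)


def sep(predicted, reference):
    """Standard error of prediction (bias-corrected, N - 1 denominator)."""
    n = len(predicted)
    b = bias(predicted, reference)
    return math.sqrt(
        sum((x - y - b) ** 2 for x, y in zip(predicted, reference)) / (n - 1)
    )


def sample_sd(values):
    """Sample standard deviation (N - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def rpd(predicted, reference):
    """Ratio of the SD of the reference values to the SEP."""
    return sample_sd(reference) / sep(predicted, reference)


# Hypothetical paired values (% of DM), not data from this study.
reference = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [10.5, 11.5, 14.5, 15.5, 18.5]

stats = {
    "bias": bias(predicted, reference),
    "RMSEP": rmsep(predicted, reference),
    "SEP": sep(predicted, reference),
    "RPD": rpd(predicted, reference),
}
```

With these definitions the three statistics are linked by RMSEP² = bias² + SEP² × (N − 1)/N, so RMSEP and SEP are nearly equal when the bias is small.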

SVM Regression.
According to Vapnik (1998), the basic idea of the SVM method is to map the data into a higher dimensional feature space via a nonlinear mapping (Φ) and then to do regression in this space. Regression approximation therefore addresses the problem of estimating a function based on a given data set $G = \{(x_i, d_i)\}_{i=1}^{l}$, where $x_i$ is the input vector and $d_i$ is the desired value; SVM approximates the function in the following form:

$$y = f(x) = \sum_{i=1}^{l} w_i \Phi_i(x) + b \tag{1}$$

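where $\{\Phi_i(x)\}_{i=1}^{l}$ is the set of mappings of input features, and $\{w_i\}_{i=1}^{l}$ and $b$ are coefficients. They are estimated by minimizing the regularized risk function R(C):

$$R(C) = C \sum_{i=1}^{l} L_\varepsilon(d_i, y_i) + \frac{1}{2} \lVert w \rVert^2 \tag{2}$$

where

$$L_\varepsilon(d, y) = \begin{cases} |d - y| - \varepsilon, & |d - y| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases} \tag{3}$$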

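and ε is a prescribed parameter. In Equation [2], the first term, $C \sum_{i=1}^{l} L_\varepsilon(d_i, y_i)$, is the so-called empirical error (risk), which is measured by the ε-insensitive loss function $L_\varepsilon(d, y)$. This loss function does not penalize errors below ε. The second term, $\frac{1}{2} \lVert w \rVert^2$, is used as a measurement of function flatness. C is a regularization constant determining the tradeoff between the calibration error and the model flatness. Introduction of slack variables $\xi_i$ and $\xi_i^*$ leads Equation [2] to the following constrained function:

$$\text{minimize} \quad R(w, \xi, \xi^*) = \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \tag{4}$$

$$\text{subject to} \quad \begin{cases} d_i - \sum_{j=1}^{l} w_j \Phi_j(x_i) - b \le \varepsilon + \xi_i \\ \sum_{j=1}^{l} w_j \Phi_j(x_i) + b - d_i \le \varepsilon + \xi_i^* \\ \xi_i \ge 0, \; \xi_i^* \ge 0, \; i = 1, \ldots, l \end{cases} \tag{5}$$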

Thus, the decision function of Equation [1] takes the following form:

$$f(x, \alpha, \alpha^*) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x, x_i) + b \tag{6}$$

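In Equation [6], $\alpha_i$ and $\alpha_i^*$ are the introduced Lagrange multipliers. They satisfy $\alpha_i \cdot \alpha_i^* = 0$, $\alpha_i \ge 0$, $\alpha_i^* \ge 0$; $i = 1, \ldots, l$, and are obtained by maximizing the dual form of Equation [4], which has the following form:

$$R(\alpha, \alpha^*) = \sum_{i=1}^{l} d_i (\alpha_i - \alpha_i^*) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) \tag{7}$$

with the following constraints:

$$\sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i \le C, \quad 0 \le \alpha_i^* \le C, \quad i = 1, \ldots, l \tag{8}$$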

Based on the Karush-Kuhn-Tucker conditions, only some of the coefficients (αi − αi*) will assume nonzero values, and the data points associated with them are referred to as support vectors. In Equation [6], K is the kernel function; its value K(xi, xj) is equal to the inner product of the 2 vectors Φ(xi) and Φ(xj) in the feature space, that is, K(xi, xj) = Φ(xi) · Φ(xj). The elegance of using a kernel function lies in the fact that one can deal with feature spaces of arbitrary dimensionality without having to compute the map Φ(x) explicitly. Any function that satisfies Mercer’s condition can be used as a kernel function.
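
The pieces of this formulation can be illustrated with a short Python sketch: the ε-insensitive loss, four common kernel forms, and the kernel expansion of the decision function. The kernel parameterizations (`degree`, `gamma`, `coef0`) are widely used textbook forms and are assumptions here, since the forms used by the MASTER software are not given; the support vectors and coefficients in the example are made up, not fitted.

```python
import math


def eps_insensitive_loss(d, y, eps):
    """Loss is zero inside the eps tube and grows linearly outside it."""
    return max(0.0, abs(d - y) - eps)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    return (dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    # Assumed to correspond to the "S-type" kernel named in the text.
    return math.tanh(gamma * dot(u, v) + coef0)


def svr_predict(x, support_vectors, dual_coefs, b, kernel):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x, x_i) + b.

    `dual_coefs` holds the differences (alpha_i - alpha_i*) for the
    support vectors only; all other samples contribute nothing.
    """
    return sum(c * kernel(x, sv) for c, sv in zip(dual_coefs, support_vectors)) + b


# Made-up support vectors and coefficients, for illustration only.
prediction = svr_predict(
    [1.0, 2.0],
    support_vectors=[[1.0, 0.0], [0.0, 1.0]],
    dual_coefs=[0.5, -0.25],
    b=0.1,
    kernel=linear_kernel,
)
```

Only kernel evaluations appear in `svr_predict`; the mapping Φ is never computed, which is the point made in the paragraph above.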

In this work, the SVM regression was carried out with the MASTER software package (Beijing Petrochemical Design Institute, Beijing, China). The best models and the related parameters were also selected by minimum RMSECV and maximum RPD. In the MASTER software, parameters C and ε were optimized for 4 types of kernel functions: linear, polynomial, radial basis function (RBF), and S-type kernel functions. For parameter selection, C values ranged from 1 to 50 with incremental steps of 1, and ε values ranged from 0.01 to 0.1 with incremental steps of 0.01 (Gu et al., 2006; Liu et al., 2006).
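
The size of this search can be made concrete with a small sketch. The grid below enumerates the stated ranges of C and ε for the 4 kernel types; `select_best` and `toy_cv_error` are hypothetical helpers, and the toy error function is only a stand-in for a real leave-one-out RMSECV computation, which the MASTER software performed in the original work.

```python
from itertools import product

KERNELS = ("linear", "polynomial", "rbf", "s-type")
C_VALUES = [float(c) for c in range(1, 51)]      # 1, 2, ..., 50
EPS_VALUES = [k / 100 for k in range(1, 11)]     # 0.01, 0.02, ..., 0.10


def parameter_grid():
    """All (kernel, C, epsilon) combinations in the stated ranges."""
    return list(product(KERNELS, C_VALUES, EPS_VALUES))


def select_best(grid, cv_error):
    """Return the combination with the lowest cross-validation error."""
    return min(grid, key=lambda params: cv_error(*params))


def toy_cv_error(kernel, c, eps):
    """Stand-in for RMSECV; NOT a real cross-validation."""
    penalty = 0.0 if kernel == "linear" else 1.0
    return penalty + abs(c - 5.0) + abs(eps - 0.03)


grid = parameter_grid()
best = select_best(grid, toy_cv_error)
```

The grid holds 4 × 50 × 10 = 2,000 combinations, and each one requires a full leave-one-out pass over the calibration set, so an exhaustive search of this kind is feasible mainly because the calibration set is small (n = 150).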


    RESULTS AND DISCUSSION
 
Alfalfa Protein Fractions
Results of analysis of laboratory and CNCPS protein fractions of alfalfa samples in the calibration and validation sets are presented in Table 2. Calibration and prediction sets were similar in range, average, and SD for all variables. Crude protein, TCP, BICP, and CNCPS fractions A and B2 had a wider range of contents, higher average value (>6%) and SD (>10%) than the other protein fractions. For the error statistics of laboratory protein fractions, Table 2 shows lower Sr and higher RSrR values than those reported for TMR by Mentink et al. (2006). Precise wet chemistry measurements increase the probability of successful NIRS calibration and prediction (Williams, 1987). In the current study, the NDFCP and ADFCP had lower Sr than the other 3 laboratory protein fractions, but the RSrR values were similar to those of other protein fractions, which ranged from 126.11 to 178.75.


Table 2. Statistics of chemical analysis for laboratory and Cornell Net Carbohydrate and Protein System protein fractions in calibration (Cal.) and prediction (Pred.) sample sets
 
PLS-NIRS Models of Protein Fractions.
According to Table 3, the PLS-NIRS equations showed the best prediction ability for CP content of alfalfa hays (R² = 0.96, RPD = 5.07). The laboratory protein fraction TCP achieved R² and RPD values of 0.91 and 3.31, which showed the practicability of the NIRS model. Borate-phosphate buffer–insoluble protein was estimated accurately by PLS regression, with the highest R² and RPD values (0.94, 3.98) other than those for CP in this study. Prediction of NDFCP for alfalfa hays by NIRS was considerably less accurate (R² = 0.75, RPD = 1.98). The last laboratory protein fraction, ADFCP, which is also the CNCPS fraction C, was estimated with medium accuracy by NIRS (R² = 0.83, RPD = 2.42) in this study. According to Williams and Norris (2001), NIRS equations with RPD <3 are not recommended for application. As a result, the ADFCP (C) NIRS equation still could not be put to practical use. The PLS-NIRS equations of CNCPS fractions A and B2 were developed accurately, with better R² (0.94, 0.93) and RPD values (3.96, 3.91). Fractions B1 and B3 were estimated less accurately by PLS-NIRS in this study (R² ≤ 0.62, RPD ≤ 1.62).


Table 3. Statistics for partial least squares-near infrared reflectance spectroscopy model analysis of laboratory and Cornell Net Carbohydrate and Protein System protein fractions
 
On the whole, CP, the laboratory protein fractions TCP and BICP, and CNCPS fractions A and B2 could be predicted with an acceptable degree of accuracy (RPD >3), whereas NDFCP, ADFCP, B1, and B3 could not be estimated accurately by PLS-NIRS in this study. Based on RPD values, we deduced the ranking of accuracy of protein fractions predicted by PLS-NIRS equations to be CP > BICP > A > B2 > TCP > ADFCP (C) > NDFCP > B3 > B1.
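
The ranking can be reproduced directly from the PLS-NIRS RPD values quoted in the text. The short Python sketch below sorts the fractions by RPD and applies the RPD > 3 cutoff; the variable names are chosen here for illustration.

```python
# PLS-NIRS RPD values as reported in the text for the prediction set.
pls_rpd = {
    "CP": 5.07,
    "TCP": 3.31,
    "BICP": 3.98,
    "NDFCP": 1.98,
    "ADFCP (C)": 2.42,
    "A": 3.96,
    "B1": 1.20,
    "B2": 3.91,
    "B3": 1.62,
}

# Rank fractions from most to least accurately predicted.
ranking = sorted(pls_rpd, key=pls_rpd.get, reverse=True)

# Fractions meeting the RPD > 3 criterion of Williams and Norris (2001).
acceptable = [name for name in ranking if pls_rpd[name] > 3]
```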

SVM-NIRS Models of Protein Fractions.
It is important for SVM regression to choose the correct number of latent variables. In this research, optimal latent variables of these protein fractions were selected according to the minimum RMSECV. Taking into account all of the protein fractions, the number of latent variables of SVM-NIRS models in this study was 15, which meant that latent variables 1 to 15 were used in SVM regression. For instance, Figure 1 shows the values of the RMSECV using different numbers (1 to 15) of latent variables for fraction A.


Figure 1. Root mean square error of cross-validation (RMSECV) of different numbers of latent variables in support vector machine-near infrared reflectance spectroscopy model of fraction A.

 
After choosing the number of latent variables, leave-one-out cross-validation was carried out to select the proper kernel function by the lowest RMSECV among the various parameter combinations and the 4 types of kernel functions. The best modeling parameters used in the SVR-NIRS models of all these protein fractions are shown in Table 4. All protein fractions obtained the lowest RMSECV with the linear kernel function. All of the optimized C values were less than 10 and the ε values were below 0.05. Compared with the PLS models in Table 3, the SVM-NIRS models in this study strongly decreased the RMSECV value, which is a clear sign of improved accuracy of the SVM-NIRS models.


Table 4. Parameters used in support vector machine-near infrared reflectance spectroscopy models of protein fractions
 
The statistical results of SVM-NIRS model calibration and prediction are shown in Table 5. The R² and RPD values were 0.90 and 3.08 for NDFCP and 0.93 and 3.69 for ADFCP (C), respectively. The CP, TCP, BICP, A, and B2 fractions achieved greater RPD values, ranging from 5.81 to 8.68. According to Williams and Norris (2001), NIRS equations with RPD values greater than 3 indicate efficient NIRS predictions. The model can be used in quality control and even process control when the RPD value is greater than 5. Therefore, the SVM-NIRS models of CP and all protein fractions except B1 and B3 achieved sufficient accuracy for related research and practical applications. The RPD values of B1 and B3 were lower than 3, and these models were not recommended for practical use.
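
The RPD guidelines cited above can be written as a simple three-way rule and applied to the SVM-NIRS RPD values quoted in the text. The function name `rpd_class` and the class labels are hypothetical, and treating an RPD of exactly 3 or 5 as belonging to the lower class is an assumption, because the text only states "greater than".

```python
def rpd_class(rpd):
    """Classify a model by RPD following the thresholds cited in the text."""
    if rpd > 5:
        return "quality or process control"
    if rpd > 3:
        return "efficient prediction"
    return "not recommended"


# SVM-NIRS RPD values as reported in the text for the prediction set.
svm_rpd = {
    "CP": 8.68,
    "TCP": 8.26,
    "BICP": 6.11,
    "NDFCP": 3.08,
    "ADFCP (C)": 3.69,
    "A": 5.97,
    "B1": 2.67,
    "B2": 5.81,
    "B3": 2.51,
}

classes = {name: rpd_class(value) for name, value in svm_rpd.items()}
```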


Table 5. Statistics for calibration and prediction of support vector machine-near infrared reflectance spectroscopy models of laboratory and the Cornell Net Carbohydrate and Protein System protein fractions
 
The calibration and prediction of SVM-NIRS models were improved to varying extents compared with PLS models. The accuracy of the SVM model of CP increased, with RMSEP = 0.60% and RPD = 8.68 (Table 5), compared with 0.78% and 5.07 with the PLS model (Table 3). This result was similar to the reports in many other studies (Hoffman et al., 1999; Andrés et al., 2005; Mentink et al., 2006; Valdés et al., 2006). The reason for the accurate prediction of CP content by NIRS technology is the highly significant correlation between the absorption of amide bonds and the protein content analyzed by the Kjeldahl method (Shenk and Westerhaus, 1994).
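
The size of the improvement for CP can be worked out from the four values just quoted. The arithmetic below is a sketch added for illustration; the derived percentages are not reported in the original text.

```python
# CP statistics quoted in the text (RMSEP in %, RPD unitless).
pls_rmsep, svm_rmsep = 0.78, 0.60
pls_rpd_cp, svm_rpd_cp = 5.07, 8.68

# Relative reduction in RMSEP when moving from PLS to SVM.
rmsep_reduction = (pls_rmsep - svm_rmsep) / pls_rmsep

# Factor by which RPD increased.
rpd_gain = svm_rpd_cp / pls_rpd_cp
```

Under these figures the RMSEP falls by about 23% and the RPD rises by a factor of about 1.7.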

All of the SVM-NIRS models of protein fractions were improved, because the coefficient of determination, root mean square error, and RPD values of the SVM models were all better than those of the PLS models. Among those fractions, TCP, BICP, A, and B2 performed much better, with R² of 0.97, 0.97, 0.97, and 0.97, and RPD of 8.26, 6.11, 5.97, and 5.81, respectively. In contrast, the PLS-NIRS model R² values of those fractions in this study were between 0.91 and 0.94, and the RPD values ranged from 3.31 to 3.98. To our knowledge, the only related report, by Valdés et al. (2006), showed that fractions BICP, A, and B2 could not be accurately predicted in botanically heterogeneous permanent meadows by PLS-NIRS, because of the low R² values of 0.66, 0.42, and 0.00, and RPD values of 1.23, 1.79, and 0.96, respectively.

It is important to point out that the SVM-NIRS model accuracy of NDFCP and ADFCP (C) was increased to a practical extent, with R² and RPD values of 0.90 and 3.08 for NDFCP and 0.93 and 3.69 for ADFCP (C), respectively (Table 5). In the PLS models in this study, the R² and RPD values were 0.75 and 1.98 for NDFCP and 0.83 and 2.42 for ADFCP (Table 3). In other studies, predictions of NDFCP and ADFCP (C) by NIRS were also less accurate. Hoffman et al. (1999) observed unsuccessful NDFCP and ADFCP PLS-NIRS models for legume and grass silage, with low calibration R² (0.72, 0.77) and prediction R² (0.84, 0.42) values. Valdés et al. (2006) obtained an accurate NDFCP PLS-NIRS equation with R² and RPD values of 0.91 and 3.33, but the PLS-NIRS model had low accuracy for ADFCP, with R² and RPD values of 0.72 and 2.14.

According to Tables 3 and 5, the SVM-NIRS model accuracy of fractions B1 and B3 was also improved, in contrast with the PLS models in this study and in Valdés et al. (2006). However, with R² values of 0.83 and 0.73 and RPD values of 2.67 and 2.51, these 2 SVR-NIRS equations were still not practicable.

Because protein fractions B1 and B3 are calculated from the laboratory fractions TCP, BICP, NDFCP, and ADFCP, which were accurately predicted by SVR-NIRS equations in this study, they were recalculated from the 4 laboratory fractions predicted by SVR-NIRS and correlated with the original results calculated from laboratory-analyzed data. Table 6 shows the statistics of the 2 types of calculation results. B1 clearly achieved better performance from the calculation based on SVM-NIRS–predicted data than from direct prediction by the SVR-NIRS equations. The R² of prediction increased from 0.83 to 0.87, and the RPD value reached 3.61, making it adequate for practical use. Meanwhile, the new calculation result for B3 did not improve: R² was 0.75 and RPD decreased to 2.00.
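
The indirect route can be sketched as follows: subtract the predicted laboratory fractions sample by sample, then correlate the differences with the laboratory-derived values. In this sketch R² is taken as the squared Pearson correlation, which is an assumption about how the statistic was computed, and all numbers are hypothetical rather than data from this study.

```python
def indirect_fraction(minuend_pred, subtrahend_pred):
    """Element-wise difference of two predicted laboratory fractions,
    e.g. B1 = TCP - BICP or B3 = NDFCP - ADFCP."""
    return [m - s for m, s in zip(minuend_pred, subtrahend_pred)]


def r_squared(predicted, reference):
    """Squared Pearson correlation between predicted and reference values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    s_pr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_rr = sum((r - mean_r) ** 2 for r in reference)
    return s_pr ** 2 / (s_pp * s_rr)


# Hypothetical SVR-NIRS predictions for 4 samples (% of DM).
tcp_pred = [14.0, 15.0, 13.0, 16.0]
bicp_pred = [12.5, 13.0, 12.0, 13.5]

# Hypothetical B1 values calculated from wet chemistry for the same samples.
b1_lab = [1.4, 2.1, 1.1, 2.4]

b1_indirect = indirect_fraction(tcp_pred, bicp_pred)
b1_r2 = r_squared(b1_indirect, b1_lab)
```

Because the indirect value is a difference of two predictions, its error combines the errors of both; this may help explain why the approach helped B1 but not B3, whose values are very small relative to the prediction errors of NDFCP and ADFCP.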


Table 6. Correlation statistics between the directly predicted results by support vector machine-near infrared reflectance spectroscopy (SVM-NIRS) and indirectly calculated results from other laboratory fraction constituents based on SVM-NIRS prediction in the prediction sample set
 
Generally, high accuracy of NIRS prediction has 3 basic requirements: 1) precise measurement by the laboratory procedures; 2) exact qualitative and quantitative analysis of the characteristic chemical bonds (such as those involving nitrogen), because detection of characteristic information by NIRS can be strongly interfered with by differences between similar chemical bonds or by other intensely absorbing bonds; and 3) a certain content and range of nutrient values above the NIRS limit of determination (0.1%). According to the error statistics in Table 2, the Sr values of CP and all laboratory protein fractions ranged from 0.01 to 0.09%, which showed that the wet chemistry determinations of those fractions were done accurately in the present study. Therefore, errors of wet chemistry determinations were not the main reason for the differing accuracy of the NIRS equations. However, because of the low range, expressed by the low SD value (<1%), and the low content of NDFCP and ADFCP (C), those 2 fractions did not yield precise NIRS equations by PLS regression, although the RSrR values of NDFCP (170.33) and ADFCP (C) (139.44) were similar to those of other protein fractions. Results in Valdés et al. (2006) could be helpful for explaining our results, because they obtained an accurate NDFCP NIRS equation with mean and SD values of 12.3 and 2.26%, but an inaccurate equation for ADFCP with lower mean and SD values of 4.1 and 1.44%. The CNCPS fractions were calculated from laboratory protein fractions, so the errors, ranges, and contents of CNCPS protein fractions were affected by the related laboratory protein fractions. As a result, the performance of NIRS prediction of CNCPS protein fractions was directly related to the accuracy of the NIRS equations of the corresponding laboratory protein fractions. Therefore, protein fractions A and B2, which were calculated as CP − TCP and BICP − NDFCP, obtained more precise NIRS equations, whereas PLS-NIRS predictions of fractions B1 and B3, which had lower range, SD, and average values, were not accurate.

Support vector machine regression, which is based on the structural risk minimization principle of statistical learning theory, improved the accuracy of NIRS prediction of all protein fractions. More significantly, the performance of protein fractions NDFCP, ADFCP (C), and B1 was increased to an accurate level, with R² ≥ 0.87 and RPD > 3.0, by direct and indirect NIRS prediction. This mainly resulted from the better performance of SVM regression when treating nonlinear data such as those 3 fractions. Because the range (from 0.89 to 4.67%) and SD (0.30 to 1.91%) were low, nonlinear relationships were produced between spectral absorbance data and the content of those nutrients. The values of these fractions were too concentrated to be predicted precisely by PLS-NIRS, whereas a better result was achieved by application of the SVM method. However, both the PLS and SVM methods failed to give an accurate NIRS prediction for protein fraction B3, even though it was improved by SVM regression compared with PLS-NIRS prediction. This resulted from a combination of the lowest content among the fractions (average 0.23%), which was already close to the limit of determination of NIRS technology, the smallest variability (SD = 0.20%) of all the fractions, and potential errors transferred from the laboratory procedures (NDFCP and ADFCP).


    CONCLUSIONS
 
In general, the SVM method was preferable to PLS in calibrating NIRS models of protein fractions. In many related studies, the PLS-NIRS approach had difficulty producing models with good generalization ability when dealing with nonlinear data sets. For instance, all the protein fractions with smaller SD values (<1%) in this study (NDFCP, ADFCP [C], B1, and B3) yielded PLS models with accuracy too low for practical use (R² < 0.85, RPD <3). Protein fractions could probably be predicted more accurately by NIRS in forages and feedstuffs that have greater N concentrations. However, the application of the SVM regression method in NIRS made considerable progress in improving the accuracy of predicting laboratory and CNCPS protein fraction content. Compared with the known research, this study showed the potential of accurate and rapid prediction of all the laboratory protein fractions (TCP, BICP, NDFCP, and ADFCP) and the primary CNCPS fractions A, B2, and C by SVM-NIRS technology. In addition, fraction B1 was accurately calculated indirectly from the SVR-NIRS prediction results. These better results are explained by SVM’s specific optimization procedure, which avoids overfitting. Because of this advantage, SVM can achieve greater generalization ability, especially with nonlinear data sets such as the NDFCP and ADFCP contents in this study. This result will be useful for applications of both the CNCPS and NIRS technology in dairy nutrition research. Further work should broaden the sample variation (such as species and cultivars) to increase the nitrogen content and range, particularly for NDFCP, ADFCP, and B3.


    ACKNOWLEDGEMENTS
 
This work was supported by the National Agricultural Science and Technology Program (No.nyhyzx07–022) and 948 Program of China Agriculture Ministry (No. 2006-G38). The authors thank the Institute of Forage Production of Gansu Branch of Chengdu Daye International Investment Co. Ltd (Jiuquan, Gansu, China) for the chemical analysis work.

Received for publication January 1, 2008. Accepted for publication March 10, 2008.


    REFERENCES


Abrams, S. M., J. S. Shenk, M. O. Westerhaus, and F. E. Barton. 1987. Determination of forage quality by near infrared reflectance spectroscopy: Efficacy of broad-based calibration equations. J. Dairy Sci. 70:806–813.

Andrés, S., A. Calleja, S. López, Á. R. Mantecón, and F. J. Giráldez. 2005. Nutritive evaluation of herbage from permanent meadows by near-infrared reflectance spectroscopy: 2. Prediction of crude protein and dry matter degradability. J. Sci. Food Agric. 85:1572–1579.

AOAC. 2000. Official Methods of Analysis. 17th ed. Association of Official Analytical Chemists, Washington, DC.

Blaxter, K. L. 1956. The nutritive values of foods as sources of energy: A review. J. Dairy Sci. 39:1396–1424.

Fernández Pierna, J. A., V. Baeten, A. Michotte Renier, R. P. Cogdill, and P. Dardenne. 2004. Combination of support vector machines (SVM) and near-infrared (NIR) imaging spectroscopy for the detection of meat and bone meal (MBM) in compound feeds. J. Chemometr. 18:341–349.

Fox, D. G., M. C. Barry, R. E. Pitt, D. K. Roseler, and W. C. Stone. 1995. Application of the Cornell Net Carbohydrate and Protein Model for cattle consuming forages. J. Anim. Sci. 73:267–277.

Givens, D. I., J. L. De Boever, and E. R. Deaville. 1997. The principles, practices and some future applications of near infrared spectroscopy for predicting the nutritive value of foods for animals and humans. Nutr. Res. Rev. 10:83–114.

Gu, T., W. Lu, X. Bao, and N. Chen. 2006. Using support vector regression for the prediction of the band gap and melting point of binary and ternary compound semiconductors. Solid State Sci. 8:129–136.

Hoffman, P. C., N. M. Brehm, L. M. Bauman, J. B. Peters, and D. J. Undersander. 1999. Prediction of laboratory and in situ protein fractions in legume and grass silages using near-infrared reflectance spectroscopy. J. Dairy Sci. 82:764–770.

Knowlton, K. F., R. E. Pitt, and D. G. Fox. 1992. Dynamic model prediction of the value of reduced solubility of alfalfa silage protein for lactating dairy cows. J. Dairy Sci. 75:1507–1516.

Krishnamoorthy, U., T. V. Muscato, C. J. Sniffen, and P. J. Van Soest. 1982. Nitrogen fractions in selected feedstuffs. J. Dairy Sci. 65:217–225.

Licitra, G., T. M. Hernandez, and P. J. Van Soest. 1996. Standardization of procedures for nitrogen fractionation of ruminant feeds. Anim. Feed Sci. Technol. 57:347–358.

Liu, X., W. C. Lu, S. L. Jin, Y. W. Li, and N. Y. Chen. 2006. Support vector regression applied to materials optimization of sialon ceramics. Chemom. Intell. Lab. Syst. 82:8–14.

Mentink, R. L., P. C. Hoffman, and L. M. Bauman. 2006. Utility of near-infrared reflectance spectroscopy to predict nutrient composition and in vitro digestibility of total mixed rations. J. Dairy Sci. 89:2320–2326.

Naes, T., T. Isaksson, T. Fearn, and T. Davies. 2002. A User-Friendly Guide to Multivariate Calibration and Classification. NIR Publications, Chichester, UK.

Nie, Z. D., J. G. Han, Z. Yu, L. D. Zhang, J. H. Li, Y. Zhong, and F. Y. Liu. 2007. Quality prediction of alfalfa hay using Fourier transform near infrared reflectance spectroscopy. Spectrosc. Spect. Anal. 27:1308–1311.

Norris, K. H., R. F. Barnes, J. E. Moore, and J. S. Shenk. 1976. Predicting forage quality by infrared reflectance spectroscopy. J. Anim. Sci. 43:889–897.

Shenk, J. S., I. Landa, M. R. Hoover, and M. O. Westerhaus. 1981. Description and evaluation of a near infrared reflectance spectro-computer for forage and grain analysis. Crop Sci. 21:355–358.

Shenk, J. S., and M. O. Westerhaus. 1994. Forage Quality, Evaluation and Utilization. American Society of Agronomy, Crop Science Society of America and Soil Science Society of America, Madison, WI.

Sniffen, C. J., J. D. O’Connor, P. J. Van Soest, D. G. Fox, and J. B. Russell. 1992. A net carbohydrate and protein system for evaluating cattle diets. II. Carbohydrate and protein availability. J. Anim. Sci. 70:3562–3577.

Valdés, C., S. Andres, F. J. Giraldez, R. Garcia, and A. Calleja. 2006. Potential use of visible and near infrared reflectance spectroscopy for the estimation of nitrogen fractions in forages harvested from permanent meadows. J. Sci. Food Agric. 86:308–314.

Van Soest, P. J., J. B. Robertson, and B. A. Lewis. 1991. Methods for dietary fiber, neutral detergent fiber and nonstarch polysaccharides in relation to animal nutrition. J. Dairy Sci. 74:3583–3597.

Vapnik, V. 1998. Statistical Learning Theory. John Wiley and Sons Inc., New York, NY.

Williams, P., and K. Norris. 2001. Near-infrared technology in the agricultural and food industries. Am. Assoc. Cereal Chem., St. Paul, MN.

Williams, P. C. 1987. Variables affecting near-infrared reflectance spectroscopic analysis. Pages 143–168 in Near-infrared Technology in the Agricultural and Food Industries. Am. Assoc. Cereal Chem., St. Paul, MN.

Wu, D., S. Feng, and Y. He. 2007. Infrared spectroscopy technique for the nondestructive measurement of fat content in milk powder. J. Dairy Sci. 90:3613–3619.



