College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310029, China
1 Corresponding author: yhe{at}zju.edu.cn
| ABSTRACT |
|---|
Key Words: near- and mid-infrared spectroscopy, fat, milk powder, least-squares support vector machine
| INTRODUCTION |
|---|
Fat is an important nutritional component of milk powder. Improper fat intake can cause obesity in humans and consequently increase the incidence of hypertension, coronary heart disease, and diabetes. Currently, several methods are widely applied for measuring the fat content of dairy products, including the Röse-Gottlieb, Soxhlet extraction, Babcock, and Gerber methods (Rosenthal and Baruch, 1993), and gas chromatography (Deeth et al., 1983). However, these methods share several inherent problems: they are time consuming and complex, and they require sample destruction.
Infrared spectroscopy techniques, such as near infrared and mid infrared spectroscopy (NIRS and MIRS), offer quick analysis, minimal sample preparation requirements, and low cost, and they have been widely applied as fast and nondestructive analytical methods. Although infrared spectroscopy techniques cannot always provide definitive compositional information about a food sample, they can often provide a means of screening food products for qualitative attributes without the involvement of time-consuming chemical analyses in laboratories (Reid et al., 2005). Near infrared light (4,000 to 12,500 cm⁻¹) lies between the visible and mid infrared regions and has been widely applied for the detection of protein, fat, and total solids in cheese and milk (Rodriguez-Otero et al., 1995; Laporte and Paquin, 1999). Hermida et al. (2001) applied NIRS to analyze moisture, SNF, and fat in butter. Sorensen and Jepsen (1998) assessed the sensory properties of cheese by using NIRS. The NIRS bands mainly correspond to C–H, O–H, and N–H vibrations, which originate from fundamental bands in the mid infrared region. Mid infrared light, scanned in the region of 400 to 4,000 cm⁻¹, is applied to the detection of compositional differences between samples based on vibrations of various chemical groups at specific wavelengths in the mid infrared region of the spectrum (Reid et al., 2005). The frequencies and intensities of specific fundamental absorption bands in MIRS can provide information related to the relevant functional groups.
However, because NIRS bands arise from overtones and combinations, the molar absorptivity of NIRS is low (i.e., low sensitivity). In contrast, the MIRS bands are fundamental; thus, the peaks of MIRS are specific, sharp, and sensitive (i.e., high molar absorptivity). The spectral reproducibility of MIRS, including the signal-to-noise ratio, is poor compared with NIRS (Chung et al., 1999). Thus, the 2 techniques have independent advantages and disadvantages and need to be considered and applied according to different requirements. In recent years, several articles have compared the performance of MIRS and NIRS. Reid et al. (2005) applied MIRS and NIRS techniques, respectively, to differentiate apple juice samples on the basis of heat treatment and variety. Bras et al. (2005) compared and combined NIRS and MIRS spectra in calibrations of soybean flour. Chung et al. (1999) compared NIRS and MIRS for the determination of distillation properties of kerosene. Takeuchi et al. (2006) investigated the states of H₂O adsorbed on oxides by NIRS and MIRS.
Because milk powder is a highly complex matrix, the corresponding infrared spectral data contain highly overlapped peaks regardless of bandwidth. It is necessary to use chemometric methods to extract the relevant quantitative information. Although the establishment of a chemometric model is time consuming, analysis can be performed quickly once the model is established (Bras et al., 2005). In this study, a support vector machine (SVM) was developed using both NIRS and MIRS.
Support vector machine is a universal learning method proposed by Vapnik (1998). It uses a function called a kernel to map the input data space to a high-dimensional feature space with fewer training data. In this space, the problem becomes linearly separable (Burges, 1999). Support vector machine applies the structural risk minimization principle, which is superior to the traditional empirical risk minimization principle used by conventional neural networks, to avoid overfitting and multidimensional problems. Least-squares support vector machine (LS-SVM; Suykens et al., 2002) is a modified version of SVM that applies least squares error in the training error function. Least-squares support vector machine has the capability for linear and nonlinear multivariate calibration and solves multivariate calibration problems in a relatively rapid way (Suykens and Vandewalle, 1999). The learning problem is formulated and represented as a convex quadratic programming problem (Lu et al., 2003) to obtain the support vectors. It adopts a least-squares linear system as the loss function and is applied in pattern recognition and nonlinear evaluation. Due to its attractive advantages and excellent performance, LS-SVM has attracted attention and gained extensive application in spectral analysis (Chauchard et al., 2004; Borin et al., 2006).
The aim of this study was to investigate the potential of the infrared spectroscopy technique for nondestructive measurement of fat content in milk powder. The LS-SVM model was compared with a back-propagation artificial neural network (BP-ANN) model based on statistical parameters of prediction results. Moreover, to determine the optimum spectral range, this study evaluated the detection of fat content based on NIRS and MIRS.
| MATERIALS AND METHODS |
|---|
25°C. Each milk powder sample was mixed with potassium bromide at a ratio of 1:49 to enhance the transmission rate. The mixture was compressed into a uniform tablet with a diameter of 5 mm and a thickness of 2 mm and then scanned by the spectrometer. Each sample was scanned 40 times; these 40 data points were averaged as the transmission value (%) of the sample. The spectral data of approximately 60 samples were measured for each brand, and finally the spectral data of 409 samples were obtained. Spectra Manager CFR software (Jasco) was used for spectral measurement and analysis. Due to system imperfections, obvious scattering noise was observed at the beginning and end of the spectral curve. Thus, the first 52 and last 1,175 transmission data points were eliminated to improve the measurement accuracy. The NIRS analysis was based on 4,000 to 6,666 cm⁻¹ and the MIRS analysis on 400 to 4,000 cm⁻¹. The fat content of each sample was measured by the Röse-Gottlieb method following GB/T 5413.3-1997 (National Standard of China; AOAC, 1990). The fat content value was expressed as grams of fat per 100 g of milk powder.
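The preprocessing described above (averaging repeated scans, dropping noisy end points, and selecting a spectral region) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the spectrum is assumed to be a plain list of transmission values with a parallel list of wavenumbers, and both region bounds are assumed to be inclusive.

```python
def average_scans(scans):
    """Average repeated scans of one sample, point by point."""
    if not scans:
        raise ValueError("at least one scan is required")
    n_points = len(scans[0])
    if any(len(scan) != n_points for scan in scans):
        raise ValueError("all scans must have the same number of points")
    return [sum(values) / len(scans) for values in zip(*scans)]


def trim_edges(spectrum, n_head=52, n_tail=1175):
    """Drop the noisy points at the start and end of a spectrum.

    The defaults mirror the counts quoted in the text; the function
    itself is a hypothetical helper.
    """
    if n_head + n_tail >= len(spectrum):
        raise ValueError("trimming would remove the whole spectrum")
    return spectrum[n_head:len(spectrum) - n_tail]


def select_region(wavenumbers, spectrum, low, high):
    """Keep the points whose wavenumber (cm^-1) lies in [low, high].

    Inclusive bounds are an assumption; the text does not say which
    region the 4,000 cm^-1 boundary point belongs to.
    """
    pairs = [(w, t) for w, t in zip(wavenumbers, spectrum) if low <= w <= high]
    return [w for w, _ in pairs], [t for _, t in pairs]
```

With real data, `average_scans` would receive the 40 scans of one tablet, and `select_region(wavenumbers, spectrum, 4000, 6666)` or `select_region(wavenumbers, spectrum, 400, 4000)` would yield the NIRS or MIRS inputs, respectively.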
LS-SVM
Assume a set of training samples is given as $\{x_k, y_k\}_{k=1}^{N}$, with the input $x_k \in \mathbb{R}^N$ and the output $y_k \in \mathbb{R}$. The LS-SVM model is given as $y(x) = w^T \varphi(x) + b$ at the beginning, where $w \in \mathbb{R}^n$ is the weight vector and $b$ is the bias. A nonlinear mapping function, $\varphi(\cdot)$, is applied to transform the low dimensional nonlinear input data space into a higher dimensional feature space. The regularized least-squares cost function is given as follows:
$$\min_{w,b,e} J(w, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{k=1}^{N} e_k^2$$
The constraint is given as follows:
$$y_k = w^T \varphi(x_k) + b + e_k, \quad k = 1, \ldots, N$$
where $\gamma$ is the regularization parameter that balances the model's complexity and the training errors, and $e_k$ is the random error. Then, the Lagrange function was adopted to solve this optimization problem:
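$$L(w, b, e, \alpha) = J(w, e) - \sum_{k=1}^{N} \alpha_k \left\{ w^T \varphi(x_k) + b + e_k - y_k \right\}$$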
where $\alpha_k$ is a Lagrange multiplier called the support value. The equation is optimized by partially differentiating with respect to each variable:
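$$\frac{\partial L}{\partial w} = 0 \rightarrow w = \sum_{k=1}^{N} \alpha_k \varphi(x_k), \quad \frac{\partial L}{\partial b} = 0 \rightarrow \sum_{k=1}^{N} \alpha_k = 0, \quad \frac{\partial L}{\partial e_k} = 0 \rightarrow \alpha_k = \gamma e_k, \quad \frac{\partial L}{\partial \alpha_k} = 0 \rightarrow w^T \varphi(x_k) + b + e_k - y_k = 0$$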
When the variables $w$ and $e$ are eliminated, the conditions can be rewritten as a set of linear equations:
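$$\begin{bmatrix} 0 & 1_v^T \\ 1_v & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$$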
with
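$$y = [y_1, \ldots, y_N]^T, \quad 1_v = [1, \ldots, 1]^T, \quad \alpha = [\alpha_1, \ldots, \alpha_N]^T, \quad \Omega_{ij} = \varphi(x_i)^T \varphi(x_j), \quad i, j = 1, \ldots, N$$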
According to Mercer's theorem (Vapnik, 1995), there is a relationship between the mapping function, $\varphi(\cdot)$, and the kernel function, $K(x_i, x_j)$, as
$$K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$$
There are several common kernel functions, such as the linear, polynomial, radial basis function (RBF), and multilayer perceptron (MLP) kernels. In our work, the RBF kernel was selected as the kernel function:
$$K(x_i, x_j) = \exp\left( -\frac{\lVert x_i - x_j \rVert^2}{\sigma^2} \right)$$
The LS-SVM regression model can be obtained thus:
$$y(x) = \sum_{k=1}^{N} \alpha_k K(x, x_k) + b$$
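The derivation above reduces LS-SVM training to a single linear system in the bias and the support values. The following is a minimal sketch of that idea using only the Python standard library; it is not the LS-SVM toolbox used in this study. The function names are hypothetical, the RBF kernel is assumed to take the form exp(-||x - z||² / sig2), and a small Gaussian elimination routine stands in for a proper linear algebra library.

```python
import math


def rbf_kernel(x, z, sig2):
    """RBF kernel; the exp(-||x - z||^2 / sig2) scaling is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sig2)


def solve_linear_system(A, rhs):
    """Solve A u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    u = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * u[j] for j in range(i + 1, n))
        u[i] = (M[i][n] - s) / M[i][i]
    return u


def lssvm_train(X, y, gam, sig2):
    """Build and solve the (N+1) x (N+1) LS-SVM system; return (b, alpha)."""
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = 1.0  # first row: sum of the support values equals 0
        A[i + 1][0] = 1.0  # first column multiplies the bias b
        for j in range(n):
            A[i + 1][j + 1] = rbf_kernel(X[i], X[j], sig2)
        A[i + 1][i + 1] += 1.0 / gam  # ridge term gamma^-1 on the diagonal
    solution = solve_linear_system(A, [0.0] + list(y))
    return solution[0], solution[1:]


def lssvm_predict(X_train, alpha, b, sig2, x):
    """Evaluate y(x) = sum_k alpha_k K(x, x_k) + b."""
    return sum(a * rbf_kernel(x, xk, sig2) for a, xk in zip(alpha, X_train)) + b
```

Each training sample contributes one support value, so the system grows with the number of calibration samples; for a few hundred spectra this stays small enough to solve directly.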
In the SVM or LS-SVM process, a proper kernel function and the best kernel parameters must be determined. The RBF kernel is a nonlinear function and a more compactly supported kernel. The RBF kernel can reduce the computational complexity of the training procedure while giving good performance under general smoothness assumptions. Proper parameter setting plays a crucial role in building a good LS-SVM regression model with high prediction accuracy and stability. We used a grid-search technique to determine the optimal values of the regularization parameter gam ($\gamma$) and the RBF kernel function parameter sig2 ($\sigma^2$), which is the bandwidth in the common case of the RBF kernel. Parameter $\gamma$ determines the trade-off between structural risk minimization and empirical risk minimization and is important to improve the generalization performance of the LS-SVM model. Parameter $\sigma^2$ controls the value of the function regression error and directly influences the number of initial eigenvalues or eigenvectors. Small values of $\sigma^2$ yield a large number of regressors and eventually can lead to overfitting. On the contrary, a large value of $\sigma^2$ can lead to a reduced number of regressors, making the model simpler, but ultimately less accurate. Moreover, $\sigma^2$ reflects the sensitivity of the LS-SVM model to the noise from input variables. All the aforementioned calculations were performed using Matlab 7.0 (The MathWorks, Natick, MA). The freely available LS-SVM toolbox (LS-SVM v. 1.5, J. A. K. Suykens, K. U. Leuven, Leuven, Belgium) was applied with Matlab to derive all of the LS-SVM models.
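The grid-search idea can be illustrated with a short sketch. It assumes candidate values spaced evenly on a log2 scale and a coarse pass followed by a finer pass around the best coarse point, which is a common refinement but not necessarily the toolbox's exact procedure. The callable `cv_error` is a hypothetical placeholder for whatever cross-validation error routine is available, and `toy_cv_error` is an invented smooth surface used only to exercise the search.

```python
import math


def log2_grid(lo_exp, hi_exp, n):
    """Return n values 2**e with e evenly spaced in [lo_exp, hi_exp]."""
    step = (hi_exp - lo_exp) / (n - 1)
    return [2.0 ** (lo_exp + i * step) for i in range(n)]


def grid_search(cv_error, gam_values, sig2_values):
    """Evaluate every (gam, sig2) pair and return (error, gam, sig2) of the best."""
    best_err, best_gam, best_sig2 = float("inf"), None, None
    for gam in gam_values:
        for sig2 in sig2_values:
            err = cv_error(gam, sig2)
            if err < best_err:
                best_err, best_gam, best_sig2 = err, gam, sig2
    return best_err, best_gam, best_sig2


def two_stage_search(cv_error, gam_exp, sig2_exp, n=10):
    """Coarse n x n pass, then a finer n x n pass around the coarse optimum."""
    coarse_gam = log2_grid(gam_exp[0], gam_exp[1], n)
    coarse_sig2 = log2_grid(sig2_exp[0], sig2_exp[1], n)
    _, gam0, sig20 = grid_search(cv_error, coarse_gam, coarse_sig2)

    # The fine grid spans one coarse step on each side of the coarse optimum,
    # clipped to the original exponent range.
    gam_step = (gam_exp[1] - gam_exp[0]) / (n - 1)
    sig2_step = (sig2_exp[1] - sig2_exp[0]) / (n - 1)
    gam_c, sig2_c = math.log2(gam0), math.log2(sig20)
    fine_gam = log2_grid(max(gam_exp[0], gam_c - gam_step),
                         min(gam_exp[1], gam_c + gam_step), n)
    fine_sig2 = log2_grid(max(sig2_exp[0], sig2_c - sig2_step),
                          min(sig2_exp[1], sig2_c + sig2_step), n)
    return grid_search(cv_error, fine_gam, fine_sig2)


def toy_cv_error(gam, sig2):
    """Invented error surface with its minimum at gam = 2**5.3, sig2 = 2**4.1."""
    return (math.log2(gam) - 5.3) ** 2 + (math.log2(sig2) - 4.1) ** 2
```

A call such as `two_stage_search(toy_cv_error, (-1.0, 10.0), (1.0, 15.0))` returns the smallest error found together with the corresponding `gam` and `sig2`; the exponent ranges in this call are illustrative. In practice `toy_cv_error` would be replaced by a routine that trains the model on each cross-validation fold and returns the resulting error.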
| RESULTS AND DISCUSSION |
|---|
A total of 409 samples were split randomly into 2 groups: 339 were used for calibration, and the remaining 70 samples (10 for each brand) were used for prediction. Least-squares support vector machine was performed with the RBF kernel. Parameters $\gamma$ and $\sigma^2$ were optimized in the range of $2^{-1}$ to $2^{10}$ for $\gamma$ and $2$ to $2^{15}$ for $\sigma^2$. For each combination of $\gamma$ and $\sigma^2$ parameters, the root mean square error of cross-validation (RMSECV) was calculated, and the optimum parameters were selected when the minimum RMSECV was obtained. The optimizing process is shown in Figure 2. The grid indicated by the diamond symbols in the first step is 10 × 10, and the searching step in the first step is large. The optimal search area was determined by the error contour line. The grid indicated by the "×" symbols in the second step is 10 × 10, and the searching step in the second step is smaller. The optimal search area was determined based on the first step. Finally, the optimal pair of ($\gamma$, $\sigma^2$) was determined as 55.168 and 19.5531, respectively. To assess the relative robustness of the LS-SVM regression method for the fat content, the determination coefficient for prediction ($R^2_P$) and the root mean square error for prediction (RMSEP) were calculated. The high $R^2_P$ of 0.9796 with low RMSEP of 0.836708 showed that LS-SVM had a strong ability for regression analysis (Figure 3).
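The two prediction statistics can be computed as in the sketch below. It assumes that the determination coefficient is defined as 1 minus the ratio of the residual sum of squares to the total sum of squares; the squared correlation between reference and predicted values is another convention, and the text does not state which was used. The function names are hypothetical.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over the prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_prediction(y_true, y_pred):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("reference values must not all be identical")
    return 1.0 - ss_res / ss_tot
```

Both functions take the reference fat contents (here, the Röse-Gottlieb values) and the model predictions for the same samples, in the same order.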
From Table 1, it was concluded that the LS-SVM model performed better in the MIRS region than in the NIRS region. By analyzing the original spectral transmission plots, it was seen that the curves in the MIRS region were more complex than those in the NIRS region. The main difference among the curves in the NIRS plot was an offset, and there was a lot of high-frequency noise in the NIRS region. These characteristics contribute little to establishing the model. In the MIRS region, the curves are smoother, and there are great differences among the brands of milk powder. Because the MIRS bands are the fundamental bands from which the NIRS bands originate, the MIRS region provides more information about frequencies and intensities (Chung et al., 1999). Although it is hard to distinguish each brand by visual examination, the prediction results of the 2 LS-SVM models showed that these complex curves contain more useful information than parallel curves and demonstrated that the LS-SVM model could extract the useful information successfully. Moreover, results based on the entire infrared region are more accurate than results from MIRS or NIRS alone; in particular, the RMSEP values in both the MIRS and NIRS regions are larger than that in the whole infrared region. However, the prediction results of fat content detection in either the MIRS or the NIRS region could satisfy the demand of consumers. Moreover, data processing based on the entire infrared spectral region is more complex, and an instrument covering the whole infrared spectral region is expensive. Thus, the infrared spectral region should be chosen according to the required application.
| CONCLUSIONS |
|---|
| ACKNOWLEDGEMENTS |
|---|
Received for publication March 4, 2007. Accepted for publication April 18, 2007.
| REFERENCES |
|---|