JDS
Journal of Dairy Science Vol. 85 No. 9 2081-2097
© 2002 by American Dairy Science Association ®

Statistical Data Validation Methods for Large Cheese Plant Database

S. A. Jimenez-Marquez*, C. Lacroix*, and J. Thibault†

* Dairy Research Centre STELA, Pavillon Paul-Comtois, Université Laval, Québec, QC, Canada G1K 7P4
† Department of Chemical Engineering, 161 Louis Pasteur, University of Ottawa, Ottawa, ON, Canada K1N 6N5

Corresponding author: C. Lacroix; e-mail: christophe.lacroix@aln.ulaval.ca.

Production data of the cheesemaking process are used to monitor milk fat and protein recoveries in cheese, cheese yield, and composition and eventually to predict these parameters. Due to the large impact of these factors on cheese quality and plant profitability, it is very important to use reliable data for analysis, modeling, and control of the process. This paper tested six methods for detecting erroneous data in industrial cheesemaking databases. The data analyzed came from 4 yr of stirred-curd Cheddar cheese production in an industrial cheesemaking facility, comprising over 10,000 vats. Single vat outliers were detected using a simple statistical criterion of mean ± 3.6 SD on single variable distributions, Fourier series modeling of seasonal variables (fat, protein, lactose, and total solids in milk, and protein in whey), and the multivariate Mahalanobis outlier analysis. Detection of outlier productions (corresponding to several vats) was done by applying the mean ± 3.6 SD criterion to variables obtained through calculating the fat mass balance, fat retention coefficient, and yield efficiency. Data treatment enabled the detection of outlier data and also pinpointed variables with a low reliability (manually registered times). Single variable and multivariable methods proved complementary, and the use of both types of methods is recommended when validating an existing database.
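The three single-vat screens named above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the 365.25-day period, the choice of two harmonics, and the use of plain sample mean and covariance estimates are all assumptions, and the abstract gives no cutoff for the Mahalanobis distance, so none is applied here.

```python
import numpy as np


def ssc_outliers(x, k=3.6):
    """Simple statistical criterion: flag values outside mean +/- k * SD."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) > k * x.std(ddof=1)


def fourier_residuals(t, y, period=365.25, n_harmonics=2):
    """Fit a truncated Fourier series by least squares; return residuals.

    `period` (in the units of t) and `n_harmonics` are assumed values.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = [np.ones_like(t)]
    for h in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * h * t / period
        cols.append(np.cos(w))
        cols.append(np.sin(w))
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return np.sqrt(np.maximum(d2, 0.0))
```

For a seasonal variable such as milk fat, applying `ssc_outliers` to the output of `fourier_residuals` rather than to the raw values keeps the seasonal swing from masking a bad record. The Mahalanobis distance can flag a vat whose individual values each look ordinary but whose combination is unusual, which is one way to read the finding that single variable and multivariable methods are complementary.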

Abbreviation key: Eact-th = yield efficiency based on actual yield and theoretical yield, Eadj-th = yield efficiency based on adjusted actual yield and theoretical yield, FMB = fat mass balance, Kf = fat retention coefficient, MAE = mean absolute error, P1 or P2 = first or second principal component of transformed mass of fat in milk and that in cheese and whey, ΔR = percent decrease of the range of the data, S-CM = starter and its culture medium, ΔSD = percent decrease of the standard deviation of the data, SD/R = ratio of the standard deviation to the range expressed in percentage, SSC = simple statistical criterion, Yact = actual yield, YVSPmod = modified Van Slyke & Price yield formula, Yadj = adjusted actual yield, Yth = theoretical yield
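The summary statistics ΔSD, ΔR, and SD/R defined in the key can be computed as in the sketch below. It assumes that the "percent decrease" compares a variable before and after outlier removal, which the abstract does not state explicitly, and that SD/R is taken on the cleaned data; the function and key names are hypothetical.

```python
import numpy as np


def validation_summary(raw, cleaned):
    """Percent decrease in SD and range after cleaning, plus SD/R in percent.

    Assumes `raw` is the variable before outlier removal and `cleaned`
    is the same variable afterwards (an interpretation, not a quoted method).
    """
    raw = np.asarray(raw, dtype=float)
    cleaned = np.asarray(cleaned, dtype=float)
    sd_raw, sd_clean = raw.std(ddof=1), cleaned.std(ddof=1)
    r_raw = raw.max() - raw.min()
    r_clean = cleaned.max() - cleaned.min()
    return {
        "delta_SD": 100.0 * (sd_raw - sd_clean) / sd_raw,
        "delta_R": 100.0 * (r_raw - r_clean) / r_raw,
        "SD_over_R": 100.0 * sd_clean / r_clean,
    }
```

A large ΔR paired with a small ΔSD would suggest that cleaning removed a few extreme values without changing the bulk of the distribution.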

Key Words: statistical cheese data validation • seasonal variation • fat mass balance • yield efficiency






