Abstract
Principal component analysis (PCA) is the most commonly used dimensionality reduction technique for detecting and diagnosing faults in chemical processes. Although PCA contains certain optimality properties in terms of fault detection, and has been widely applied for fault diagnosis, it is not best suited for fault diagnosis. Discriminant partial least squares (DPLS) has been shown to improve fault diagnosis for small-scale classification problems as compared with PCA. Fisher's discriminant analysis (FDA) has advantages from a theoretical point of view. In this paper, we develop an information criterion that automatically determines the order of the dimensionality reduction for FDA and DPLS, and show that FDA and DPLS are more proficient than PCA for diagnosing faults, both theoretically and by applying these techniques to simulated data collected from the Tennessee Eastman chemical plant simulator. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: Fault diagnosis; Process monitoring; Pattern classification; Discriminant analysis; Chemometric methods; Fault detection; Large scale systems; Multivariate statistics; Dimensionality reduction; Principal component analysis; Discriminant partial least squares; Fisher's discriminant analysis
…determines the most accurate lower dimensional representation of the data in terms of capturing the data directions that have the most variance. The resulting lower dimensional models have been used for detecting out-of-control status and for diagnosing disturbances leading to the abnormal process operation [2–7]. Several applications of PCA to real chemical data have been conducted at DuPont and other companies over the past 6 years, with much of the results published in conference proceedings and journal articles [8–11]. Several academics have performed similar studies based on data collected from computer simulations of processes [3–6,12–15].

FDA provides an optimal lower dimensional representation in terms of discriminating among classes of data [16,17], where for fault diagnosis, each class corresponds to data collected during a specific known fault. Although FDA has been heavily studied in the pattern classification literature and is only slightly more complex than PCA, its use for analyzing chemical process data is not described in the literature. This is interesting, since FDA should outperform PCA when the primary goal is to discriminate among faults. We suspect that part of the reason that FDA has been ignored in the chemical process control literature is that more chemical engineers read the statistics literature (where PCA is dominant) than the pattern classification literature (where FDA is dominant).

Discriminant partial least squares (DPLS), also known as discriminant projection to latent structures, is a data decomposition method for maximizing the covariance between the predictor (independent) block X and the predicted (dependent) block Y for each component, where the predicted variables are dummy variables (1 or 0), with '1' indicating an in-class member and '0' a non-class member [18–20]. DPLS computes a lower dimensional representation which maximizes the covariance between the variables in that space and the predicted variables [11]. Several researchers have applied DPLS and PCA to small-scale classification problems and showed that DPLS improved class separation over PCA [18,21]. In general, fewer factors are needed in DPLS to give the same level of prediction success [21].

PCA, FDA, and DPLS and their application to fault diagnosis are described next. An information criterion for FDA and DPLS is developed for determining the order of the dimensionality reduction without cross-validation. Then the proficiency of these techniques for fault diagnosis is evaluated by application to data collected from the Tennessee Eastman chemical plant simulator.

2. Methods

2.1. PCA

PCA is an optimal dimensionality reduction technique in terms of capturing the variance of the data. PCA determines a set of orthogonal vectors, called loading vectors, which can be ordered by the amount of variance explained in the loading vector directions. Given n observations of m measurement variables stacked into a training data matrix X ∈ R^{n×m}, the loading vectors can be calculated via the singular value decomposition (SVD)

$$\frac{1}{\sqrt{n-1}}\,X = U \Sigma V^{T} \qquad (1)$$

where U ∈ R^{n×n} and V ∈ R^{m×m} are unitary matrices and the diagonal matrix Σ ∈ R^{n×m} contains the nonnegative real singular values of decreasing magnitude (σ_1 ≥ σ_2 ≥ ... ≥ σ_m ≥ 0). The loading vectors are the orthonormal column vectors in the matrix V, and the variance of the training set projected along the ith column of V is equal to σ_i².

2.1.1. Fault detection

Normal operations can be characterized by employing Hotelling's T² statistic [2]:

$$T^{2} = x^{T} P \Sigma_a^{-2} P^{T} x \qquad (2)$$

where P includes the loading vectors associated with the a largest singular values, Σ_a contains the first a rows and columns of Σ, and x is an observation vector of dimension m. Given a number of loading vectors, a, to include in Eq. (2), the threshold for the T² statistic can be calculated using the probability distribution

$$T_{\alpha}^{2} = \frac{a\,(n^{2}-1)}{n\,(n-a)}\,F_{\alpha}(a,\,n-a) \qquad (3)$$

where F_α(a, n−a) is the upper 100α% critical point of the F-distribution with a and n−a degrees of freedom [22,23]. A value for the T² statistic, Eq. (2), greater than the threshold given by Eq. (3) indicates that a fault has occurred.
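As a concrete illustration of Eqs. (1)–(3), the following sketch (Python with NumPy/SciPy; the function name and arguments are illustrative assumptions, not code from the paper) computes the retained loading vectors by SVD, the T² statistic of a new observation, and its F-distribution threshold:

```python
# Illustrative sketch of Eqs. (1)-(3); not the paper's implementation.
# Assumes X has already been autoscaled; names are hypothetical.
import numpy as np
from scipy.stats import f

def pca_t2(X, x_new, a, alpha=0.01):
    """Return the T^2 statistic for x_new and its threshold (Eqs. 2-3)."""
    n = X.shape[0]
    # Eq. (1): loading vectors are the right singular vectors of X / sqrt(n - 1)
    _, s, Vt = np.linalg.svd(X / np.sqrt(n - 1), full_matrices=False)
    P = Vt[:a].T                        # m x a matrix of retained loading vectors
    sigma_a = s[:a]                     # a largest singular values
    # Eq. (2): T^2 = x^T P Sigma_a^-2 P^T x
    t2 = float((x_new @ P) @ np.diag(sigma_a ** -2) @ (P.T @ x_new))
    # Eq. (3): threshold from the upper 100*alpha% point of the F-distribution
    t2_lim = a * (n ** 2 - 1) / (n * (n - a)) * f.ppf(1 - alpha, a, n - a)
    return t2, t2_lim
```

A fault would be flagged whenever the returned t2 exceeds t2_lim, mirroring the detection rule stated above.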
The portion of the measurement space corresponding to the lowest m − a singular values can be monitored by using the Q statistic [12,24]:

$$Q = r^{T} r, \qquad r = \left(I - P P^{T}\right) x \qquad (4)$$

where r is the residual vector. Since the Q statistic does not directly measure the variations along each loading vector but measures the total sum of variations in the space corresponding to the lowest m − a singular values, the Q statistic does not suffer from an over-sensitivity to inaccuracies in the lower singular values [24].

The threshold for the Q statistic can be computed from its approximate distribution [24]

$$Q_{\alpha} = \theta_{1}\left[\frac{c_{\alpha}\left(2\theta_{2} h_{0}^{2}\right)^{1/2}}{\theta_{1}} + 1 + \frac{\theta_{2} h_{0}\left(h_{0}-1\right)}{\theta_{1}^{2}}\right]^{1/h_{0}} \qquad (5)$$

where $\theta_{i} = \sum_{j=a+1}^{n} \sigma_{j}^{2i}$, $h_{0} = 1 - \frac{2\theta_{1}\theta_{3}}{3\theta_{2}^{2}}$, and c_α is the normal deviate corresponding to the upper (1 − α) percentile.

2.1.2. Reduction order

A key step in a dimensionality reduction technique is to determine the order of the reduction, that is, its dimensionality. There exist several techniques for determining the number of loading vectors, a, to maintain in the PCA model [12,25–28]. Parallel analysis determines the dimensionality of the PCA model by comparing the singular value profile to that obtained by assuming independent measurement variables. The dimension is determined by the point at which the two profiles cross. This approach is particularly attractive since it is intuitive, easy to automate, and performs well in practice.

2.1.3. Fault diagnosis

Several researchers have proposed techniques to use principal component analysis for fault diagnosis. The simplest approach is to construct a single PCA model and define regions in the lower dimensional space which classify whether a particular fault has occurred [29]. This approach is unlikely to be effective when a significant number of faults can occur [7]. Another approach is to compute the group of process variables which make the greatest contributions to the deviations in the squared prediction error and the scores [3]. Although such information can narrow down the search for an assignable cause of abnormal behavior, it will not unequivocally diagnose the cause. A related approach is to construct separate PCA models for each process unit [12]. A fault associated with a particular process unit is assumed to occur if the PCA model for that unit indicates that the process is out-of-control. Again, although this approach can narrow down the cause of abnormal process operations, it will not unequivocally diagnose the cause. This distinguishes these fault isolation techniques (which are based on non-supervised classification) from the fault diagnosis techniques (which are based on supervised classification) of interest here. Diagnosis approaches particular to sensor faults [30,31] will also not be considered further here because the focus is on more general types of faults.

A PCA approach which can handle general multiple faults is to develop a separate PCA model based on data collected during each specific fault situation, and then apply the Q [32], T² [4], or other statistics [4–7] to each PCA model to predict which fault or faults most likely occurred. This approach is essentially a combination of principal component analysis and discriminant analysis [5].
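A rough sketch of this multi-model idea is given below (Python, hypothetical function names). The paper lists the Q, T², and other statistics without fixing a single decision rule, so assigning an observation to the class whose PCA model yields the smallest T² is an assumption made only for illustration, and each class matrix is assumed to have been autoscaled already:

```python
# Sketch of one-PCA-model-per-fault-class diagnosis (illustrative assumptions,
# not the paper's algorithm).
import numpy as np

def fit_class_models(class_data, a):
    """class_data: dict mapping fault label -> autoscaled data matrix (q_i x m)."""
    models = {}
    for label, Xc in class_data.items():
        n = Xc.shape[0]
        _, s, Vt = np.linalg.svd(Xc / np.sqrt(n - 1), full_matrices=False)
        models[label] = (Vt[:a].T, s[:a])   # retained loadings and singular values
    return models

def diagnose(x_new, models):
    """Assign x_new to the class whose PCA model gives the smallest T^2 (one
    plausible rule; the paper also mentions the Q statistic and others)."""
    t2 = {}
    for label, (P, sigma_a) in models.items():
        t2[label] = float((x_new @ P) @ np.diag(sigma_a ** -2) @ (P.T @ x_new))
    return min(t2, key=t2.get)
```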
2.2. Fisher's discriminant analysis

For fault diagnosis, data collected from the plant during specific faults is categorized into classes, where each class contains data representing a particular fault. FDA is an optimal dimensionality reduction technique in terms of maximizing the separability of these classes. It determines a set of projection vectors that maximize the scatter between the classes while minimizing the scatter within each class.

Stacking the training data for all classes into the matrix X ∈ R^{n×m} and representing the ith row of X with the column vector x_i, the total-scatter matrix is [16,17]

$$S_{t} = \sum_{i=1}^{n} \left(x_{i} - \bar{x}\right)\left(x_{i} - \bar{x}\right)^{T} \qquad (6)$$

where x̄ is the total mean vector whose elements correspond to the means of the columns of X. Let the matrix X_i contain the rows of X corresponding to class i; then

$$S_{i} = \sum_{x_{j} \in X_{i}} \left(x_{j} - \bar{x}_{i}\right)\left(x_{j} - \bar{x}_{i}\right)^{T} \qquad (7)$$

is the within-scatter matrix for class i, where x̄_i is the mean vector for class i. Let c be the number of classes; then

$$S_{w} = \sum_{i=1}^{c} S_{i} \qquad (8)$$

is the within-class-scatter matrix, and

$$S_{b} = \sum_{i=1}^{c} n_{i}\left(\bar{x}_{i} - \bar{x}\right)\left(\bar{x}_{i} - \bar{x}\right)^{T} \qquad (9)$$

is the between-class-scatter matrix, where n_i is the number of observations in class i. The total-scatter matrix is equal to the sum of the between-scatter matrix and the within-scatter matrix [16],

$$S_{t} = S_{b} + S_{w}. \qquad (10)$$

Assuming invertible S_w, the FDA vectors are determined by computing the stationary points of the optimization problem

$$\max_{v \neq 0} \frac{v^{T} S_{b} v}{v^{T} S_{w} v} \qquad (11)$$

(equations for the case of non-invertible S_w are provided elsewhere [33–35]). The FDA vectors are equal to the generalized eigenvectors of the eigenvalue problem

$$S_{b} w_{i} = \lambda_{i} S_{w} w_{i} \qquad (12)$$

where the eigenvalues λ_i indicate the degree of overall separability among the classes. Because the direction and not the magnitude of w_i is important, the norm is usually chosen to be ‖w_i‖ = 1.

2.2.1. Fault diagnosis

While numerous researchers have developed techniques based on first constructing PCA models from data collected for each fault class, and then applying some form of discriminant analysis or related approach to diagnose faults [4–7,32], the FDA approach simultaneously uses all of the data to obtain a single lower dimensional model used to diagnose faults. The lower dimensional representation provided by FDA can be employed with discriminant functions, such as the T² statistic, to diagnose faults. FDA can be used to detect faults by including a class of data collected during normal process operation.

2.2.2. Reduction order

Akaike's information criterion (AIC) is a well-known method for selecting the model order for system identification [36]. The AIC contains an error term and a term which penalizes the model complexity. A strength of the AIC is that it relies only on information in one set of data (the training data), unlike cross-validation, which requires either additional data or a partitioning of the original data set into two sets. We propose to determine the order of the FDA model by computing the dimensionality, a, which minimizes the information criterion

$$f(a) + \frac{a}{\bar{n}} \qquad (13)$$

where f(a) is the misclassification rate for the training set obtained by projecting the data onto the first a FDA vectors and n̄ is the average number of observations per class. Eq. (13), which is similar in form to the AIC, appears to be reasonable since the penalty term scales relatively well with the error term. This is confirmed later by application.
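To make Eqs. (6)–(12) concrete, a minimal sketch in Python (NumPy/SciPy) is given below. The function names and the nearest-class-mean assignment in the FDA space are illustrative assumptions; the paper itself pairs the FDA projection with discriminant functions such as the T² statistic, and S_w is assumed invertible as stated above:

```python
# Minimal sketch of Eqs. (6)-(12); illustrative only, not the paper's code.
import numpy as np
from scipy.linalg import eigh

def fda_vectors(class_data, a):
    """class_data: dict label -> (n_i x m) data matrix. Returns W (m x a)
    and the class means projected into the FDA space."""
    X = np.vstack(list(class_data.values()))
    xbar = X.mean(axis=0)
    m = X.shape[1]
    Sw = np.zeros((m, m))
    Sb = np.zeros((m, m))
    for Xc in class_data.values():
        xbar_c = Xc.mean(axis=0)
        D = Xc - xbar_c
        Sw += D.T @ D                                  # Eqs. (7)-(8)
        d = (xbar_c - xbar)[:, None]
        Sb += Xc.shape[0] * (d @ d.T)                  # Eq. (9)
    # Eq. (12): generalized eigenvectors of Sb w = lambda Sw w (Sw assumed invertible)
    evals, evecs = eigh(Sb, Sw)
    order = np.argsort(evals)[::-1]
    W = evecs[:, order[:a]]                            # first a FDA vectors
    means = {lab: Xc.mean(axis=0) @ W for lab, Xc in class_data.items()}
    return W, means

def classify(x_new, W, means):
    """Assign x_new to the nearest projected class mean (a simplifying assumption)."""
    z = x_new @ W
    return min(means, key=lambda lab: np.linalg.norm(z - means[lab]))
```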
2.3. Discriminant partial least squares

DPLS is a dimensionality reduction technique for maximizing the covariance between the predictor (independent) block X and the predicted (dependent) block Y for each component [18–20,37]. DPLS models the relationship between X and Y using a series of local least-squares fits. In DPLS, the training data for p classes are stacked into the data matrix X ∈ R^{n×m}, where q_1 + q_2 + ... + q_p = n and q_i is the number of observations for class i. There are two methods, known as PLS1 and PLS2, to model Y. The predicted block Y ∈ R^{n×p} in PLS2 is

$$Y = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots \\
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots \\
0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \cdots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
\vdots & \vdots & \vdots & \cdots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix} \qquad (14)$$

where each column in Y corresponds to a class. Each element of Y is filled with either one or zero. The first q_1 elements of column 1 are filled with a '1', which indicates that the first q_1 rows of X are data from fault 1. In PLS1, the algorithm is run p times, each with the same X, but for each separate column of Y in Eq. (14). This results in one model for each class.

The matrices X and Y are autoscaled. The matrix X is decomposed into a score matrix T ∈ R^{n×a} and a loading matrix P ∈ R^{m×a}, where a is the number of PLS components (the order), plus a residual matrix E ∈ R^{n×m} [38]:

$$X = T P^{T} + E \qquad (15)$$

In PLS2, Y is decomposed into a score matrix U ∈ R^{n×a} and a loading matrix Q ∈ R^{p×a}, plus a residual matrix F* ∈ R^{n×p}:

$$Y = U Q^{T} + F^{*} \qquad (16)$$

The estimated Y is related to X through the score matrix T:

$$Y = T B Q^{T} + F \qquad (17)$$

where F is the prediction error matrix. The matrix B is selected such that the induced 2-norm of F (the maximum singular value of F [39]), ‖F‖₂, is minimized [13]. In PLS1, similar steps are performed, resulting in

$$y_{i} = T_{i} B_{i} q_{i}^{T} + f_{i} \qquad (18)$$

where y_i ∈ R^n is the ith column of Y, T_i ∈ R^{n×a} is the score matrix, B_i ∈ R^{a×a} is the regression matrix, q_i^T ∈ R^a is the loading vector, and f_i ∈ R^n is the prediction error vector. Since there are p columns in Y, the range of i is from 1 to p.

The most popular algorithm used to compute the parameters of Eqs. (17) and (18) in the calibration step is known as nonlinear iterative partial least squares (NIPALS) [11,38].

2.3.1. Fault diagnosis

After the parameters have been determined, the predicted block Y_train1 of the training set using PLS1 and the predicted block Y_train2 of the training set using PLS2 are calculated for all orders. In general, the rows of Y_train1 and Y_train2 will not have the form [0, 0, 0, ..., 1, ..., 0, 0]; discriminant analysis is needed to predict the class c_k at each observation k. One common approach is to define c_k to be the column index whose element has the maximum value in row k. This approach works well in the ideal case, that is, when the classification is neither overestimated nor underestimated (overestimation means the score of an in-class member is > 1 and the score of a non-class member is > 0, while underestimation means the score of an in-class member is < 1 and the score of a non-class member is < 0). This approach also works well if all of the scores are overestimated or all are underestimated [20].

However, if some of the scores are underestimated while others are overestimated, the above approach can give poor results. A method to solve this problem is to take the underestimation and overestimation of Y into account in a second cycle of the PLS algorithm [20]. For PLS1 and PLS2, NIPALS is run a second time by replacing Y by Y_train2 and y_i by the ith column of Y_train1 in Eqs. (17) and (18), respectively. To distinguish between the normal PLS methods and these adjusted methods, the latter variants of PLS1 and PLS2 are denoted as PLS1_adj and PLS2_adj, respectively. Here the orders for all PLS models are determined using the proposed criterion (13).
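As a minimal sketch of the basic PLS2 route (assuming scikit-learn's NIPALS-based PLSRegression in place of the calibration of Eqs. (15)–(18), and implementing only the maximum-column assignment, not the adjusted PLS1_adj/PLS2_adj second cycle), the dummy block Y of Eq. (14) can be built and used as follows:

```python
# Illustrative DPLS sketch under the assumptions stated in the text above;
# not the paper's implementation. Data matrices are assumed autoscaled.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def dpls_fit(class_data, a):
    """class_data: dict label -> (q_i x m) data matrix; a: number of PLS components."""
    labels = list(class_data)
    X = np.vstack([class_data[lab] for lab in labels])
    # Eq. (14): one indicator column per class, with '1' marking the in-class rows
    Y = np.zeros((X.shape[0], len(labels)))
    row = 0
    for j, lab in enumerate(labels):
        q_j = class_data[lab].shape[0]
        Y[row:row + q_j, j] = 1.0
        row += q_j
    model = PLSRegression(n_components=a).fit(X, Y)
    return model, labels

def dpls_classify(model, labels, X_new):
    """Assign each observation to the class whose predicted-Y column is largest."""
    Y_hat = model.predict(X_new)
    return [labels[j] for j in Y_hat.argmax(axis=1)]
```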
Although numerous researchers have proposed fault diagnosis algorithms based on PCA, the PCA objective of capturing the most variance is not directly related to the objective of fault diagnosis. As such, the resulting lower dimensional space may contain little of the information required to discriminate among the various faults [21]. Since DPLS exploits fault information when constructing its lower dimensional model, it would be expected that DPLS can provide better fault diagnosis than PCA. Since the FDA objective directly coincides with the objective of fault diagnosis, it would be expected to outperform both DPLS and PCA. Using the proposed FDA statistics, and the FDA and DPLS order selection criterion (13), these theoretical predictions are shown to be valid for the Tennessee Eastman Industrial Challenge Problem.

3. Application

The process simulator for the Tennessee Eastman (TE) Industrial Challenge Problem was created by the Eastman Chemical Company to provide a realistic industrial process for evaluating process control and monitoring methods [40]. The TE process simulator has been widely used by the process monitoring community as a source of data for comparing various approaches [5,6,12,14,41–43]. The plant simulator is based on an actual chemical process where the components, kinetics, and operating conditions were modified for proprietary reasons (see Fig. 1). The gaseous reactants A, C, D, and E and the inert B are fed to the reactor where the liquid products G and H are formed. The reactions in the reactor are irreversible, exothermic, and approximately first-order with respect to the reactant concentrations. The reactor product stream is cooled through a condenser and then fed to a vapor–liquid separator. The vapor exiting the separator is recycled to the reactor feed through the compressor. A portion of the recycle stream is purged to keep the inert and byproducts from accumulating in the process. The condensed components from the separator (Stream 10) are pumped to the stripper. Stream 4 is used to strip the remaining reactants in Stream 10, and is combined with the recycle stream. The products G and H exiting the base of the stripper are sent to a downstream process which is not included in this simulation. The simulation code allows 21 preprogrammed major process faults, as shown in Table 1.

Table 1
Process faults for the Tennessee Eastman process simulator

Variable   Description                                                    Type
IDV(1)     A/C Feed Ratio, B Composition Constant (Stream 4)              Step
IDV(2)     B Composition, A/C Ratio Constant (Stream 4)                   Step
IDV(3)     D Feed Temperature (Stream 2)                                  Step
IDV(4)     Reactor Cooling Water Inlet Temperature                        Step
IDV(5)     Condenser Cooling Water Inlet Temperature                      Step
IDV(6)     A Feed Loss (Stream 1)                                         Step
IDV(7)     C Header Pressure Loss - Reduced Availability (Stream 4)       Step
IDV(8)     A, B, C Feed Composition (Stream 4)                            Random Variation
IDV(9)     D Feed Temperature (Stream 2)                                  Random Variation
IDV(10)    C Feed Temperature (Stream 4)                                  Random Variation
IDV(11)    Reactor Cooling Water Inlet Temperature                        Random Variation
IDV(12)    Condenser Cooling Water Inlet Temperature                      Random Variation
IDV(13)    Reaction Kinetics                                              Slow Drift
IDV(14)    Reactor Cooling Water Valve                                    Sticking
IDV(15)    Condenser Cooling Water Valve                                  Sticking
IDV(16)    Unknown
IDV(17)    Unknown
IDV(18)    Unknown
IDV(19)    Unknown
IDV(20)    Unknown
IDV(21)    The valve for Stream 4 was fixed at the steady state position  Constant Position

The plant-wide control structure recommended in Lyman and Georgakis [44] was implemented to generate the closed-loop simulated process data for each fault.

The training and testing data sets for each fault consisted of 500 and 960 observations, respectively. Note that only the training data were used in model order selection; the testing data were used to see how well the methods performed, and to determine the effectiveness of the model order selection criterion (13). Each data set started with no faults, and the faults were introduced 1 and 8 simulation hours into the run, respectively, for the training and testing data sets. All the manipulated and measurement variables except for the agitation speed of the reactor's stirrer, a total of m = 52 variables, were recorded. The data were sampled every 3 min, and the random seed (used to specify the stochastic measurement noise and disturbances) was changed before the computation of the data set for each fault. Twenty-one testing sets were generated using the preprogrammed faults (Faults 1–21). In addition, one testing set (Fault 0) was generated with no faults. The data were scaled in the standard manner before the application of PCA, FDA, and DPLS; that is, the sample mean was subtracted from each variable, which was then divided by its standard deviation. All the training and testing data sets have been made available at http://brahms.scs.uiuc.edu.

The overall misclassification rates for each measure when applied to all disturbances of the testing set are listed in Table 2. As anticipated by comparing their objectives, FDA produced the lowest overall misclassification rate, followed by DPLS and PCA.

Plots of the misclassification rates of PLS and FDA as a function of model order (Fig. 2a–c) indicate that FDA with any order greater than 10 outperforms all of the PLS methods, with most of the separation between fault classes by FDA being provided by the first 13 generalized eigenvectors. This indicates that the superior fault diagnosis provided by FDA is inherent and not due to the model order selection criterion used in this study (Eq. (13)). Fig. 2a–c also indicate that the AIC-like criterion (Eq. (13)) does a good job of selecting the model order for FDA, PLS, and adjusted PLS. For FDA, the criterion captures the shape and slope of the misclassification rate curve for the testing data. In Fig. 2b–c, the criterion curve nearly overlaps the misclassification rate curves for PLS2 and adjusted PLS2, which indicates that the criterion will give model orders similar to those from cross-validation in these cases. For PLS1 and adjusted PLS1, the criterion curve does not overlap with the misclassification rate curves, but does have a minimum at approximately the same order as where the misclassification rate curves for the testing data flatten out. Again, this indicates that the criterion provides good model orders for the PLS1 methods.
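The preprocessing and evaluation just described amount to two small computations; the following sketch (hypothetical function names, with the training-set statistics applied to the testing set) is one way to express them:

```python
# Small sketch of the scaling and evaluation described above (assumptions,
# not the paper's code).
import numpy as np

def autoscale(X_train, X_test):
    """Subtract the training-set mean from each variable and divide by its
    training-set standard deviation, for both the training and testing data."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
    return (X_train - mu) / sd, (X_test - mu) / sd

def misclassification_rate(y_true, y_pred):
    """Fraction of testing observations assigned to the wrong fault class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))
```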
Fig. 2. The overall misclassification rates for the training and testing sets and the information criterion (AIC) for various orders using FDA, PLS1, PLS2, PLS1_adj, and PLS2_adj, and the standard deviation of misclassification rates for the testing set for various orders using PLS1, PLS2, PLS1_adj, and PLS2_adj.
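The order-selection sweep behind these criterion curves can be sketched as follows (illustrative only; train_error_rate is a hypothetical callable returning the training-set misclassification rate f(a) for a given order, for example a wrapper around one of the classifiers sketched earlier):

```python
# Sketch of selecting the order that minimizes the criterion of Eq. (13),
# f(a) + a / n_bar, over candidate orders. Names are assumptions.
import numpy as np

def select_order(train_error_rate, n_bar, max_order):
    """Return the order minimizing Eq. (13) and the full criterion curve."""
    criterion = [train_error_rate(a) + a / n_bar for a in range(1, max_order + 1)]
    return int(np.argmin(criterion)) + 1, criterion
```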
…tory power. Secondly, different faults tend to create different states, and the statistical properties typical of the residual space enable it to contain more discriminatory power.

4. Conclusions

Fisher discriminant analysis and discriminant PLS were shown to be better dimensionality reduction techniques than principal component analysis for fault diagnosis. Although numerous researchers have developed techniques for using PCA to diagnose faults, PCA is not well suited for this task because it does not take into account the information between the classes when determining the lower dimensional representation. FDA provides an optimal lower dimensional representation in terms of maximizing the separation among several classes; the projection vectors are ordered in terms of maximizing the scatter between the classes while minimizing the scatter within each class. In discriminant PLS, the covariance between the predictor block (data from all classes) and the predicted block (representation of class membership) is maximized for each factor, so information between the classes is used when determining each factor. A model selection criterion for FDA and discriminant PLS was proposed based on the Akaike information criterion. The techniques were applied to data collected from the Tennessee Eastman chemical plant simulator, where FDA performed the best, followed by DPLS and PCA.

Acknowledgements

This work was supported by International Paper.

References

[1] J.V. Kresta, T.E. Marlin, J.F. MacGregor, Can. J. Chem. Eng. 69 (1991) 35–47.
[2] T. Kourti, J.F. MacGregor, J. Quality Technol. 28 (1996) 409–428.
[3] J.F. MacGregor, Proc. of the IFAC Conference on Advanced Control of Chemical Processes, Pergamon Press, New York, 1994, pp. 427–435.
[4] A.C. Raich, A. Cinar, Proc. of the IFAC Conf. on Advanced Control of Chemical Processes, Pergamon, New York, 1994, pp. 427–435.
[5] A.C. Raich, A. Cinar, Chemometrics and Intelligent Laboratory Systems 30 (1995) 37–48.
[6] A.C. Raich, A. Cinar, AIChE J. 42 (1996) 995–1009.
[7] J. Zhang, E. Martin, A.J. Morris, Proc. of the American Control Conf., IEEE Press, Piscataway, NJ, 1995, pp. 751–755.
[8] K.A. Kosanovich, M.J. Piovoso, K.S. Dahl, J.F. MacGregor, P. Nomikos, Proc. of the American Control Conf., IEEE Press, Piscataway, NJ, 1994, pp. 1294–1298.
[9] M.J. Piovoso, K.A. Kosanovich, R.K. Pearson, Proc. of the American Control Conf., IEEE Press, Piscataway, NJ, 1992, pp. 2359–2363.
[10] M.J. Piovoso, K.A. Kosanovich, Int. J. Control 59 (1994) 743.
[11] B.M. Wise, N.B. Gallagher, J. Process Control 6 (1996) 329–348.
[12] D.M. Himes, R.H. Storer, C. Georgakis, Proc. of the American Control Conf., IEEE Press, Piscataway, NJ, 1994, pp. 1279–1283.
[13] M.H. Kaspar, W.H. Ray, AIChE J. 38 (1992) 1593–1608.
[14] W. Ku, R.H. Storer, C. Georgakis, Chemometrics and Intelligent Laboratory Systems 30 (1995) 179–196.
[15] H. Tong, C.M. Crowe, AIChE J. 41 (7) (1995) 1712–1722.
[16] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[17] R. Hudlet, R. Johnson, in: J. Van Ryzin (Ed.), Classification and Clustering, Academic Press, New York, 1977, pp. 371–394.
[18] B.K. Alsberg, R. Goodacre, J.J. Rowland, D.B. Kell, Analytica Chimica Acta 348 (1997) 389–407.
[19] M. Defernez, K. Kemsley, Trends in Analytical Chemistry 16 (1997) 216–221.
[20] J. Nouwen, F. Lindgren, W. Karcher, B. Hansen, H.J.M. Verharr, J.L.M. Hermens, Environ. Sci. Technol. 31 (1997) 2313–2318.
[21] E.K. Kemsley, Chemometrics and Intelligent Laboratory Systems 33 (1996) 47–61.
[22] J.F. MacGregor, T. Kourti, Control Engineering Practice 3 (1995) 403–414.
[23] N.D. Tracy, J.C. Young, R.L. Mason, J. Quality Control 24 (1992) 88–95.
[24] J.E. Jackson, G.S. Mudholkar, Technometrics 21 (1979) 341–349.
[25] J.L. Horn, Psychometrika 30 (2) (1965) 179–185.
[26] J.E. Jackson, A User's Guide to Principal Components, Wiley, New York, 1991.
[27] S. Wold, Technometrics 20 (1978) 397–405.
[28] W.R. Zwick, W.F. Velicer, Psychological Bulletin 99 (3) (1986) 432–442.
[29] B.M. Wise, N.L. Ricker, D.F. Veltkamp, Upset and sensor failure detection in multivariate processes, Technical report, Eigenvector Research, Manson, WA, 1989.
[30] R. Dunia, S.J. Qin, T.F. Edgar, T.J. McAvoy, AIChE J. 42 (1996) 2797–2812.
[31] A. Negiz, A. Cinar, Proc. of the American Control Conf., IEEE Press, Piscataway, NJ, 1992, pp. 2364–2368.
[32] W. Ku, R.H. Storer, C. Georgakis, AIChE Annual Meeting, 1993, Paper 149g.
[33] Y.Q. Cheng, Y.M. Zhuang, J.Y. Yang, Pattern Recognition 25 (1992) 101–111.
[34] Z.Q. Hong, J.Y. Yang, Pattern Recognition 24 (1991) 317–324.
[35] Q. Tian, J. Opt. Soc. Am. A 5 (1988) 1670–1672.
[36] L. Ljung, System Identification: Theory for the User, Prentice-Hall, Englewood Cliffs, NJ, 1987.
[37] B.K. Alsberg, D.B. Kell, R. Goodacre, Analytical Chemistry 70 (1998) 4123–4133.
[38] P. Geladi, B.R. Kowalski, Analytica Chimica Acta 185 (1986) 1–17.
[39] G.H. Golub, C.F. van Loan, Matrix Computations, Johns Hopkins Univ. Press, Baltimore, MD, 1983.
[40] J.J. Downs, E.F. Vogel, Comput. Chem. Eng. 17 (1993) 245–255.
[41] G. Chen, T.J. McAvoy, J. Process Control 8 (1997) 409–420.
[42] C. Georgakis, B. Steadman, V. Liotta, Proc. of the 13th IFAC World Congress, IEEE Press, Piscataway, NJ, 1996, pp. 97–101.
[43] A.C. Raich, Proc. of the 13th IFAC World Congress, IEEE Press, Piscataway, NJ, 1996, pp. 283–288.
[44] P.R. Lyman, C. Georgakis, Comput. Chem. Eng. 19 (1995) 321–331.