with contribution limits for a faulty situation. After the fault has developed long enough to affect related variables, most of the variable contributions in Figures 8.2(b) and 8.2(d) violate the control limits. The real fault is the drift in glucose feed rate (variable 3), which is highly correlated with glucose concentration (variable 5), dissolved oxygen concentration (variable 6), biomass concentration (variable 7), penicillin concentration (variable 8), culture volume (variable 9), heat generated (variable 13), and cooling water flow rate (variable 14). Note that penicillin concentration (variable 8) in Figure 8.2(b) and dissolved oxygen concentration (variable 6) in Figure 8.2(d) have the highest contributions to SPE and $T^2$, respectively, during the out-of-control period. Variable contributions to $T^2$ over all of the variables at each time instant are also presented in Figure 8.3(a) as another indicator for detecting an out-of-control
situation. To reveal the root cause of the fault, variable contributions to SPE and $T^2$ are plotted along with the control limits right after the out-of-control situation is detected. Variable contributions to $T^2$ are summed for the 270th and 271st measurements in Figure 8.3(b), and contributions to SPE are summed for the 250th and 251st measurements in Figure 8.3(c). Both charts indicate that the glucose feed rate (variable 3) has the highest contribution and is therefore the root cause of the deviation. The second highest contribution is that of the glucose concentration (variable 5), as expected. As a good practice, a univariate chart of the variable with the highest contribution is also plotted. Figure 8.3(d) shows the glucose feed rate profile of the faulty batch superimposed on the reference glucose feed rate profile of NO.
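The diagnosis step described above (summing per-variable contributions over the first out-of-control measurements and ranking them) can be sketched as follows. This is only an illustration under stated assumptions: the contribution matrix is synthetic, the function name `rank_contributions` is hypothetical, and the magnitudes injected for variables 3 and 5 are invented to mimic the pattern reported for Figure 8.3(b); they are not data from the case study.

```python
import numpy as np

def rank_contributions(contrib, window):
    """Sum per-variable contributions over a detection window and rank them.

    contrib : array of shape (n_samples, n_variables), contributions to
              T^2 or SPE at each measurement (assumed already computed).
    window  : 0-based sample indices to sum over.
    Returns the summed contributions and variable indices sorted from
    largest to smallest summed contribution.
    """
    contrib = np.asarray(contrib, dtype=float)
    summed = contrib[list(window)].sum(axis=0)
    order = np.argsort(summed)[::-1]
    return summed, order

# Synthetic stand-in for a contribution matrix (300 samples, 14 variables).
rng = np.random.default_rng(0)
contrib = rng.uniform(0.0, 1.0, size=(300, 14))

# Assumption: inflate variables 3 and 5 (indices 2 and 4) at the
# 270th and 271st measurements (0-based indices 269 and 270).
contrib[269:271, 2] += 5.0
contrib[269:271, 4] += 2.5

summed, order = rank_contributions(contrib, window=[269, 270])
top_two = (order[:2] + 1).tolist()  # convert to 1-based variable numbers
print("Variables with the two largest summed contributions:", top_two)
```

With real data, `contrib` would be replaced by the contributions computed from the monitoring model, and the summed values would be compared against their control limits before a variable is flagged.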
All of the aforementioned tasks can be integrated into a real-time knowledge-based system for automated supervision and ease of interpretation. The details of implementation are presented in Section 8.4.1.
8.2 Statistical Techniques for Fault Diagnosis
8.2.1 Statistical Discrimination and Classification
Statistical discrimination and classification are multivariate techniques that, respectively, separate distinct sets of objects (or events) and allocate new objects (or events) into previously defined groups. Discrimination focuses on criteria (called discriminants) that convert salient features of objects from several known populations into quantitative information separating these populations as much as possible. Classification sorts new objects or events into previously labelled classes by using rules derived to assign them optimally. A good classification procedure should yield few misclassifications. An event is more likely to belong to a population that has a greater likelihood of occurrence, so a good classification rule should take these "prior probabilities of occurrence" into consideration. A good classification procedure should also account for the costs associated with misclassification, that is, assigning an event to the wrong class. Consider two hypothetical sensor faults, one necessitating process shutdown because the process may produce hazardous products when that variable is not measured and controlled, and the other causing higher use of utilities. Misclassifying them would lead to different levels of hazard and damage, hence their costs of misclassification are different.
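To make the role of prior probabilities and misclassification costs concrete, the sketch below assigns an observation to the class with the smallest expected cost, $\sum_i c(k|i)\, p_i f_i(\mathbf{x})$, where $p_i$ is the prior probability of class $i$, $f_i$ its density and $c(k|i)$ the cost of assigning an event of class $i$ to class $k$. This is the standard minimum expected cost rule from the classification literature, offered as an illustration rather than as the procedure used in this chapter. Gaussian class densities are assumed, and all means, priors and costs are hypothetical numbers.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at x."""
    x, mean, cov = (np.asarray(a, dtype=float) for a in (x, mean, cov))
    d = x - mean
    quad = d @ np.linalg.solve(cov, d)
    norm = np.sqrt((2.0 * np.pi) ** mean.size * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def min_expected_cost_class(x, means, covs, priors, cost):
    """Return the class index with the smallest expected cost.

    cost[k, i] is the cost of assigning an event to class k when it
    truly belongs to class i (zero on the diagonal).
    """
    dens = np.array([gaussian_pdf(x, m, S) for m, S in zip(means, covs)])
    weighted = np.asarray(priors, dtype=float) * dens    # p_i * f_i(x)
    expected = np.asarray(cost, dtype=float) @ weighted  # sum_i c(k|i) p_i f_i(x)
    return int(np.argmin(expected)), expected

# Hypothetical two-class setup: class 0 is common, class 1 is a rare fault.
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
priors = [0.9, 0.1]
x_new = np.array([1.0, 0.0])  # equidistant from both class means

equal_cost = np.array([[0.0, 1.0], [1.0, 0.0]])
hazard_cost = np.array([[0.0, 20.0], [1.0, 0.0]])  # missing the fault is costly

k_equal, _ = min_expected_cost_class(x_new, means, covs, priors, equal_cost)
k_hazard, _ = min_expected_cost_class(x_new, means, covs, priors, hazard_cost)
print("Equal costs ->", k_equal, "| costly missed fault ->", k_hazard)
```

For the same observation, equal costs favor the more probable class 0, while a high cost of missing the rare fault shifts the decision to class 1.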
Consider a data set with $g$ distinct events such as normal process operation and operation under $g - 1$ different faults. The operation type (class) is determined on the basis of $p$ measured variables $\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_p]^T$ that are random variables. Denote the classes by $\pi_i$, $i = 1, \ldots, g$, their prior probabilities by $p_i$, $i = 1, \ldots, g$, and their probability density functions by $f_i(\mathbf{x})$. While it is not necessary to assume that $f_i(\mathbf{x})$ is the multivariate normal density, in most derivations and in this discussion it will be assumed that it is, with population and sample means $\boldsymbol{\mu}_i$ and $\bar{\mathbf{x}}_i$, respectively, and population and sample covariance matrices $\boldsymbol{\Sigma}_i$ and $\mathbf{S}_i$, respectively. Denote the cost of misclassification by $c(k|i)$, the cost of allocating an object to $\pi_k$ (for $k = 1, \ldots, g$) when in fact it belongs to $\pi_i$ (for $i = 1, \ldots, g$). If $R_k$ is the set of $\mathbf{x}$'s classified as $\pi_k$, the probability of classifying an event as $\pi_k$ when in reality it belongs to $\pi_i$ is
$$P(k|i) = P(\text{classifying event as } \pi_k \mid \pi_i) = \int_{R_k} f_i(\mathbf{x})\, d\mathbf{x}, \qquad i, k = 1, \ldots, g$$
with $P(i|i) = 1 - \sum_{k \neq i} P(k|i)$. The conditional expected cost of misclassification (ECM) of an event in $\pi_i$ to any other class is