Figure 8.4. Detection and diagnosis of process upsets: (a) detection of outliers based on residuals, (b) detection based on the $T^2$ test of scores, (c) diagnosis statistics considering each possible disturbance, (d) index of the chosen disturbance (chosen class) for each observation. The horizontal axis of each panel is the time index.

Residual Discriminant. Assuming that observations will not be well described by PC models for other faults but will be within the residual threshold of their own class, it is most likely that $x$ is from the fault model $i$ with minimum

$$\frac{r_i}{r_{i,\alpha}}, \qquad r_i = x^T \left(I - P_i P_i^T\right) x,$$

where $r_i$ is the residual computed using the PCA model for fault $i$, $P_i$ is the loading matrix of that model, and $r_{i,\alpha}$ is the residual threshold at level $100\alpha$ based on the PCA model for fault $i$.
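The residual discriminant can be illustrated with a minimal sketch. It assumes the residual is the squared distance of the observation from the PC subspace, $x^T (I - P P^T) x$, that each loading vector has unit length and the vectors are mutually orthogonal, and that `x` has already been centered and scaled for each model. The model names, loadings, and thresholds below are hypothetical values chosen only for illustration.

```python
def residual(x, P):
    """Squared residual x^T (I - P P^T) x for orthonormal loading vectors P.

    x: observation as a list of floats (assumed centered/scaled for this model).
    P: list of loading vectors, each the same length as x.
    Uses the identity x^T (I - P P^T) x = ||x||^2 - ||P^T x||^2.
    """
    scores = [sum(p_k * x_k for p_k, x_k in zip(p, x)) for p in P]
    return sum(v * v for v in x) - sum(t * t for t in scores)


def residual_discriminant(x, models):
    """Return (fault with the smallest ratio, all ratios r_i / r_limit_i)."""
    ratios = {name: residual(x, m["P"]) / m["r_limit"] for name, m in models.items()}
    return min(ratios, key=ratios.get), ratios


# Hypothetical fault models in three variables, one retained PC each.
models = {
    "fault_A": {"P": [[1.0, 0.0, 0.0]], "r_limit": 0.5},
    "fault_B": {"P": [[0.0, 1.0, 0.0]], "r_limit": 0.5},
}
x = [2.0, 0.1, 0.2]
best, ratios = residual_discriminant(x, models)
print(best, ratios)
```

Here the observation lies almost entirely along the first variable, so it is well described by the hypothetical `fault_A` model and poorly described by `fault_B`.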

Combined Discriminant. Combining the information available in scores and residuals usually improves the diagnosis accuracy. Comparing the combined information to the confidence limits of each fault model, $x$ is most likely to be from the fault model $i$ with minimum

$$c_i \frac{s_i}{s_{i,\alpha}} + (1 - c_i) \frac{r_i}{r_{i,\alpha}},$$

where $s_i$ and $r_i$ are the score distance and residual based on the PC model for fault $i$, $s_{i,\alpha}$ and $r_{i,\alpha}$ are the corresponding score distance and residual thresholds, and $c_i$ is a weight between 0 and 1. To weigh scores and residuals according to the amount of variation in data explained by each, $c_i$ is set equal to the fraction of total variance explained by scores. The combined discriminant value thus calculated gives an indication of the degree of certainty for the diagnosis; statistics less than 1 indicate a good fit to the chosen model. If no model results in a statistic less than 1, none of the models provides an adequate match to the observation.
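As a rough illustration of the combined discriminant, the sketch below assumes the statistic is the weighted sum of the score ratio and the residual ratio, with the weight equal to the fraction of variance explained by the scores. The function names and all numerical values are hypothetical.

```python
def combined_discriminant(s, s_limit, r, r_limit, c):
    """Weighted sum c * (s / s_limit) + (1 - c) * (r / r_limit), with 0 <= c <= 1."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("weight c must lie between 0 and 1")
    return c * (s / s_limit) + (1.0 - c) * (r / r_limit)


def diagnose(stats):
    """Pick the fault with the smallest statistic; None if no statistic is below 1."""
    values = {name: combined_discriminant(**kw) for name, kw in stats.items()}
    best = min(values, key=values.get)
    return (best if values[best] < 1.0 else None), values


# Hypothetical score distances, residuals, thresholds, and weights.
stats = {
    "fault_A": dict(s=2.0, s_limit=8.0, r=0.3, r_limit=0.6, c=0.8),
    "fault_B": dict(s=12.0, s_limit=8.0, r=1.8, r_limit=0.6, c=0.7),
}
best, values = diagnose(stats)
print(best, values)
```

Returning `None` when every statistic is 1 or larger mirrors the statement that no model then provides an adequate match to the observation.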

The FDD system design includes development of PC models for NO and faulty operation, and computation of threshold limits using historical data sets collected during normal plant operation and operation under specific faults. The implementation of the FDD system at each sampling time starts with monitoring. The model describing NO is used with new data to decide whether the current operation is in-control. If there is no significant evidence that the process is out-of-control, further analysis is not necessary and the procedure is concluded for that measurement time. If the score or residual tests exceed their statistical limits, there is significant evidence that the process is out-of-control. The PC models for all faults are then used to carry out the score and residual tests, and discriminant analysis based on these fault models is performed to diagnose the source cause of the abnormal behavior.
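The two-stage procedure at one sampling time can be sketched as follows. This is only an outline under stated assumptions: the score distances and residuals for the NO model and for each fault model are taken as already computed, the diagnosis stage uses a weighted sum of score and residual ratios as its discriminant, and the function name, dictionary keys, and return labels are hypothetical.

```python
def fdd_step(no_stats, fault_stats):
    """One sampling instant of a monitor-then-diagnose procedure.

    no_stats: dict with s, s_limit, r, r_limit for the normal-operation model.
    fault_stats: {fault name: dict with s, s_limit, r, r_limit, c}.
    Returns "in-control", the name of the diagnosed fault, or "unknown".
    """
    # Stage 1: monitoring with the NO model.
    in_control = (no_stats["s"] <= no_stats["s_limit"]
                  and no_stats["r"] <= no_stats["r_limit"])
    if in_control:
        return "in-control"  # no further analysis at this sampling time

    # Stage 2: diagnosis with the fault models.
    values = {
        name: m["c"] * m["s"] / m["s_limit"] + (1.0 - m["c"]) * m["r"] / m["r_limit"]
        for name, m in fault_stats.items()
    }
    best = min(values, key=values.get)
    return best if values[best] < 1.0 else "unknown"


# Hypothetical statistics for a single observation.
faults = {
    "fault_A": dict(s=2.0, s_limit=8.0, r=0.3, r_limit=0.6, c=0.8),
    "fault_B": dict(s=12.0, s_limit=8.0, r=1.8, r_limit=0.6, c=0.7),
}
print(fdd_step(dict(s=1.0, s_limit=5.0, r=0.1, r_limit=1.0), faults))
print(fdd_step(dict(s=9.0, s_limit=5.0, r=0.1, r_limit=1.0), faults))
```

The fault models are evaluated only after the NO model signals an out-of-control observation, which keeps the per-sample cost low during normal operation.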

Discrimination and Diagnosis of Multiple Disturbances

In fault diagnosis, where process behavior due to different faults is described by different models, it is useful to have a quantitative measure of similarity or overlap between models, and to predict the likelihood of successful diagnosis. In comparing multivariate models, much work has been reported for testing significant differences between means when covariance is constant. Testing for differences in covariance is more difficult yet crucial; diagnosis can be successfully done, whether or not means are different, as long as there is a difference in covariance. Testing for eigenvalue models of covariance adds new complications, since the statistical characteristics are not well known, even for common distributions. Simplifying assumptions for special cases can be made, with significant loss of generality.

Angles Between Different Coordinate Systems and Similarity Index. Raich and Cinar proposed a method based on the angles between principal coordinate directions of current data and regions corresponding to operation with different faults. The method uses angles between different coordinate systems and a similarity index defined by using the angle information.

The similarity index has a range from 0 to 1, increasing as models become more similar. It provides a quantitative measure of difference in covariance directions between models and a description of overall geometric similarity in spread. The similarity index can be used to evaluate discrimination models by selecting a threshold value to indicate where mistakes in classification of data from the two models involved may occur. It can also be used to compare models built from different operating runs of the same process for monitoring systematic changes in process variation during normal operation. Another possible application is in batch processes, where use of the similarity index could provide a way to check if PC model orientation around a moving mean varies in time.
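The text does not spell out the formula for the similarity index. One common definition with the stated properties, assumed here purely for illustration, is the Krzanowski-type index: the average squared cosine of the angles between two PC subspaces of equal dimension $k$, which equals $\mathrm{trace}(L^T M M^T L)/k$ for loading matrices $L$ and $M$ with orthonormal columns. The function names and example loadings are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def similarity_index(L, M):
    """Average squared cosine between two k-dimensional PC subspaces.

    L, M: lists of k orthonormal loading vectors each.
    Equals trace(L^T M M^T L) / k, the sum of squared inner products over k.
    """
    if len(L) != len(M) or not L:
        raise ValueError("both models need the same nonzero number of PCs")
    k = len(L)
    return sum(dot(l, m) ** 2 for l in L for m in M) / k


# Hypothetical loadings in three variables.
same = similarity_index([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                        [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
orthogonal = similarity_index([[1.0, 0.0, 0.0]], [[0.0, 0.0, 1.0]])
c = math.cos(math.pi / 4)
rotated = similarity_index([[1.0, 0.0, 0.0]], [[c, c, 0.0]])
print(same, orthogonal, rotated)
```

Identical subspaces give an index of 1, mutually orthogonal subspaces give 0, and a single direction rotated by 45 degrees gives about 0.5, consistent with an index that increases as the models become more similar.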

Overlap of Means. The other important statistical test in comparing multivariate models is for differences in means. This corresponds to comparison of the origin of coordinates rather than the coordinate directions. Many statistical tests have been developed for testing means, but most of them can become numerically unstable when significant correlation exists between variables. To work around the instability, overlap between eigenvalue-based models can be evaluated. Target factor analysis can assign a likelihood on whether a candidate vector is a contributor to the model of a multivariate data set. A statistic is defined to test if a specific vector is significantly inside the confidence region containing the modeled data. For overlap of means, the test can determine whether the mean from one model, $\mu_1$, significantly overlaps the region of data from another (second) model. Mean overlap analysis can be used to test if an existing PC model fits a new set of observations or if two PC models are analogous.

Comparison of models for individual faults and their combinations can provide information for extending the diagnosis methods to multiple simultaneous faults and masking of contributing faults. If there is no overlap between regions spanned by two different faults, two alternative schemes might handle multiple faults modeled by PCA. In one method, the combination fault is idealized as being located between the regions of the underlying component faults; allocations of membership to the different independent faults contributing to the combination may provide diagnosis of underlying faults. The second method is based on a more general extension of the discrimination scheme by introducing new models for each multiple-fault combination of interest. The measures of similarity in model center and direction of spread can be useful to determine the independence of the models used in diagnosis.

Masking of Multiple Faults. When the region spanned by the model for one (outer) fault contains the model for another (inner) fault, their combination will not be perfectly diagnosed. Idealizing the two fault regions as concentric spheres, the inner model region is enveloped by the outer model. As a result, only the outer fault will be diagnosed and the inner fault will be masked. Overlap of regions is likely to exist for most processes under closed-loop control, so the multiple-fault scenario is further complicated for such processes.

Random variation faults such as excessive sensor noise move a process less drastically off-target than step or ramp faults. Consequently, similarity measures should indicate that the random variation faults have more overlap with other models, particularly with each other. Ramp or step faults tend to be the outer models, which is consistent with their moving the process off its control target or NO. As the outer model, a ramp or step fault masks secondary random variation faults.

Similarity measures serve as indicators of the success in diagnosing combinations of faults. They can identify combinations of faults that may be masked or falsely diagnosed, and provide information about the success rates of different diagnosis schemes incorporating single faults and combinations of faults. Using these guidelines, multiple faults occurring in a process can be analyzed a priori with respect to their components, and accommodated within the diagnosis framework described earlier.