where $w_{a,j}$ denotes the weights, $p_{a,j}$ the loadings for the X block (process variables) of the PLS model, $t_{a,\mathrm{new}}$ the scores of new observations, and $b$ the vector of regression coefficients.

Multivariate control charts based on the squared prediction errors ($SPE_X$ and $SPE_Y$), biplots of the scores ($t_a$ vs. $t_{a+1}$), and Hotelling's statistic ($T^2$) are constructed together with their control limits. The control limits at significance level $\alpha/2$ for a new independent $t$ score, under the assumption of normality at any time interval, are

$$\pm\, t_{n-1,\alpha/2}\; s_{\mathrm{est}} \left(1 + \frac{1}{n}\right)^{1/2}$$

where $n$ and $s_{\mathrm{est}}$ are the number of observations and the estimated standard deviation of the score sample at the chosen time interval, and $t_{n-1,\alpha/2}$ is the critical value of Student's $t$-distribution with $n-1$ degrees of freedom at significance level $\alpha/2$ [214, 435]. Hotelling's statistic ($T^2$) for a new independent $t$ vector is calculated as [594]

$$T^2 = t_{\mathrm{new}}^{T}\, S^{-1}\, t_{\mathrm{new}} \sim \frac{A\,(n^2-1)}{n\,(n-A)}\, F_{A,\,n-A}$$

where $S$ is the estimated covariance matrix of the PLS model scores, $A$ the number of latent variables retained in the model, and $F_{A,n-A}$ the F-distribution value. The control limits on the SPE charts can be calculated by an approximation of the $\chi^2$ distribution given as $SPE_\alpha = g\,\chi^2_{h,\alpha}$ [76]. This equation is well approximated as [148, 255, 435]

$$SPE_\alpha \approx g\,h \left[ 1 - \frac{2}{9h} + z_\alpha \left( \frac{2}{9h} \right)^{1/2} \right]^3$$

where $g$ is a weighting factor, $h$ the degrees of freedom of the $\chi^2$ distribution, and $z_\alpha$ the standard normal deviate at significance level $\alpha$. The first two can be approximated as $g = v/(2m)$ and $h = 2m^2/v$, where $v$ is the variance and $m$ the mean of the SPE values from the PLS model. All of the aforementioned calculations are illustrated in the following example.
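The limit calculations described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the formulas are the commonly used forms written out in each docstring, the critical values of the t- and F-distributions are supplied by the caller (for example from statistical tables), and the sample variance is assumed for $v$.

```python
from statistics import NormalDist, mean, variance


def score_limits(s_est, n, t_crit):
    """Limits for a new score: +/- t_crit * s_est * sqrt(1 + 1/n).

    t_crit is the Student t critical value with n - 1 degrees of freedom
    at level alpha/2, looked up by the caller (assumption of this sketch).
    """
    half_width = t_crit * s_est * (1.0 + 1.0 / n) ** 0.5
    return -half_width, half_width


def t2_limit(n, n_lv, f_crit):
    """T^2 limit: A (n^2 - 1) / (n (n - A)) * F, with A = n_lv.

    f_crit is the F critical value with (A, n - A) degrees of freedom,
    again supplied by the caller.
    """
    return n_lv * (n * n - 1.0) / (n * (n - n_lv)) * f_crit


def spe_limit(spe_values, alpha=0.05):
    """SPE limit from g = v/(2m), h = 2m^2/v and the normal approximation
    to the weighted chi-squared distribution g * chi2(h)."""
    m = mean(spe_values)
    v = variance(spe_values)               # sample variance (assumption)
    g = v / (2.0 * m)                      # weighting factor
    h = 2.0 * m * m / v                    # effective degrees of freedom
    z = NormalDist().inv_cdf(1.0 - alpha)  # upper-tail standard normal deviate
    c = 2.0 / (9.0 * h)
    return g * h * (1.0 - c + z * c ** 0.5) ** 3
```

With $n = 100$ in-control observations and $A = 2$ latent variables, one would call `score_limits(s_est, 100, t_crit)` and `t2_limit(100, 2, f_crit)` with critical values looked up for 99 and (2, 98) degrees of freedom, and `spe_limit` with the SPE values of the in-control window.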

Example. Consider a continuous fermentation process in which monitoring is based on how well the process performs with respect to product quality. Assume that ten process variables, such as aeration rate and substrate feed rate, are used for the X block, and one quality variable, product concentration, for the Y block. As the first step of PLS modelling, the outliers are removed and both blocks are scaled appropriately (autoscaling is used in this case). A PLS model is built to relate the ten process variables to the one quality variable. A data window of 100 observations is taken as in-control operation. To decide the number of latent variables to be retained in the model, PRESS and CUMPRESS values are calculated based on cross-validation (Figure 6.12).
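The autoscaling and cross-validation steps can be illustrated with a small pure-Python sketch. It assumes a single quality variable (PLS1 fitted by NIPALS), interleaved K-fold cross-validation, and PRESS accumulated in the original units of y. The function names and the synthetic data are hypothetical and are not the fermentation data of the example. CUMPRESS is left out because its definition is not given in this excerpt.

```python
import random


def autoscale_params(rows):
    """Column means and sample standard deviations of a list of rows."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / (n - 1)) ** 0.5
            for c, m in zip(cols, means)]
    return means, stds


def autoscale(rows, means, stds):
    """Mean-center each column and scale it to unit variance."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def pls1_fit(X, y, n_lv):
    """NIPALS PLS for one y; returns (w, p, q) for each latent variable."""
    X, y, model = [r[:] for r in X], y[:], []
    for _ in range(n_lv):
        w = [sum(r[j] * yi for r, yi in zip(X, y)) for j in range(len(X[0]))]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]                               # weights
        t = [sum(a * b for a, b in zip(r, w)) for r in X]       # scores
        tt = sum(v * v for v in t)
        p = [sum(r[j] * ti for r, ti in zip(X, t)) / tt
             for j in range(len(w))]                            # X loadings
        q = sum(yi * ti for yi, ti in zip(y, t)) / tt           # y loading
        X = [[v - ti * pj for v, pj in zip(r, p)] for r, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]               # deflation
        model.append((w, p, q))
    return model


def pls1_predict_path(model, x):
    """Scaled predictions for one scaled x using 1, 2, ... latent variables."""
    x, y_hat, path = x[:], 0.0, []
    for w, p, q in model:
        t = sum(a * b for a, b in zip(x, w))
        y_hat += q * t
        x = [v - t * pj for v, pj in zip(x, p)]
        path.append(y_hat)
    return path


def press_curve(X, y, max_lv, n_folds=5):
    """PRESS for 1..max_lv latent variables by K-fold cross-validation."""
    press = [0.0] * max_lv
    for k in range(n_folds):
        test = set(range(k, len(X), n_folds))
        X_tr = [r for i, r in enumerate(X) if i not in test]
        y_tr = [[v] for i, v in enumerate(y) if i not in test]
        mx, sx = autoscale_params(X_tr)      # scaling from training data only
        my, sy = autoscale_params(y_tr)
        y_scaled = [r[0] for r in autoscale(y_tr, my, sy)]
        model = pls1_fit(autoscale(X_tr, mx, sx), y_scaled, max_lv)
        for i in test:
            x = autoscale([X[i]], mx, sx)[0]
            for a, y_hat in enumerate(pls1_predict_path(model, x)):
                press[a] += (y[i] - (y_hat * sy[0] + my[0])) ** 2
    return press


# Synthetic demo data (hypothetical): 40 observations, 4 process variables.
rng = random.Random(0)
X_demo = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(40)]
y_demo = [2 * r[0] - r[1] + 0.1 * rng.gauss(0, 1) for r in X_demo]
press = press_curve(X_demo, y_demo, max_lv=3)
```

The number of latent variables is then chosen where adding another one no longer reduces PRESS appreciably. For time-ordered process data, contiguous blocks may be a better choice of folds than the interleaved split used here.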

Table 6.2. Percent variance captured by PLS model

LV no. | X-block: This LV | X-block: Total | Y-block: This LV | Y-block: Total

Only the first two latent variables are used in the monitoring procedure, since they explain 88.10% of the variation in Y (Table 6.2) and the decrease in the CUMPRESS value from adding the third latent variable is small (only an additional 0.83% of the variance of Y). A step decrease (30% off the set point) in the substrate feed rate was introduced after the 100th observation and lasted until the 150th observation (Figure 6.12). It is desired to detect this change based on its effects on the quality variable (product concentration). Both the SPE (Figure 6.12(c)) and $T^2$ (Figure 6.12(e)) charts for the X block detect this change on time. The biplot of the latent variables also shows an excursion from the in-control region defined by the ellipses, and the score values return to the in-control region after the change is over (Figure 6.12(b)). The SPE of the Y block shows an out-of-control situation as well (Figure 6.12(f)). Although the disturbance is over after the 150th observation (Figure 6.12(c)-6.12(e)), product quality seems to deteriorate because the prediction capability of the PLS model becomes poor after the 150th observation (Figure 6.12(d)), suggesting a change in the quality space that is different from the one reflected by the PLS model.
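The in-control ellipse of the score biplot and the $T^2$ chart express the same test when two latent variables are retained: a score pair lies outside the ellipse exactly when its $T^2$ value exceeds the limit. The sketch below illustrates this. The helper names are hypothetical, the scores are assumed to have zero mean (as they do for an autoscaled in-control window), and the limit is supplied by the caller.

```python
def covariance_2d(scores):
    """Sample covariance (s11, s12, s22) of in-control (t1, t2) score pairs."""
    n = len(scores)
    m1 = sum(s[0] for s in scores) / n
    m2 = sum(s[1] for s in scores) / n
    s11 = sum((s[0] - m1) ** 2 for s in scores) / (n - 1)
    s22 = sum((s[1] - m2) ** 2 for s in scores) / (n - 1)
    s12 = sum((s[0] - m1) * (s[1] - m2) for s in scores) / (n - 1)
    return s11, s12, s22


def t2_2d(t, cov):
    """T^2 = t' S^-1 t for one (t1, t2) pair, using the 2x2 inverse of S."""
    s11, s12, s22 = cov
    det = s11 * s22 - s12 * s12
    return (s22 * t[0] ** 2 - 2.0 * s12 * t[0] * t[1] + s11 * t[1] ** 2) / det


def out_of_control(score_series, cov, limit):
    """Indices of observations whose scores fall outside the T^2 ellipse."""
    return [i for i, t in enumerate(score_series) if t2_2d(t, cov) > limit]
```

In a setting like the example, `cov` would be estimated from the scores of the 100 in-control observations, and `out_of_control` applied to the scores of later observations would be expected to flag those affected by the step change.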
