Figure 5.18. Experimental setup of the fermentation system.

measurement do not have an explicit mutual dependence in model equations. When the classification scheme indicates Coupled measurements, we write those measurements in each other's model equations. If the scheme indicates Redundant measurements, we claim that measuring one coordinate yields a considerable amount of information about the other; hence measuring both, or considering both in a modeling attempt, is not necessary.

In the classification scheme, we use the mutual information content between the measurements, I, normalized to the data length (see footnote 3), the fractal dimension of the data, d, and the correlation coefficient, ρ. The reasoning behind this classification scheme is as follows: if two arrays of data fill the phase space they live on (d > 1), they are likely to be Independent, as dependent

3. This normalization is done by changing the base of the logarithm in Eq. 5.22 from 2 to the data length N.

Figure 5.19. Sample experimental data.

variables would make a loosely quilted pattern, leaving tracks on the phase space. At the other extreme, if the variables reveal a dense pattern yielding no significant information about each other (d < 0.5 and I < 0.5), they are also considered to be Independent. If two arrays of data leave tracks on the phase space by moderately filling it, display a considerable amount of information about each other, and are highly correlated (0.5 < d < 1, I > 0.5 and |ρ| > 0.6), then one of the two arrays can be discarded in favor of the other, since measuring both would be redundant. For all other combinations of I, d and ρ, the two arrays of data are considered to be Coupled.
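The decision rules above can be written down compactly. The following is a minimal sketch, assuming the thresholds exactly as stated in the text; the function name `classify_pair` and its argument names are hypothetical and do not come from the original.

```python
def classify_pair(d, info, rho):
    """Heuristic class of a measurement pair (hypothetical helper).

    d    -- fractal dimension of the pair's data
    info -- mutual information normalized to the data length
    rho  -- correlation coefficient
    """
    # The data fill the phase space: likely Independent.
    if d > 1.0:
        return "Independent"
    # Dense pattern carrying little information about each other.
    if d < 0.5 and info < 0.5:
        return "Independent"
    # Moderate filling, informative and highly correlated: Redundant.
    if 0.5 < d < 1.0 and info > 0.5 and abs(rho) > 0.6:
        return "Redundant"
    # Every other combination of d, I and the correlation coefficient.
    return "Coupled"


for triple in [(1.2, 0.1, 0.0), (0.8, 0.7, -0.9), (0.8, 0.7, 0.2)]:
    print(triple, classify_pair(*triple))
```

The text does not say how values lying exactly on a threshold (for example d = 0.5) are treated; in this sketch they fall through to Coupled.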

Our measurement space is 9-dimensional, and our measurement vector is composed of samples of the vector [X D X S 1Z Q V a 7]^T. When we compute the capacity dimension of this signal, we find d ≈ 2.98. This capacity dimension yields a sufficient embedding dimension of n = 6. On the other hand, due to statistical fluctuations, the computed capacity dimension could just as well have come out slightly above 3.0, in which case we would compute an embedding dimension of n = 7. However, this is not the case, as the actual dimension of this signal must be an integer (3 in this case) due to the assumed non-chaotic nature of the signal. Therefore, we choose n = 6.
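The three cases discussed above (d just below 3, d just above 3, and d exactly 3) are all consistent with taking the smallest integer n that satisfies n ≥ 2d. That rule is an assumption inferred from the quoted numbers, not a formula stated in this section, and the function name below is hypothetical. A minimal sketch:

```python
import math


def sufficient_embedding_dimension(d):
    """Smallest integer n with n >= 2*d (assumed rule, see the note above)."""
    return math.ceil(2.0 * d)


# Capacity dimensions discussed in the text: the estimate, a slightly
# larger value that fluctuations could have produced, and the integer value.
for d in (2.98, 3.02, 3.0):
    print(d, sufficient_embedding_dimension(d))
```

Under this assumed rule, a small fluctuation of the estimate across d = 3 changes n by one, which is why the text falls back on the integer value of the dimension.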

This choice of embedding dimension for a 9-dimensional signal implies that at least 3 of the entries in the measurement vector should be discarded. In agreement with this finding, if we look at the eigenvalues of the covariance matrix (see Section 4.1),

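$$\{\sigma_i\} = \{3.98 \times 10^{4},\ 5.10 \times 10^{3},\ 9.45 \times 10^{1},\ 4.81 \times 10^{1},\ 1.70 \times 10^{0},\ 7.18 \times 10^{-1},\ 1.77 \times 10^{-2},\ 2.77 \times 10^{-3},\ 3.07 \times 10^{-4}\}, \tag{5.40}$$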

we see that the last three eigenvalues are negligible compared to the others. If we perform a principal component analysis and select the first six dominant transformed coordinates, we will have a mean-square error of less than $4.63 \times 10^{-5}\,\%$, which is much less than the square of the radius of error caused by the sensitivity of our measurements, which is around $10^{-2}\,\%$.
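The order of magnitude of this error can be checked directly from the eigenvalues in (5.40). The sketch below assumes that the mean-square error is the fraction of the total variance carried by the discarded principal components, expressed in percent; that definition and the function name are assumptions, not taken from the text.

```python
# Eigenvalues of the covariance matrix as quoted in Eq. (5.40).
eigenvalues = [3.98e4, 5.10e3, 9.45e1, 4.81e1, 1.70e0,
               7.18e-1, 1.77e-2, 2.77e-3, 3.07e-4]


def truncation_error_percent(eigs, keep):
    """Share of total variance (in %) lost when only `keep` components are kept."""
    ordered = sorted(eigs, reverse=True)   # dominant components first
    discarded = sum(ordered[keep:])        # variance of the dropped components
    return 100.0 * discarded / sum(ordered)


error_pct = truncation_error_percent(eigenvalues, keep=6)
print(f"relative mean-square error: {error_pct:.3e} %")
```

With the rounded eigenvalues listed above, the result stays below the bound quoted in the text.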

Here, we are not after the best reduced representation, but after the most significant measured quantities in our data set. Naturally, a subset composed of the most significant measured quantities will yield a higher mean-square error in representing the data set. Nevertheless, it is desirable to keep this error as low as possible. We are to select three of the nine coordinates to discard, such that the mean-square error is minimum. This gives us $\binom{9}{3} = 84$ possible ways to choose these three coordinates. To facilitate this process, consider Table 5.3, where we summarize the results of the dimension (d), mutual information coefficient (I) and correlation coefficient (ρ) computations, as well as the outcome of the heuristic classification scheme (Class), for our data pairs. Note that d, I and ρ are symmetric quantities, so the order in which the two series of a pair are referenced is immaterial; i.e., d, I and ρ for VQ are the same as d, I and ρ for QV.
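To illustrate how two of the tabulated quantities might be estimated for a pair of series, and why they do not depend on the order of the pair, here is a minimal sketch. It assumes a simple equal-width histogram estimator of the mutual information with the logarithm taken to base N, the data length; Eq. 5.22 is not reproduced in this section, so the estimator, the bin count and all function names are assumptions. The data are synthetic.

```python
import math
from collections import Counter


def _bin_indices(values, bins):
    """Map each value to the index of an equal-width histogram bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in values]


def normalized_mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information, logarithm to base N = len(x)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    ix, iy = _bin_indices(x, bins), _bin_indices(y, bins)
    joint, px, py = Counter(zip(ix, iy)), Counter(ix), Counter(iy)
    total = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        total += p_ab * math.log(p_ab / ((px[a] / n) * (py[b] / n)), n)
    return total


def correlation_coefficient(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic example: y depends linearly on x.
x = [float(i) for i in range(64)]
y = [2.0 * v + 1.0 for v in x]
print(normalized_mutual_information(x, y), normalized_mutual_information(y, x))
print(correlation_coefficient(x, y), correlation_coefficient(y, x))
```

Swapping the two series leaves both estimates unchanged, which is the symmetry property noted above. The fractal dimension d is not estimated in this sketch.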

The data series pairs US, Q1Z and IX show redundancies, since for all three pairs 0.5 < d < 1, I > 0.5 and |ρ| > 0.6. Therefore, in each pair, one coordinate can be discarded in favor of the other. Thus, we have six possible ways to choose the coordinates to be discarded. Dropping coordinates V, Q and X from the measurement vector results in a reduced covariance matrix with eigenvalues

$$\{\sigma_i\} = \{3.98 \times 10^{4},\ 5.10 \times 10^{3},\ 9.45 \times 10^{1},\ 4.81 \times 10^{1},\ 1.68 \times 10^{0},\ 7.12 \times 10^{-1}\}. \tag{5.41}$$

Comparing the eigenvalue sets (5.40) and (5.41), we find that dropping these three coordinates gives a mean-square error of $2.41 \times 10^{-4}\,\%$, which is about an order of magnitude greater than that achieved by a principal component analysis representing the 9-dimensional space by its first 6 coordinates. Still, this is much less than the square of our measurement sensitivity radius. We find this proximity satisfactory and reduce the phase space to [X S K V a i]^T. This phase space reduction cuts the measurement effort of any such future experimental work on this system by 33%.

Hereafter we concentrate on the entries below the double line in Table 5.3, where only the relations between the coordinates of the reduced phase space are considered. Looking at the mutual interactions between

Pair | d | I | ρ | Class
