# Figure 5.12. Two projections of the chaotic orbit of the autocatalysis system.

the system, as accurately as possible, using the information in the (usually scalar) time series. This reconstruction will result in vectors in an m-dimensional space that unfolds the structure the orbits follow in the multidimensional phase space. Therefore, the focus now is how to choose the components of the m-dimensional vectors and, of course, how to determine the value of m itself.

The answer lies in a combination of concepts from dynamics, about nonlinear systems as generators of information, and ideas from geometry, about how one unfolds an attractor using coordinates established on the basis of their information content. The result of this operation will be a set of m-dimensional vectors that replace the original scalar data we have filtered.

Although multidimensional measurements are becoming more common because of the wide availability of computer-driven data acquisition systems, such measurements often do not cover all degrees of freedom of the underlying dynamics. Furthermore, scalar measurements still constitute the majority of recorded time series data.

The most commonly used phase space reconstruction technique utilizes the so-called delay coordinates. If we represent the measured scalar time series by $\{y_i\}$, then we can reconstruct a phase space using

$$\mathbf{x}_i = \left[\, y_{i-(m-1)\tau},\; y_{i-(m-2)\tau},\; \ldots,\; y_i \,\right]^T \tag{5.21}$$

where m is called the embedding dimension, and τ the time delay. The embedding theorems of Takens and Sauer et al. show that, under some conditions, if the sequence $\{y_i\}$ is representative of a scalar measurement of a state, and m is selected large enough, the time delay coordinates provide a one-to-one image of the orbit of the underlying dynamics.
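The delay-coordinate construction can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `delay_embed` and the use of NumPy are assumptions introduced here, and the delay `tau` is taken to be an integer number of samples. For m = 3 and τ = 2, for example, each vector is [y_{i-4}, y_{i-2}, y_i].

```python
import numpy as np

def delay_embed(y, m, tau):
    """Build delay vectors x_i = [y[i-(m-1)*tau], ..., y[i-tau], y[i]].

    Returns an array with one delay vector per row. `delay_embed` is a
    hypothetical helper name; `tau` is measured in samples.
    """
    y = np.asarray(y, dtype=float)
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    start = (m - 1) * tau              # first index i with a complete history
    n_vectors = len(y) - start
    if n_vectors <= 0:
        raise ValueError("time series too short for this m and tau")
    # Column k holds y[i - (m-1-k)*tau] for i = start, ..., len(y)-1.
    return np.column_stack([y[k * tau : k * tau + n_vectors] for k in range(m)])

# Usage on a toy series 0, 1, ..., 9 with m = 3 and tau = 2.
y = np.arange(10.0)
X = delay_embed(y, m=3, tau=2)
print(X.shape)   # (6, 3)
print(X[0])      # [0. 2. 4.]
```

Note that a series of N samples yields only N − (m − 1)τ vectors, which is one reason a finite data set limits how large m and τ can be.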

Example 6 Time delay representation of the blood oxygen concentration signal

Consider the blood oxygen concentration (measured by ear oximetry) data set recorded from a patient in the sleep laboratory of the Beth Israel Hospital in Boston, Massachusetts (Figure 5.13.a). The data were part of the Santa Fe Institute Time Series Prediction and Analysis Competition in 1991, and came from a patient with sleep apnea. The recording shows the patient taking a few quick breaths and then not breathing for up to 45 seconds. If we could develop a viable low-dimensional model for the system, we could predict stoppage of breathing from the preceding data, which would be a medically significant application.

Consider a time delay representation of the blood oxygen concentration in two dimensions. If we select a time delay of 2 seconds, the autocorrelation of the data overshadows the representation: successive values are nearly equal, so the points crowd along the diagonal (Figure 5.13.b). Selecting a larger time delay of 25 seconds presents a more spread-out signal in the reconstructed phase space (Figure 5.13.c).

It is apparent that, if the sampling frequency is much higher than the frequency of the dynamical fluctuations of the system, choosing too small a time delay results in highly correlated state variables. On the other hand, since our data set is of finite length, we cannot choose too large a time delay either. Thus, there should be an optimum way of selecting this parameter. Furthermore, if we select too low a dimension for the reconstruction space, the orbits will intersect, and we will not be able to untangle the dynamics. Many authors point out that it is safe, in terms of representing the dynamics in a multidimensional phase space, to select a large enough dimension m. However, since our goal is to model the dynamics (as opposed to merely representing it), we should seek the smallest m that untangles the dynamics. Again, there should be an optimum way of selecting m.
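The effect of the delay on the correlation between the two delay coordinates can be checked numerically. The sketch below uses a synthetic, heavily oversampled sine wave as a stand-in signal (an assumption made for illustration; it is not the oximetry data), and the helper name `lagged_correlation` is hypothetical. It computes the linear correlation coefficient between y(t − τ) and y(t) for a few integer delays.

```python
import numpy as np

def lagged_correlation(y, tau):
    """Pearson correlation between y(t - tau) and y(t), tau >= 1 in samples."""
    y = np.asarray(y, dtype=float)
    if tau < 1 or tau >= len(y):
        raise ValueError("tau must satisfy 1 <= tau < len(y)")
    # y[:-tau] is the delayed copy, y[tau:] the current values.
    return np.corrcoef(y[:-tau], y[tau:])[0, 1]

# Hypothetical test signal: a sine with a period of 100 samples,
# i.e. sampled much faster than it fluctuates.
t = np.arange(2000)
y = np.sin(2 * np.pi * t / 100)

for tau in (1, 5, 25):
    print(tau, round(lagged_correlation(y, tau), 3))
```

For this signal, a delay of one sample gives a correlation close to 1, so the two coordinates carry almost the same information, while a delay of a quarter period (25 samples) gives a correlation close to 0.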

First, consider the choice of the time delay τ. If we were to make multivariate measurements on the patient of the previous example, we would prefer measuring his heart rate rather than his blood oxygen concentration measured from his arm. In other words, we would not like to measure closely related (or, in mathematical terms, correlated) quantities. Based on this idea, some authors (cf. ) propose the smallest time delay that minimizes the correlation between y(t − τ) and y(t). Others  argue that, since the underlying dynamics is nonlinear and the correlation coefficient is a linear concept, we should select the time delay by monitoring the mutual information content of the series y(t − τ) and y(t), which quantifies the amount of information gathered (in bits) about the signal y(t) by measuring y(t − τ). The mutual information content between these signals is defined as,