Statistical Design of Experiments

Experiments are frequently performed to assess the effects of inputs, operating conditions, and process changes on the outputs. For example, knowing the effects of variations in fermentation temperature, air flow rate, or strain type on the attributes of a product would provide valuable information for optimizing productivity. Experiments are costly since they consume time and raw materials. Properly planned experiments minimize unnecessary duplication, generate more information with fewer experiments, and reduce the effects of measurement noise on the data used for analysis. Statistically designed experiments provide unambiguous results at minimum cost and provide information about interactions among variables [213]. An experiment is a means for drawing inferences about the real world, and care must be exercised to define the scope of the experiment broadly enough to include all conditions that the experimenter wishes to consider.

Most technical personnel focus on generating information from data, an activity that is called statistical analysis. However, equal attention should be given to generating informative data. The process of planning experiments to generate the maximum amount of information with the minimum number of experiments is called statistical design of experiments (DOE). The objectives of DOE can be:

• To compare a set of treatments to determine whether their effects differ on some response (output) of interest

• To establish cause and effect relationships between outputs (responses, dependent variables) and inputs (factors, independent variables)

• To identify the most important inputs

• To identify interactions between inputs

• To identify improved settings for inputs to optimize the outputs

• To estimate empirical relationships between inputs and outputs.

The amount of data needed to cover this wide range of objectives, which include comparison, screening, regression, and optimization, varies from one process to another. Consequently, the methods selected to design the experiments depend on the objective. Exploratory experiments can be designed as an iterative process in which additional experiments are designed based on insight gained from analyzing the results of prior experiments. The literature on design of experiments is quite rich. Some of the popular books include [78, 370, 401]. More sophisticated design and analysis techniques include response surface analysis [77, 407], multivariate design of process experiments [277], and various types of advanced designs used, for example, in the pharmaceutical industry for drug discovery, where very large numbers of configurations must be screened rapidly. The discussion in this section is limited to screening experiments, in which the most influential inputs and interactions are determined. Two-level factorial designs are of great practical importance for comparison and screening studies. They are discussed in this section to underline the wealth of information that can be extracted from a process by a proper design and to contrast them with the favorite approach of most technical people, one-variable-at-a-time (OVAT) experimentation.

The OVAT approach involves varying the level of one input (with the levels of all other inputs held fixed) to find the input level that yields an optimal response. This procedure is then repeated for each of the remaining inputs, with the inputs that were varied in previous sets of experiments kept at the levels that gave optimal responses. The OVAT procedure can be carried out for several iterations. The OVAT approach necessitates more experiments than experimental plans based on factorial design. Experiments must be duplicated and the results must be averaged to reduce the effects of measurement errors, which further increases the number of experiments conducted with the OVAT approach. As illustrated later, the averaging process is an integral part of the analysis of data collected by experimental plans based on factorial design. Furthermore, the OVAT approach does not provide information on the impact of the interaction of inputs on the response. Consequently, the OVAT approach should be avoided as much as possible.
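The point about interactions can be made concrete with a small sketch. The four response values below are invented purely for illustration (they are not data from the text), the two factors are coded as -1 and +1, and the function names `ovat` and `factorial_2x2` are hypothetical. The sketch ignores measurement noise and replication, so it speaks only to the interaction argument and not to the total number of experiments.

```python
from itertools import product

# Hypothetical responses (invented for illustration) at the four corners
# of a two-factor, two-level design; keys are coded levels (x1, x2).
response = {(-1, -1): 60.0, (+1, -1): 55.0, (-1, +1): 58.0, (+1, +1): 80.0}


def ovat(response, start=(-1, -1)):
    """One-variable-at-a-time search: vary each factor in turn, keep the best level."""
    current = list(start)
    visited = {tuple(current)}
    for k in range(len(current)):
        best_level, best_y = current[k], response[tuple(current)]
        for level in (-1, +1):
            trial = current.copy()
            trial[k] = level
            visited.add(tuple(trial))
            if response[tuple(trial)] > best_y:
                best_level, best_y = level, response[tuple(trial)]
        current[k] = best_level  # freeze this factor at its best level
    return tuple(current), response[tuple(current)], visited


def factorial_2x2(response):
    """Full 2^2 factorial: best corner plus main and interaction effects."""
    runs = list(product((-1, +1), repeat=2))
    best = max(runs, key=lambda r: response[r])
    half = len(runs) / 2
    # Each effect is (average response at +) minus (average response at -).
    effect_x1 = sum(r[0] * response[r] for r in runs) / half
    effect_x2 = sum(r[1] * response[r] for r in runs) / half
    effect_x1x2 = sum(r[0] * r[1] * response[r] for r in runs) / half
    return best, response[best], (effect_x1, effect_x2, effect_x1x2)


ovat_best, ovat_y, ovat_runs = ovat(response)
fact_best, fact_y, effects = factorial_2x2(response)
print("OVAT:", ovat_best, ovat_y, "using", len(ovat_runs), "distinct runs")
print("Factorial:", fact_best, fact_y, "effects (x1, x2, x1*x2):", effects)
```

With these invented numbers, the OVAT search never leaves the starting corner (-1, -1), because changing either factor alone lowers the response. The factorial plan visits (+1, +1), where the response is highest, and it also yields an interaction effect (13.5) that is larger than either main effect (8.5 and 11.5).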

Design of experiments to collect data for building empirical dynamic models of processes is another challenging problem. Here the focus is on designing input sequences that have specific characteristics so that the process is excited properly to generate data rich in information. This problem has been studied in the systems science, system identification and process control communities. The interested reader is referred to [346, 558].

3.3.1 Factorial Design

In any process, there may be a large number of input variables (factors) that may be assumed a priori to affect the process. Screening experiments are conducted to determine the inputs and interactions of inputs that influence the process significantly. In general, the relationship between the inputs and outputs can be represented as

$$y = f(x_1, x_2, \ldots, x_p) + e \tag{3.1}$$

where $x_i$, $i = 1, \ldots, p$, are the factors, $e$ is the random and systematic error, and $y$ is the response variable. Approximating this equation by a Taylor series expansion,

$$y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p + b_{12} x_1 x_2 + \cdots + b_{ij} x_i x_j + \cdots + b_{jp} x_j x_p + \cdots + b_{11} x_1^2 + \cdots + b_{ii} x_i^2 + \cdots + b_{pp} x_p^2 + \text{Higher Order Terms} + e \tag{3.2}$$

a polynomial response surface model is obtained, where $b_i$ denotes the parameters of the model. The first task is to determine the factors ($x_i$) and the interactions ($x_i x_j$, $x_i x_j x_k$, and higher order interactions) that influence $y$. Then, the coefficients such as $b_i$, $b_{ij}$, and $b_{ijk}$ of the influential inputs and interactions are computed. These parameters of the response surface model can be determined by least squares fitting of the model to experimental data. Several decisions have to be made before designing the experiments. A detailed summary of the decision making process is given in [88].
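As a minimal sketch of the least squares step, the code below fits a truncated form of Eq. (3.2) with two factors, y = b0 + b1 x1 + b2 x2 + b12 x1 x2, by solving the normal equations. The data are invented for illustration and generated without noise from known coefficients, so the fit should simply recover them. The helper names (`design_row`, `solve`, `least_squares`) are hypothetical, and the small hand-written solver is used only to keep the example dependency-free. In practice, a library routine such as `numpy.linalg.lstsq` would normally be preferred.

```python
def design_row(x1, x2):
    """Regressors for y = b0 + b1*x1 + b2*x2 + b12*x1*x2 (truncated Eq. 3.2, p = 2)."""
    return [1.0, x1, x2, x1 * x2]


def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):  # back-substitution
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def least_squares(rows, y):
    """Least-squares estimate from the normal equations (X^T X) b = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data (invented for illustration): generated from
# y = 10 + 2*x1 - 3*x2 + 1.5*x1*x2 without noise.
points = [(-1, -1), (1, -1), (-1, 1), (1, 1), (0, 0), (0.5, -0.5)]
y = [10 + 2 * a - 3 * b + 1.5 * a * b for a, b in points]
coeffs = least_squares([design_row(a, b) for a, b in points], y)
print("b0, b1, b2, b12 =", [round(c, 6) for c in coeffs])
```

With real, noisy measurements the estimates would not match the underlying coefficients exactly. Judging which terms are significant would then require replication or some other estimate of the experimental error.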

This section presents the two-level factorial design approach to select the conditions for conducting screening experiments that determine the significant factors and interactions. To perform a general factorial design, the investigator selects a fixed number of levels (two in most screening experiments) for each factor and then runs experiments with all possible combinations of levels and variables. If there are $p$ factors, $2^p$ experiments must be conducted to cover all combinations. The number of experiments to be conducted grows rapidly with an increasing number of factors. While 8 experiments are needed for 3 factors, 64 experiments are necessary for 6 factors. The factors may be continuous variables such as substrate feed rate (R) or bioreactor temperature (T), or discrete variables such as the strain (S) of the inoculum. The low and high levels of continuous variables may be coded using the − and + signs or 0 and 1, respectively. Qualitative (discrete) variables limited to two choices are coded using the same nomenclature. The levels of inputs to be used in each experiment are listed in a design matrix (Table 3.1).
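The enumeration of all level combinations can be sketched in a few lines. The function name `full_factorial` and the choice to list runs with the first factor changing fastest are assumptions made for this illustration, and the ordering is not necessarily that of Table 3.1. The factor labels R, T, and S follow the text.

```python
from itertools import product


def full_factorial(factor_names, coding=(-1, +1)):
    """Return all 2^p runs of a two-level full factorial design as dicts.

    Runs are listed with the first factor changing fastest.
    """
    low, high = coding
    runs = []
    for levels in product((low, high), repeat=len(factor_names)):
        # product() varies the last position fastest, so reverse each tuple
        # to make the first factor the fastest-changing one.
        runs.append(dict(zip(factor_names, reversed(levels))))
    return runs


names = ["R", "T", "S"]
design = full_factorial(names)
print("Run  " + "  ".join(names))
for i, run in enumerate(design, start=1):
    print(f"{i:>3}  " + "  ".join("+" if run[n] > 0 else "-" for n in names))

# The run count doubles with every added factor.
print(len(full_factorial(names)), len(full_factorial(list("ABCDEF"))))
```

Passing `coding=(0, 1)` gives the same design in the 0 and 1 notation mentioned above.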

Two-level factorial designs are appealing for a number of reasons. They require relatively few experiments to indicate major trends in process operation and to determine promising directions for further experiments. They form the basis for two-level fractional factorial designs. They can be readily augmented to form composite designs; hence, they are building blocks for constructing efficient data collection strategies that match the complexity of the problem studied. The results of the experiments can be interpreted using simple algebra and computations. The interpretation of experimental results by discovering the significant factors and interaction effects is illustrated below by an example.

Example. A set of screening experiments are conducted in a laboratory scale fermenter and separation system to determine the effects of substrate

Table 3.1. Three alternative notations for $2^3$ full factorial designs
