
Design of Experiments Applications in Bioprocessing: Concepts and Approach

Vijesh Kumar, Akriti Bhalla, and Anurag S. Rathore


Dept. of Chemical Engineering, Indian Institute of Technology, IIT Delhi, Hauz Khas, New Delhi 110016, India

DOI 10.1002/btpr.1821
Published online October 7, 2013 in Wiley Online Library (wileyonlinelibrary.com)

Most biotechnology unit operations are complex in nature with numerous process variables, feed material attributes, and raw material attributes that can have significant impact on the performance of the process. A design of experiments (DOE)-based approach offers a solution to this conundrum and allows for an efficient estimation of the main effects and the interactions with a minimal number of experiments. Numerous publications illustrate application of DOE towards development of different bioprocessing unit operations. However, a systematic approach for evaluation of the different DOE designs and for choosing the optimal design for a given application has not been published yet. Through this work we have compared the I-optimal and D-optimal designs to the commonly used central composite and Box–Behnken designs for bioprocess applications. A systematic methodology is proposed for construction of the model and for precise prediction of the responses for three case studies involving some of the commonly used unit operations in downstream processing. Use of the Akaike information criterion for model selection has been examined and found to be suitable for the applications under consideration. © 2013 American Institute of Chemical Engineers

Biotechnol. Prog., 30:86–99, 2014

Keywords: design of experiments, bioprocessing, Plackett–Burman, central composite, Box–Behnken, D-optimal, I-optimal

Introduction

Design of experiments (DOE) is an approach that involves systematic and efficient examination of multiple variables simultaneously to create an empirical model that correlates the process responses to the various factors (process variables and material attributes). The experimental design is used to minimize the relative variance in estimation of model parameters, followed by the statistical analysis to filter out the actual values from the various errors that exist in the system. Both steps are closely interlinked, as the method of analysis and its reliability directly depend on the design. DOE has been shown to be far more efficient and effective than the traditional one-factor-at-a-time approach.

In most cases, and in bioprocessing in particular, there is a large number of factors that can affect the responses, and for such cases DOE is used in two stages. The first stage is called screening and involves identification of factors that have a statistically significant effect on the process. The second stage involves prediction of the response surfaces and finding the optimal set points. Elimination of insignificant factors in the first step helps to reduce the experimental effort required in the second step. Screening designs such as Plackett–Burman (PB),1–4 two level factorial,1–3 fractional factorial,1–3,5–8 and in some cases supersaturated designs9 have been widely used. For precise prediction of the response surface, designs including the Box–Wilson central composite (CC),10–12 Box–Behnken (BB),13–15 and three level full-factorial16–18 have been commonly used. Recently, I-optimal (IO), D-optimal (DO), and other computer generated designs have also found applicability in process development.19–22 For example, DO has been used for defining irregular design space where conventional designs cannot be used.22

Although the use of DOE for development and optimization of biotech processes has significantly increased over the last two decades, most publications present the outcome of a DOE for a given application.23–29 A thorough examination of the different designs for biotech applications has not been reported yet. This article aims to examine the suitability of the different designs for biotech applications. Guidance has been provided for selection of the appropriate design that would yield precise prediction of the response. Three case studies involving bioprocess unit operations have been used to elucidate the underlying concepts.

Material and Methods

Anion exchange chromatography (case study I)

A 96 well plate with Q-Sepharose (6 mL, GE Healthcare) was used to purify the therapeutic protein granulocyte colony-stimulating factor (GCSF). A standard protocol was followed to perform the process.30 Different elution conditions were used to optimize step recovery. Factors examined in this study included pH (7, 7.75, and 8.5), buffer molarity (20, 35, and 50 mM), and protein loading concentration (20, 35, and 50 mg/mL). The buffer used for the equilibration, washing, and elution steps was Tris-Cl. The salt used for elution was 1 M NaCl.

Correspondence concerning this article should be addressed to A. Rathore at [email protected].

Table 1. DOE designs that were examined in this article and their average variance of prediction

Design points (coded)    Response (Y)                                     Design membership (冑; columns FF, CC, BB, DO 16*, IO 14*, DO 14*)
 A    B    C     Case I   Case II   Case III   Case IV
−1   −1   −1     68       28.6      69.4       65.9       冑 冑 冑 冑 冑
−1   −1    0     62.2     43.3      76.2       65.3       冑 冑 冑
−1   −1    1     51.7     51.5      76.7       65.1       冑 冑 冑 冑
−1    0   −1     29.4     48.6      74.2       69.4       冑 冑
−1    0    0     27.5     53.9      77.5       63.7       冑 冑 冑 冑
−1    0    1     19.5     55.5      74.4       62.9       冑 冑 冑
−1    1   −1     26.7     46.9      81.6       70.4       冑 冑 冑 冑 冑
−1    1    0     18.6     49.1      90.6       71.1       冑 冑
−1    1    1     14.2     48.0      84         64.7       冑 冑 冑 冑 冑
 0   −1   −1     91.5     55.5      105.8      67.9       冑 冑 冑
 0   −1    0     71.8     57.7      106.2      65.5       冑 冑
 0   −1    1     61.4     62.3      106.8      61.6       冑 冑 冑 冑
 0    0   −1     79.4     67.5      97.9       68.4       冑 冑 冑 冑
 0    0    0     –        –         100.5      67.6       冑 冑
 0    0    0     68.1     63.2      95.5       66.8       冑 冑 冑 冑 冑
 0    0    0     66.0     63.7      96.4       68.4       冑 冑 冑 冑
 0    0    1     62.8     63.9      101.9      65.8       冑 冑 冑
 0    1   −1     65.9     75.6      93         65.2       冑 冑
 0    1    0     57.9     74.5      95.6       63.5       冑 冑 冑 冑 冑
 0    1    1     47.3     67.0      97.7       64.2       冑 冑
 1   −1   −1     101.4    67.3      88         54.5       冑 冑 冑 冑
 1   −1    0     74.4     73.1      83.5       51.1       冑 冑 冑 冑
 1   −1    1     57.9     69.2      80.5       47.8       冑 冑 冑 冑
 1    0   −1     75.9     61.7      87.9       57.1       冑 冑 冑 冑
 1    0    0     47.9     65.0      86.4       55.4       冑 冑 冑
 1    0    1     54       65.0      81.7       56.5       冑 冑 冑 冑
 1    1   −1     75.7     66.6      77.6       50.2       冑 冑 冑 冑
 1    1    0     67.9     68.8      71         47.8       冑 冑 冑 冑
 1    1    1     85.6     62.6      69.9       45.3       冑 冑 冑 冑

RSM design average variance of prediction: FF 0.22, CC 0.37, BB 0.38, DO 16 0.47, IO 14 0.42, DO 14 0.67

Y is the response for the designs full factorial (FF), central composite (CC), Box–Behnken (BB), D-optimal with 16 and 14 points (DO 16 and DO 14), and I-optimal with 14 points (IO 14).
*Not unique designs: the design was selected from a few generated designs based on average variance of prediction. IO designs have more points at the center compared to DO.

Cation exchange membrane chromatography (case study II)

An Acrodisc Mustang-S 0.18 mL membrane (Pall Life Sciences) was used to purify GCSF. The process was performed in steps of equilibration, binding of protein, washing, and elution.31 Different elution conditions were examined to maximize the yield. Factors examined in this study were pH (4.85, 5.27, and 5.7), buffer molarity (20, 30, and 40 mM), and protein loading concentration (5, 7.5, and 10 mg/mL). The buffer used for the equilibration, washing, and elution steps was sodium acetate. The salt used for elution was 1 M NaCl. Experiments were performed on an Akta Purifier (GE Healthcare).

Refolding of GCSF (case studies III and IV)

After harvest, the cells were disrupted and the inclusion bodies (IB) were separated from cell debris and the soluble cell components. IB were solubilized in 8 M urea at pH 12 and allowed to refold using the dilution method.18 The solubilized IB were diluted 20 times in refolding buffer containing 25 mM Tris, 0.6 M arginine, 1 mM EDTA, and 5% sorbitol. Dilution was carried out in 20 min, followed by addition of cystine (prepared in 0.6 N HCl) and cysteine to final concentrations of 0.36 and 1.8 mM to allow the protein to refold. Optimization of refolding temperature (10, 15, and 20 °C), pH of the refolding buffer (7, 8, and 9), and cystine to cysteine ratio (3:1, 4.5:1, and 6:1) was performed.

Reversed phase high performance liquid chromatography (RP-HPLC) and size exclusion high performance liquid chromatography (SE-HPLC) analysis of GCSF

Concentration of GCSF in the chromatography outputs was determined by RP-HPLC using a 4.6 × 150 mm Zorbax Eclipse XDB C18 column (Agilent Technologies, Palo Alto, CA) with a Dionex Ultimate 3000 LC system. The mobile phase consisted of 0.1% (v/v) TFA in water (Solvent A) and 0.1% (v/v) TFA in 98% acetonitrile (Solvent B). Flow rate was maintained at 1 mL/min using a linear gradient of A to B, with detection at a wavelength of 214 nm.

Aggregates associated with GCSF were analyzed by analytical size exclusion chromatography using a 7.8 × 300 mm, 5-μm particle size column (TSKgel G3000SWXL from Tosoh Bioscience, Stuttgart, Germany) and were detected by UV diode array detection at 215 nm.

Design generation and model fitting

Design generation, data analysis, and model construction were performed with JMP® software from SAS. A 3×3×3 full factorial (FF) with two center points, CC, BB, I-optimal (IO 14), and D-optimal (DO 14, DO 16) designs were generated as shown in Table 1. As DO and IO are not unique designs, the design set was chosen from a number of algorithm trials so as to have points that are a subset of the FF. The same response values were used for the FF, CC, BB, IO 14, DO 14, and DO 16 designs if the points were similar. Replicates were used at the center of the design to estimate pure error. It was assumed that the pure error variance stays the same throughout the design and so no replicates were used other than the center points.
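Since the DO and IO designs used here are computer generated subsets of the full factorial candidate points, the sketch below gives a rough idea of how such a subset can be produced. It is an illustrative point-exchange sketch in Python, not the algorithm implemented in JMP; the run size, random start, and quadratic model terms are assumptions made only for the example.

```python
# Illustrative sketch: pick a 14-run D-optimal subset of the 3x3x3 candidate
# points by greedy point exchange, maximizing |X'X| for a quadratic model in
# the coded factors A, B, C. Not the JMP implementation used in the paper.
import itertools
import numpy as np

def model_matrix(points):
    # Quadratic response-surface terms: 1, A, B, C, AB, AC, BC, A^2, B^2, C^2
    a, b, c = points[:, 0], points[:, 1], points[:, 2]
    return np.column_stack([np.ones(len(points)), a, b, c,
                            a * b, a * c, b * c, a**2, b**2, c**2])

# Candidate set: the 27 points of the 3x3x3 factorial in coded levels
candidates = np.array(list(itertools.product([-1, 0, 1], repeat=3)), dtype=float)

rng = np.random.default_rng(0)
design = rng.choice(len(candidates), size=14, replace=False)  # random start

improved = True
while improved:
    improved = False
    for i in range(len(design)):
        for j in range(len(candidates)):
            trial = design.copy()
            trial[i] = j
            d_cur = np.linalg.det(model_matrix(candidates[design]).T
                                  @ model_matrix(candidates[design]))
            d_new = np.linalg.det(model_matrix(candidates[trial]).T
                                  @ model_matrix(candidates[trial]))
            if d_new > d_cur * (1 + 1e-9):   # accept only strict improvements
                design, improved = trial, True

print(candidates[design])  # selected 14 design points (coded A, B, C)
```

An I-optimal search would follow the same exchange loop but minimize the average prediction variance over the candidate grid instead of maximizing |X'X|, which is why IO designs tend to place more points near the center.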

All the input parameters were coded as −1, 0, and 1 for the low, mid, and high level settings, respectively. For Case studies I and II, the factors were named as pH: A, buffer strength: B, and protein concentration in feed: C. For Case studies III and IV, pH: A, temperature: B, and cysteine to cystine ratio: C. The process response for Case studies I, II, and III was yield, and for Case study IV it was purity.

Theory

DOE modeling

Factorial designs are frequently used to identify the main effects as well as interactions amongst the various factors. For quantitative factors, the data can be represented through the commonly used "linear regression model."1 For two factors, it can be represented as

y = b0 + b1x1 + b2x2 + b12x1x2 + e    (1)

where the b's are the regression coefficients. This first-order model can be generalized to a higher order model by addition of terms containing higher powers of x. In general, the method of least squares is used to estimate b̂, with the assumption that the expected value and the variance of the error (e) are E(e) = 0 and V(e) = σ², respectively. In matrix notation, the model can be represented as

y = Xb + e    (2)

where y, b, and e are column vectors of dimensions (n × 1), (p × 1), and (n × 1), respectively, X is an (n × p) matrix, and n is the number of observations. Further, p is the number of parameters in the model. The method chooses b̂ so that the sum of squares of the error e is minimized. The least squares estimate of b is then given by

b̂ = (X′X)⁻¹X′y    (3)

And, the fitted regression model is

ŷ = Xb̂    (4)

To evaluate the design and model statistically, it is necessary to estimate the variance (σ²). The variance–covariance matrix for b̂ is

Cov(b̂) = σe²(X′X)⁻¹    (5)

where the estimate of σe² is given by

σ̂e² = (y − Xb̂)′(y − Xb̂)/(n − p)    (6)

The numerator in Eq. 6 is the sum of squares of error (SSE). It is a measure of the discrepancy between the data and an estimation model.

DOE designs

Many different DOE designs have been used in the literature. In this article, we focus on FF, IO, DO, and response surface designs.

IO and DO Designs. Optimal designs are a class of experimental designs that are optimal with respect to some statistical criterion. They allow parameters to be estimated without bias and with minimum variance. In comparison to the conventional designs, IO and DO designs have no restraint on the number of experimental runs or the shape of the design space. If the average prediction variance is minimized for a design, IO designs are obtained. Another alternative is to have the smallest possible confidence interval for the model parameters (the same as minimizing the variance of the regression coefficients of the model). This can be achieved by minimizing the determinant |(X′X)⁻¹| in DO designs. DO is suitable for first-order models and for screening when the aim is to identify significant factors. Conversely, IO designs are chosen for calculation of response surfaces, as then the precise estimation of the response becomes more important than the precise estimation of the parameters. The design points of IO will be located more at the center of the design space, whereas those of DO will be at the edges. These designs are not unique as they depend on the initial value used in the computer algorithms, for example, the point exchange32 or coordinate exchange algorithm.33

Response Surface Designs. Response surface designs involve not just main effects and interactions. They may also have quadratic and possibly cubic terms to account for curvature. If the number of observations to be made is 16, the IO design is a CC (for the group of star points on the faces of the design space cube). For the special case of three factor designs, if the factor constraints are such that only two of them take the extreme level value (−1, 1), for example, (−1, 1, 0), (−1, 0, −1), (0, −1, 1) and so forth, then those designs are BB. BB designs are possible only for cases when we have three or more factors. Combination of CC and BB designs (for three levels and three factors) yields a three level FF design involving 27 experiments.

All of these designs can be evaluated using average prediction variance and variance inflation factor.21,34 As the actual error variance is not known before the experimental runs, σe is assumed to be equal to one and, for this case, Eq. 5 gives the relative error variance, which can be used for design evaluation.

Construction and evaluation of the model

The aim of DOE modeling is to represent experimental data by an empirical relation with a minimum number of statistically significant parameters. For example, some of the b's of the model represented by Eq. (1) may be negligibly small. This can be tested by the t-test as follows

t = (b̂i − bi)/√(Cov(b̂i))    (7)

This ratio follows a t-distribution when the mean value (bi) of the random variable is zero. This can be quantified by the P-value (typically <0.05 signals statistical significance). Further, significance of the full model can be tested by calculating the ratio of the mean square of the model versus the mean square of the error (F-ratio).
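To make Eqs. 3–7 concrete, the following Python sketch (our illustration; NumPy is assumed, and X and y stand for any model matrix and response vector rather than data from this study) computes the least squares estimate, the error variance, the variance–covariance matrix, the resulting t-ratios, and the relative prediction variance used to compare designs as in Table 1.

```python
import numpy as np

def fit_least_squares(X, y):
    """Ordinary least squares fit of y = Xb + e (Eqs. 3-6)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b_hat = XtX_inv @ X.T @ y                     # Eq. 3
    resid = y - X @ b_hat                         # y - X b_hat
    sse = resid @ resid                           # sum of squares of error
    sigma2_hat = sse / (n - p)                    # Eq. 6
    cov_b = sigma2_hat * XtX_inv                  # Eq. 5
    t_ratios = b_hat / np.sqrt(np.diag(cov_b))    # Eq. 7 with hypothesized b_i = 0
    return b_hat, sse, cov_b, t_ratios

def relative_prediction_variance(X_design, x0):
    """Relative variance of prediction at point x0 with the error variance
    taken as 1; averaging this over the design space gives the quantity used
    to compare designs (last row of Table 1)."""
    return x0 @ np.linalg.inv(X_design.T @ X_design) @ x0
```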

The model represented by Eq. 1 is a linear model and only significant factors and their interactions can be determined. If the factors are expected to follow a complex behavior in the design space, a response surface model is more appropriate. This model can be represented by quadratic, cubic, or their reduced forms with a larger number of parameters. As the number of parameters increases, the number of experimental runs also increases. This becomes a major constraint as the cost of development of the model increases.

Model construction is a compromise between bias and variance versus the number of parameters that can be estimated in the model (the principle of parsimony).35 As the number of parameters increases and the degrees of freedom are kept the same, the bias decreases but the variance increases. By the sparsity principle, it can be expected that most of the variability observed in the process output will be dominated by a few factors. This principle is used in screening DOE designs such as PB and supersaturated designs.36 For response surface models, it is also conceivable that only one or two factors have a quadratic or cubic function. So, it may be possible to design experiments based on a smaller number of parameters and thereby fewer experimental runs will be required. If prior knowledge about the effect of the factors does not exist, this will be a constraint. Alternatively, with the available degrees of freedom, we can fit an "approximate" model with the previously designed experiments. Though this may not be a "true model," as the actual effects of individual factors will be biased with one another, the model may still be useful for making predictions. This can be done by maximizing the fit by inclusion of terms which were not initially incorporated. This may lead to correlation among the estimated parameters and, at the same time, a better fit of the data and thus more precise predictions.

Evaluation of Model. The coefficient of determination, R² (SSmodel/SStotal), is a measure of unexplained or residual variability in the data as a percentage of the mean of the response variable. "Adjusted R²" corrects R² for addition of terms to the model. Predicted residual sum of squares (PRESS) measures how well the model is likely to predict the response in a new experiment. By adding more parameters in the model to explain the obtained responses, we may tend to over-fit the data along with the errors. In general, if the number of parameters is increased in a model to better fit the data, the PRESS value will decrease and the R² value will increase. So, these may not be true indicators in such cases. A better indicator would be the root mean square of errors (RMSE). It is a frequently used measure of the differences between values predicted by a model or an estimator and the values actually observed. If the value of RMSE is not close to the run-to-run variability, this may mean that the model is not accurate.

Using correlated terms for model building makes selection of parameters difficult as adding or dropping them will influence the significance of other parameters. In addition to this, there will be competing models for which PRESS, R², and statistical significance will be close. This will make it difficult to choose one model over the other. To have better statistical inference, an information criterion can be used along with the above mentioned criteria.

Information Criterion for Model Selection. For the linear regression method with a finite number of terms, the bias corrected version of the Akaike information criterion (AICc) can be used to differentiate competing statistical models. Akaike's proposal37,38 is based on the expected Kullback–Leibler (K-L) information. The best model will lose the least information relative to full reality. The full reality can never be realized in the true sense and hence the model selection criterion is that of minimizing the expected estimated K-L information over a set of competing models.

AIC is based on the relationship between the K-L information and the likelihood theory. It is given by

AIC = −2 log(likelihood) + 2k    (8)

where k = number of estimated parameters in the model including the error term and intercept. In the special case of least squares with normally distributed errors, log(likelihood) can be approximated as

log(likelihood) = −(n/2)[ln(SSE/n) + ln(2π) + 1]    (9)

where n is the number of observations. When k is large relative to n, the second-order bias corrected version AICc is preferred39 over AIC.

AICc = −2 log(likelihood) + 2k + 2k(k + 1)/(n − k − 1)    (10)

Given a set of potential models, the preferred model is the one with the minimum AICc value, where it is expected that the least information will be lost in approximating the data by the selected model. Individual AIC values are not good indicators as they are affected by sample size and constants. The better way is to compare the values against the minimum AIC in the model set. As a rule of thumb, if the difference between the minimum AICc (among all potential models) and that for a competing model is less than 10, one model cannot be preferred over the other.33

Bayesian information criterion (BIC) is also based on the likelihood function and is closely related to AICc but is based on the Bayes factor. When fitting models, it is possible to increase the likelihood by adding parameters, but doing so may result in over-fitting. BIC resolves this problem by introducing a penalty term for the number of parameters in the model. The penalty term is larger in BIC than in AIC and is given by40

BIC = −2 log(likelihood) + k ln(n)    (11)

The best model will be the one having the smallest criterion value.
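The sketch below (an illustration under the least squares, normal-error assumptions stated above; not code from the study) evaluates Eqs. 8–11 from the residual sum of squares and ranks competing models by their difference from the minimum AICc.

```python
import numpy as np

def information_criteria(sse, n, k):
    """AICc and BIC from the residual sum of squares (Eqs. 8-11).
    k counts all estimated parameters, including intercept and error term."""
    log_lik = -0.5 * n * (np.log(sse / n) + np.log(2 * np.pi) + 1)  # Eq. 9
    aic = -2 * log_lik + 2 * k                                       # Eq. 8
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)                       # Eq. 10
    bic = -2 * log_lik + k * np.log(n)                               # Eq. 11
    return aicc, bic

def delta_aicc(aicc_values):
    """Differences from the smallest AICc in a competing set; by the rule of
    thumb above, a difference below about 10 does not justify a preference."""
    aicc_values = np.asarray(aicc_values, dtype=float)
    return aicc_values - aicc_values.min()
```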
Results and Discussion

This article aims to present an approach for fitting data to the simplest possible model without missing any higher order term within the available degrees of freedom. It has been stated that "All models are wrong but some are useful" and that "model selection is best seen as a way of approximating, rather than identifying the full reality." We wish to identify a design that is efficient (as determined by the number of runs) and accurate.

The four datasets chosen here represent a wide range of variability of responses observed in biotech processes. Case study I represents a robust chromatography step, whereas Case study II is a chromatography step operated under high noise conditions (high dead volume of equipment compared to the scale of membrane used). Case studies III and IV concern a refolding step and hence are likely to exhibit more variance in the data. Case study III has yield as the process response, while Case study IV has purity as the process response.

For all analyses, the datasets were first checked for normal distribution. Model parameters and their combinations were chosen using stepwise regression. A more detailed description of stepwise fitting and its limitations can be found elsewhere.41 Inclusion of parameters in the model was based on the "Prob > F" test (P < 0.05), AICc, BIC, R², and R²Adj. Possible models and their construction approaches were tested over a number of trials using the backward elimination, P-value threshold, and forward methods available in the JMP stepwise regression fitting platform.

Table 2. Model Summary for Case Studies I–IV for the Three Traditional Designs Using Method 1

Case       Design   R2     R2Adj   Significant Model Terms (P < 0.05)   Max. Ŷ        Set Point A
Case I     FF0      0.90   0.86    A, B, C, AB, BC, AA, BB              88.9 ± 8.2    0.308
           CC0      0.91   0.82    A, B, AB, BC, AA, CC                 74.1 ± 10.3   0.477*
           BB0      0.96   0.93    A, B, AB, AA, CC                     88.6 ± 9.9    0.215*
Case II    FF0      0.84   0.79    A, B, C, AB, AC, BC, AA              71.0 ± 3.5    0.5
           CC0      0.69   0.65    A, AA                                67.9 ± 5.0    0.54
           BB0      0.68   0.62    A, AA                                67.7 ± 4.6    0.68
Case III   FF0      0.92   0.90    A, B, C, AB, AA, BB                  71.4 ± 2.1    −0.54
           CC0      0.91   0.89    A, C, AA                             69.3 ± 2.2    −0.49
           BB0      0.79   0.76    A, AA                                68.1 ± 2.5    −0.57
Case IV    FF0      0.92   0.90    A, C, AB, AA, BB                     71.4 ± 1.9    −0.59
           CC0      0.91   0.89    A, C, AA                             70.4 ± 2.7    −0.45
           BB0      0.79   0.75    A, AA                                68.1 ± 2.6    −0.58

*These set points do not agree with the FF design. Refer to Figure 1A for the contour profiler. The predictions are not the same, though the design points of CC and BB are subsets of FF.

The significance of each model was tested using the F-test. The residuals were checked for the presence of any systematic behavior. To compare the models, the prediction contour and prediction profiler were used at different set points of A, B, and C (as defined earlier for the different case studies).

Modeling with quadratic terms

The typical approach of model fitting (Method 1) for a response surface using a second-order quadratic equation for three factors is

y = b0 + b1x1 + b2x2 + b3x3 + b12x1x2 + b23x2x3 + b31x3x1 + b11x1² + b22x2² + b33x3² + e    (12)

After finding the significant b's, the model may take a reduced quadratic form such as

y = b0 + b1x1 + b2x2 + b12x1x2 + b23x2x3 + b11x1² + b22x2² + e    (13)

The model shown for Case I, FF0, is one such example (Table 2). Sometimes, it is preferred to maintain the hierarchy of terms used (that is, to retain the lower order term if the higher order term for the same factor is significant) in the model.
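As an illustration of the Method 1 fit, the sketch below builds the model matrix of Eq. 12 in coded units and estimates the coefficients by least squares, using a 12-run subset of the Case I responses from Table 1 (four corner points, the six face centers, and the two center points); this particular subset is chosen only for the example and is not one of the designs compared in the article.

```python
import numpy as np

def quadratic_model_matrix(A, B, C):
    """Columns of the full second-order model of Eq. 12 in coded units."""
    return np.column_stack([np.ones_like(A), A, B, C,
                            A * B, B * C, C * A,
                            A**2, B**2, C**2])

# 12-run subset of Table 1 (coded levels) with the Case I yields
levels = np.array([[-1, -1, -1], [-1, 1, 1], [1, -1, 1], [1, 1, -1],
                   [0, 0, 0], [0, 0, 0], [1, 0, 0], [-1, 0, 0],
                   [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]], float)
y = np.array([68.0, 14.2, 57.9, 75.7, 68.1, 66.0,
              47.9, 27.5, 57.9, 71.8, 62.8, 79.4])

X = quadratic_model_matrix(levels[:, 0], levels[:, 1], levels[:, 2])
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["b0", "A", "B", "C", "AB", "BC", "CA", "AA", "BB", "CC"], b_hat)))
```

In practice the insignificant coefficients would then be dropped (with the hierarchy consideration noted above) to reach a reduced form such as Eq. 13.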

Figure 1. (A) Case study I: Contour profiler FF0, CC0, BB0. Predictions of CC0 and BB0 are quite different from FF0. (B) Case
study III: Contour profiler FF 0 and CC 0. Prediction of CC 0 is quite different from FF 0. In other cases only one factor
was significant so no contour profiler was possible.

Table 3. Model Summary for Case Study I for the Top Two Models for Each Design Using Method 2

Design      R2     R2Adj   Significant Model Terms (P < 0.05)                        AICc    BIC     Max. Ŷ         Set Point A (B = 0, C = −0.5)
FF A        0.92   0.89    A, B, C, AA, AB, BB, BC, ABC                              209.9   209.1   90.1 ± 7.6     0.40
FF B*       0.98   0.97    A, B, C, AA, AB, BC, ABC, ACC, AABB, AACC                 185.8   179.1   86.3 ± 4.1     0.44
CC A        0.95   0.89    A, B, C, AA, AB, BC, CC, ABC                              166.1   118.2   84.0 ± 11.7    0.31
CC B*       0.99   0.98    A, B, C, AA, AB, BC, ABC, ACC, AABB                       171.7   91.5    85.5 ± 5.2     0.45
BB A        0.96   0.93    A, B, C, AA, AB, CC                                       125.5   94.0    88.6 ± 10      0.20
BB B*       0.97   0.94    A, B, C, AA, AB, AABB                                     122.2   90.7    84.7 ± 7.8     0.42
DO 16 A     0.97   0.93    A, C, B, AA, AB, CB, BB, ACB                              154.1   117.8   93.2 ± 10.1    0.42
DO 16 B*    0.99   0.98    A, C, B, AA, AB, CB, ACC, ACB, AABB                       155.4   97.9    88.8 ± 5.5     0.42
IO 14 A     0.99   0.99    A, B, C, AA, AB, AC, CC, ACC                              127.5   60.6    86.4 ± 2       0.35
IO 14 B*    0.99   0.98    A, B, C, AA, AB, AABB                                     105.2   81.6    86.6 ± 4.3     0.29
DO 14 A*    0.96   0.93    B, C, AA, AB, CB, ACC                                     127.5   103.9   86.2 ± 10.5    0.03
DO 14 B     0.95   0.89    B, C, AA, AB, CB, ABB, AABB                               149.9   110.7   91.2 ± 13.4    0.55

*Selected model. **Only for DO 14 is the prediction less good than that of the unselected model for the same set of data. Refer to Figures 2A,B for contour profile prediction.

This helps when the model is constructed in actual units (not coded form). We have tested different models without this restraint.

The results of Method 1 are summarized in Table 2. The FF models were assumed to be the best approximation to the true behavior as they have the maximum number of observed responses. CC and BB were compared to FF.

Case Study I. All models (FF0, CC0, and BB0) exhibited a desirable P value (<0.05), R², and R²Adj (>0.8). But the set point of A (at maximum Y) predicted by FF0 was quite different from those of CC0 and BB0. The contour profilers of CC0 and BB0 also show different profiles than FF0 (Figure 1A). Based on the predicted response, it can be observed that models CC0 and BB0 are not reliable as the predicted set points for maximum response differ for these models from the reference model, FF0.

Case Study II. The model FF0 exhibited a desirable P value (<0.05), R² (>0.8), and R²Adj. Although the other two models were significant (P value < 0.05), their R² values were quite low (0.62). The set point of A (pH) at maximum response predicted here was the same for FF0 and CC0 but not for BB0. The comparison of contour profiles was not possible as the models were missing key effects (B, C). Once again, it can be concluded that BB0 and CC0 do not offer an appropriate alternative to FF0.

Case Study III. A situation similar to that in Case study II was observed here as well. FF0 was quite different from CC0 and BB0. The contour profile is shown in Figure 1B.

Case Study IV. A good fit is obtained for FF0 and CC0 with R² > 0.9. But the set point predicted for maximum purity of A (−0.59) by FF0 is quite different from that of CC0 (−0.45). In contrast, a better prediction was obtained with BB0 (−0.58) but the fit was poor (R² < 0.8) and the significant effects of parameters B and C were not predicted.

Thus, it can be inferred that it is not possible to get an accurate prediction profile if the number of runs is reduced. It was also observed that there were only 5–7 significant model terms, so 16 runs of CC or 15 runs of BB should have been sufficient to determine those terms as we have sufficient degrees of freedom. Hence, in the next section, we explored a different approach.

Modeling including third-order terms

Next, we included terms of the form x1²x2², x1²x2, and so forth in the model (Method 2). These terms were not more than second order for the individual factors, and so terms such as x1³, x3³ and so forth were not included, as the factors were taken at three levels and this is not enough to estimate all the third-order terms. Higher order terms formed in this way were used to estimate the curvature of the response curve as they give more degrees of freedom to the model. These terms cannot be considered a true representation of the factor's effect but may be useful in explanation of the data and may aid in precise prediction of the response and in obtaining the optimal set point with a smaller number of observed responses.
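The sketch below illustrates this term construction: candidate columns up to second order in each individual factor (AAB, ABB, AABB, and so forth, but no pure cubes) are generated from the coded factors, and one common formulation of Lenth's (1989) pseudo standard error is used to rank candidate effects, in the spirit of the screening described above. The exact term list and the Lenth implementation are our assumptions for illustration, not the JMP screening platform used in the study.

```python
import numpy as np

def method2_terms(A, B, C):
    """Candidate columns for Method 2: main effects, interactions, quadratics,
    and mixed higher-order terms such as AAB, ABB, AABB -- but no pure cubes,
    since three factor levels cannot support them."""
    return {"A": A, "B": B, "C": C,
            "AB": A * B, "AC": A * C, "BC": B * C,
            "AA": A**2, "BB": B**2, "CC": C**2,
            "AAB": A**2 * B, "ABB": A * B**2, "AAC": A**2 * C,
            "ACC": A * C**2, "BBC": B**2 * C, "BCC": B * C**2,
            "ABC": A * B * C, "AABB": A**2 * B**2,
            "AACC": A**2 * C**2, "BBCC": B**2 * C**2}

def lenth_t_ratios(effects):
    """One common formulation of Lenth's (1989) pseudo standard error,
    used here only to rank candidate terms by relative importance."""
    effects = np.asarray(effects, dtype=float)
    s0 = 1.5 * np.median(np.abs(effects))
    pse = 1.5 * np.median(np.abs(effects)[np.abs(effects) <= 2.5 * s0])
    return effects / pse
```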

Table 4. Model Summary for Case Study II for the Top Two Models for Each Design Using Method 2

Design      R2     R2Adj   Significant Model Terms (P < 0.05)                                  AICc    BIC     Max. Ŷ        Set Point A (B = 0, C = −0.5)
FF A*       0.97   0.95    A, B, AA, AB, AC, BC, AAB, ABB, AAC, ABC                            158.3   153.5   75.2 ± 2.6    0.288
FF B        0.99   0.97    A, B, AA, AB, AC, BC, AAB, ABB, AAC, ABC, AABB, AACC                153.3   138.7   75.2 ± 1.9    0.308
CC A*       0.96   0.90    A, B, AA, AB, AC, BC, AAB, ABB, AAC                                 158.4   100.9   76.6 ± 6.9    0.277
CC B        0.99   0.97    A, B, AA, AB, AC, BC, AAB, ABB, AAC, ABC, AABB                      251.9   80.0    76.3 ± 4.2    0.246
BB A        0.99   0.98    A, B, AA, AB, BC, AAB, ABB, AAC                                     127.1   60.1    74.4 ± 2.3    0.354
DO 16 A*    0.96   0.91    A, B, AA, AC, AB, CB, AAC, AAB, ABB                                 159.4   101.9   77.9 ± 7.8    0.292
DO 16 B     0.99   0.98    A, B, AA, AC, AB, CB, AAC, ACC, AAB, ACB, ABB, AABB                 480.1   70.9    78.3 ± 4.2    0.3
IO 14 A     0.99   0.98    C, B, AA, AB, AC, BC, AAB, AAC, ABC                                 193.6   68.7    76.5 ± 4.2    0.246
IO 14 B*    0.98   0.94    A, B, AA, AB, CB, AAC, AAB, ABB                                     150.0   83.1    76.4 ± 5.98   0.338
DO 14 A     0.97   0.92    A, B, C, AA, AC, BC, AAB, ABC                                       155.0   88.1    77.9 ± 8      0.277
DO 14 B*    0.98   0.96    B, C, AA, AC, BC, AAB, AAC, ABC                                     145.1   78.2    78.2 ± 5.7    0.262

*Selected model.

Table 5. Model Summary for Case Study III for the Top Two Models for Each Design Using Method 2

Design      R2     R2Adj   Significant Model Terms (P < 0.05)                                  AICc    BIC     Max. Ŷ          Set Point A (B = 0, C = −1)
FF A*       0.97   0.95    B, A, BA, AA, AC, BBA, BAA, ACC                                     155.8   157.3   100.6 ± 1.5     0.21
FF B        0.99   0.97    A, C, B, AA, AC, AB, BB, AAC, ACC, AAB, ABB, AABB, AACC             167.7   151.3   97.9 ± 2.4      0.27
CC A*       0.98   0.97    A, B, AA, AB, AC, AAB, ABB, AABB                                    120.6   84.4    100.5 ± 2.2     0.22
CC B        0.97   0.95    A, B, AA, AB, AC, AAB, ABB                                          112.2   89.1    99.9 ± 2.4      0.20
BB A        0.97   0.94    A, B, AA, AB, AAB, ACC                                              104.7   86.3    99.7 ± 2.4      0.13
DO 16 A*    0.98   0.96    B, BA, CA, AA, BBA, CCA, BAA                                        105.4   82.4    100.7 ± 2.67    0.17
DO 16 B     0.97   0.95    B, A, BA, CA, AA, BBA, BAA                                          108.5   85.4    100.7 ± 2.92    0.17
IO 14 A*    0.97   0.94    A, B, AA, AB, AC, AAB, ABB                                          119.7   80.4    100.6 ± 3.1     0.19
IO 14 B     0.98   0.97    AA, AB, AC, CC, ACC                                                 85.1    70.9    103.4 ± 3       0.15
DO 14 A*    0.99   0.97    A, B, AA, AB, AC, AAB, ABB                                          109.9   70.6    101.5 ± 2.4     0.25
DO 14 B     0.99   0.99    A, B, AA, AB, AC, BC, AAB, ABB, ABC                                 181.7   56.7    101.46 ± 1.8    0.29

*Selected model.

This approach is illustrated in Figure 3 and is briefly described below.

A number of models were fitted with different possible terms with a significant P-value (<0.05) as discussed above. In FF designs, the degrees of freedom (DOF) are sufficient to determine the parameters for most combinations of terms. But in the case of 14–16 run designs this is not possible. So, different models with or without higher order terms were made using the available DOF. The screening platform built into the JMP software was utilized to select combinations of terms within the DOF. Further, in some of the cases, we could not estimate terms like ACC and ABB simultaneously. To avoid this situation and reduce the effort of checking all possible combinations, Lenth's t-ratio test (Lenth, 1989) was used to find out which factors were more important than others. So if BB was yielding a higher t-ratio value than CC, then ABB was selected over ACC. Likewise, AABB was preferred over AACC if required.

In this way, we obtained a set of competing models that is more likely to contain the best model. From them, the top two models were selected and are shown in Tables 3–6. The competing models were compared for each set of data and not across different datasets. As the designs were generated for the standard RSM model, including additional terms in the model increases the correlation amongst the coefficients (Figure 2) and their variance. This is an indicator that we cannot use the estimate of a parameter for measuring its independent behavior. Also, there will be competing models having similar P-value, R², R²Adj, and RMSE but different prediction profiles, thus making the selection of a model a complex task. To solve this problem, we also used the AICc and BIC criteria for model selection along with the common criteria.

For a set combination of parameters, inclusion/exclusion of terms for model building is done in an iterative way until changes in the value of the selection criterion become insignificant. This judgment depends on the dataset and is highly qualitative. For the dataset chosen in this work, addition of terms to the model in most cases decreased the BIC, RMSE, and PRESS RMSE (data not shown), and this was found to confuse setting the end point of the criterion value. Alternatively, AICc values increased or decreased with addition and removal of terms from the model for the different cases and so AICc can be used as the first major criterion for model selection. It can be easily scaled against the minimum AICc value from a competing set and thus use of this criterion is more quantitative. As stated before, as a rule of thumb, for two competing models where the ΔAICc value is more than 10, the model with the lower AICc can be selected (Criterion 1). For most of the models this is the case, but if this value is less than 10 then this criterion alone is not sufficient for selection of the model. Based on the analysis of the predicted responses at different set points and the contour profiler, the next selection criteria (BIC, PRESS, R², and R²Adj) were compared. It was observed that the difference between R² and R²Adj (Criterion 2) was the most consistent and logical to use. If there was no difference between R² and R²Adj then the model with fewer terms was preferred (Criterion 3) as it is always better to have a parsimonious model.
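One possible encoding of this selection sequence is sketched below; the dictionary keys and the reading of Criterion 2 as "prefer the smaller gap between R² and R²Adj" are our assumptions for illustration, not part of the original procedure.

```python
# Sketch of Criteria 1-4 applied to a list of competing models; each model is
# assumed to carry its AICc, R2, adjusted R2, and number of terms.
def select_model(models):
    """models: list of dicts with keys 'name', 'aicc', 'r2', 'r2_adj', 'n_terms'."""
    best_aicc = min(m["aicc"] for m in models)
    # Criterion 1: drop any model more than 10 AICc units above the minimum.
    short_list = [m for m in models if m["aicc"] - best_aicc <= 10]
    if len(short_list) == 1:
        return short_list[0]
    # Criterion 2: prefer the smallest gap between R2 and adjusted R2.
    gap = lambda m: m["r2"] - m["r2_adj"]
    short_list.sort(key=gap)
    if gap(short_list[0]) < gap(short_list[1]):
        return short_list[0]
    # Criterion 3: prefer the more parsimonious model (fewer terms).
    short_list.sort(key=lambda m: m["n_terms"])
    if short_list[0]["n_terms"] < short_list[1]["n_terms"]:
        return short_list[0]
    # Criterion 4: otherwise take the model with the higher R2.
    return max(short_list, key=lambda m: m["r2"])
```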

Table 6. Model Summary for Case Study IV for the Top Two Models for Each Design Using Method 2

Design      R2     R2Adj   Significant Model Terms (P < 0.05)                          AICc    BIC     Max. Ŷ        Set Point A (B = 0, C = −1)
FF A*       0.97   0.96    A, C, AA, AB, BB, ABB                                       118.3   122.0   70.1 ± 1.4    −0.327
FF B        0.99   0.98    A, C, AA, AC, AB, CB, BB, ACB, ABB, ACBB, AACB              120.7   114.2   70.6 ± 1.1    −0.481
CC A*       0.98   0.97    A, C, AA, AB, BB, ABB                                       83.1    68.7    69.9 ± 1.7    −0.308
CC B        0.98   0.97    A, C, AA, AB, BB, ACC                                       83.1    68.7    71.8 ± 1.6    −0.585
BB A        0.99   0.99    A, C, AA, AC, AB, CB, BB, ABB                               92.9    45.0    70.9 ± 1.2    −0.508
DO 16 A*    0.98   0.97    A, C, AA, AB, BB, ABB                                       83.5    69.1    68.6 ± 2.4    −0.308
DO 16 B     0.97   0.95    A, AA, CC, AB, BB, ACC, CBB                                 101.7   78.7    72.6 ± 3.1    −0.6
IO 14 A*    0.98   0.96    A, C, AA, AB, BB, ABB                                       85.0    61.4    69.5 ± 2.3    −0.246
IO 14 B     0.99   0.98    A, C, B, AA, AB, CB, BB, AAB, ABB                           173.6   48.6    69.4 ± 1.7    −0.215
DO 14 A*    0.99   0.99    A, C, AA, AB, BB, ABB                                       70.5    46.9    70.2 ± 1.5    −0.369
DO 14 B     0.99   0.99    A, C, B, AA, CC, AB, BB, AAB, ABB                           155.6   30.7    70 ± 1.1      −0.523

*Selected model.

Figure 2. Correlation map of standard CC, Case I CC 0, and CC B. There was an increase in the correlation of the main factor effects for CC B.

Figure 3. Flow diagram illustrating the approach that has been proposed for optimal selection of a model.

Figure 4. (A) Case study I: Selected model contour profiles for FFB, CC B, BB B, DO16 B, IO14 B, and DO 14 A designs. All model
predictions are close to reference model except DO 14 A. (B) Case study I: Unselected model contour profiles for FFA, CC
A, BB A, DO16 A, IO14 A, and DO 14 B designs. The profiles are not similar to that for the reference model.

Figure 5. (A) Case study II: Selected model contour profiles for FF A, CC A, BB A, DO16 A, IO14 B, and DO 14 B designs. All model predictions are close to the reference model except DO 14 A. (B) Case study II: Unselected model contour profiles for FF B, CC B, DO16 B, IO14 A, and DO 14 A designs. Contour profiles in Figure 5A offer closer prediction of the reference model than these models.

Figure 6. (A) Case study III: Selected model contour profiles for FF A, CC A, BB A, DO16 A, IO14 A, and DO 14 A designs. All model predictions are close to the reference model. (B) Case study III: Unselected model contour profiles for FF B, CC B, DO16 B, IO14 B, and DO 14 B designs. As seen in Table 5, the difference amongst the contour profiles is minimal.

Figure 7. (A) Case study IV: Selected model contour profiles for FF A, CC A, BB A, DO16 A, IO14 A, and DO 14 A designs. All model predictions are close to the reference model. (B) Case study IV: Unselected model contour profiles for FF B, CC B, DO16 B, IO14 B, and DO 14 B designs. Profiles are quite different from each other.

Finally, if the number of terms was the same, it is proposed that the model with the higher R² be selected (Criterion 4).

Among the forward, backward, and combined stepwise regression procedures available in JMP, the backward elimination procedure was found to be more suitable. As the main aim was inclusion of higher order terms, it is desirable to test the models with those terms included to begin with. It should be mentioned that if proper care is not taken in the inclusion of the correct combination of terms for the competing models as suggested above, the stepwise regression procedure for a small dataset can yield unexpected values such as R² = 1 and a large negative AICc.

Case Study I. The number of terms was higher for FF B but its AICc was lower (ΔAICc > 10) compared to FF A. Therefore, the reference model (FF B) for this case was easily selected using Criterion 1. Other models that were selected based on this criterion were IO14 B and DO14 A. ΔAICc for the CC and DO 16 models was less than 10 but the differences between R² and R²Adj were smaller, and so Criterion 2 was applied. In the case of BB, neither Criterion 1 nor 2 was applicable and so Criterion 3 was used and BB B was preferred as its R² was better than that of BB A (Criterion 4).

Case Study II. The selection of the reference model (FF A) was based on Criterion 3 as the number of terms was smaller. Criteria 1 and 2 were not applicable as ΔAICc < 10 and the difference between R² and R²Adj was the same for FF A and FF B. The rest of the models for this case study were selected based on Criterion 1 (ΔAICc > 10).

Case Study III. The selection of the reference model (FF A) and of DO 14 A was based on Criterion 1 (ΔAICc > 10). Criterion 2 was used for CC and Criterion 3 for the DO models, respectively. For IO 14, Criterion 3 was applicable as IO 14 A was a simpler model consisting of first-order terms compared to the second and higher order terms in IO 14 B.

Case Study IV. As discussed above, Criterion 1 was applicable for the DO and IO designs and Criterion 3 (smaller number of terms) for the reference model (FF A). For the CC design, none of the criteria was applicable. It was seen that all values were the same except for one extra term, ACC versus ABB. This example illustrates a situation where one combination of parameters can be preferred over another to develop a competing model set. As discussed before, Lenth's t-ratio can be used. In this case, the Lenth's t-ratio for BB is higher than that for CC, so ACC is not a preferred term over ABB for the competing model.

Comparison of different models

To check the accuracy of the above selected models, we compared the predicted values of the competing models at various set points of B and C and determined the set points of A, as A was the most important factor in all the case studies. The maximum response set point is shown in Tables 2–6. All were in excellent agreement with the selected model (except Case I DO 14). Further, to see the overall response curve, contour profiles were plotted between factors A-B, B-C, and C-A. All were better than the models created using Method 1 (Figures 1A,B). The contour profiles between factors A-B for the different case studies are shown in Figures 4–7. The contour profiles between factors B-C and C-A are not shown.

When the maximum response set point was compared with the reference model FF, it was seen that CC offered the best prediction for all the cases. Its contour profiles (Figures 1, 4–7) were also very similar to the reference model. DO16 and BB were the next best for Case study I but only DO 16 was close for Case studies II and III. The contour pattern was closer to the reference model for BB. This indicates that though BB may not be good at estimating exact set points, it can predict the pattern with more accuracy. Given that it requires 14–15 runs, it may be preferred over DO 16. Similarly, the IO 14 contour profiler pattern was better than that of DO 14 but not for estimating set points. The maximum response Y prediction was not significantly different for all the cases but as the number of runs decreased, the error variance increased. Other models with 12–18 runs were also examined and the results obtained were consistent with the above findings (data not shown). As expected, variance and unpredictability increase with a decrease in the number of runs.

One major drawback of the above method of model construction is that it leads to correlation among the estimated parameters (Figure 2) and thus the true nature of the model parameters cannot be known. To determine the parameter values without correlation, follow-up experiments can be carried out by augmenting the existing design. The new design points can be obtained by an optimal design algorithm (available in JMP) and only for the parameter terms (such as AAB or AABB) that were found to be significant in the final model.

Conclusions

Through this work, we have shown the application of computer generated designs for bioprocess applications and also compared them with standard and more commonly used designs. It can be concluded that for the applications examined in this article, IO designs (which include CC) are better than either DO or BB designs in making predictions, as the criteria are based on the least possible average variance of prediction for a given set of runs. BB designs are better than other DO designs if the numbers of runs are of the same order. The approach of fitting the model with higher order terms gave more accurate results. Further, use of the various criteria for competing model selection has been demonstrated. The statistical selection criterion AICc, along with R² and R²Adj, proved sufficient for the datasets that were analyzed in this study. It is also clear that model selection based only on model P-value and R² may not lead to accurate prediction.

Based on this dataset, model design, and fitting, we can say that it is possible to obtain precise prediction of set points and response with as few as half the number of runs than what is typically performed. The practical use of DOE may prove to be more valuable for prediction if the proposed approach is used.

Acknowledgment

The authors are thankful to the High Impact Research Proposal Grant from IIT Delhi that contributed towards this project. The authors would like to state that there is no conflict of interest.

Literature Cited

1. Montgomery DC. Design and Analysis of Experiments. New York: Wiley; 2008.
2. Box GEP, Hunter JS, Hunter WG. Statistics for Experimenters: Design, Innovation, and Discovery, Vol. 13. New York: Wiley; 2005.

3. Plackett RL, Burman JP. The design of optimum multifactorial experiments. Biometrika. 1946;33(4):305–325.
4. Persad A, Chopda V, Rathore A, Gomes J. Comparative performance of decoupled input–output linearizing controller and linear interpolation PID controller: enhancing biomass and ethanol production in Saccharomyces cerevisiae. Appl Biochem Biotechnol. 2013;169(4):1219–1240.
5. Box GEP, Meyer RD. An analysis for unreplicated fractional factorials. Technometrics. 1986;28(1):11–18.
6. Dejaegher B, Vander Heyden Y. The use of experimental design in separation science. Acta Chromatogr. 2009;21(2):161–201.
7. Boyle DM, Buckley JJ, Johnson GV, Rathore A, Gustafson ME. Use of the design-of-experiments approach for the development of a refolding technology for progenipoietin-1, a recombinant human cytokine fusion protein from Escherichia coli inclusion bodies. Biotechnol Appl Biochem. 2009;54(2):85–92.
8. van Hoek P, Harms J, Wang X, Rathore AS. Case study on definition of process design space for a microbial fermentation step. In: Rathore AS, Mhatre R, editors. Quality by Design for Biopharmaceuticals: Principles and Case Studies. New Jersey: Wiley-Interscience; 2009:85–109.
9. Booth KHV, Cox D. Some systematic supersaturated designs. Technometrics. 1962;4(4):489–495.
10. Box GEP, Wilson K. On the experimental attainment of optimum conditions. J R Stat Soc Series B Stat Methodol. 1951;13(1):1–45.
11. Myers RH, Montgomery DC, Anderson-Cook CM. Response Surface Methodology: Process and Product Optimization Using Designed Experiments, Vol. 705. New Jersey: Wiley; 2009.
12. Ferreira SLC, Bruns RE, da Silva EGP, dos Santos WNL, Quintella CM, David JM, de Andrade JB, Breitkreitz MC, Jardim ICSF, Neto BB. Statistical designs and response surface techniques for the optimization of chromatographic systems. J Chromatogr A. 2007;1158(1):2–14.
13. Box GEP, Behnken D. Some new three level designs for the study of quantitative variables. Technometrics. 1960;2(4):455–475.
14. Box GEP, Draper NR. A basis for the selection of a response surface design. J Am Stat Assoc. 1959;54(287):622–654.
15. Box GEP. The exploration and exploitation of response surfaces: some general considerations and examples. Biometrics. 1954;10(1):16–60.
16. Baş D, Boyacı IH. Modeling and optimization I: usability of response surface methodology. J Food Eng. 2007;78(3):836–845.
17. Rathore A, Sharma C, Persad AA. Use of computational fluid dynamics as a tool for establishing process design space for mixing in a bioreactor. Biotechnol Prog. 2012;28(2):382–391.
18. Bade PD, Kotu SP, Rathore AS. Optimization of a refolding step for a therapeutic fusion protein in the quality by design (QbD) paradigm. J Sep Sci. 2012;35:3160–3169.
19. Silvey SD. Optimal Design: An Introduction to the Theory for Parameter Estimation. London: Chapman and Hall; 1980.
20. Pukelsheim F. Optimal Design of Experiments, Vol. 50. Philadelphia: Society for Industrial and Applied Mathematics; 2006.
21. de Aguiar PF, Bourguignon B, Khots M, Massart D, Phan-Than-Luu R. D-optimal designs. Chemometr Intell Lab Syst. 1995;30(2):199–210.
22. Lee CMS. Constrained optimal designs. J Stat Plan Inference. 1988;18(3):377–389.
23. Hibbert DB. Experimental design in chromatography: a tutorial review. J Chromatogr B. 2012;910:2–13.
24. Mandenius CF, Brundin A. Bioprocess optimization using design-of-experiments methodology. Biotechnol Prog. 2008;24(6):1191–1203.
25. Kalil S, Maugeri F, Rodrigues M. Response surface analysis and simulation as a tool for bioprocess design and optimization. Process Biochem. 2000;35(6):539–550.
26. Janson JC, Hedman P. On the optimization of process chromatography of proteins. Biotechnol Prog. 2008;3(1):9–13.
27. Rathore AS, Mhatre R. Quality by Design for Biopharmaceuticals: Principles and Case Studies. New Jersey: Wiley-Interscience; 2011.
28. Harms J, Wang X, Kim T, Yang X, Rathore AS. Defining process design space for biotech products: case study of Pichia pastoris fermentation. Biotechnol Prog. 2008;24(3):655–662.
29. Rathore AS, Bhambure R, Krull IS. High-throughput tools and approaches for development of process chromatography steps. LC GC N Am. 2011;29:252.
30. Bhambure R, Rathore A. Chromatography process development in the quality by design paradigm I: establishing a high-throughput process development platform as a tool for estimating "characterization space" for an ion exchange chromatography step. Biotechnol Prog. 2013;29:403–414.
31. Muthukumar S, Rathore AS. High throughput process development (HTPD) platform for membrane chromatography. J Membr Sci. 2013;442:245–253.
32. Cook RD, Nachtsheim CJ. A comparison of algorithms for constructing exact D-optimal designs. Technometrics. 1980;22(3):315–324.
33. Meyer RK, Nachtsheim CJ. The coordinate-exchange algorithm for constructing exact optimal experimental designs. Technometrics. 1995;37(1):60–69.
34. Atkinson AC, Donev AN, Tobias RD. Optimum Experimental Designs, with SAS. New York: Oxford University Press; 2007.
35. Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Sociol Methods Res. 2004;33(2):261–304.
36. Montgomery DC, Borror CM, Stanley JD. Some cautions in the use of Plackett–Burman designs. Qual Eng. 1997;10(2):371–381.
37. Akaike H. Information theory and an extension of the maximum likelihood principle. Springer Series in Statistics. New York: Springer; 1992:610–624.
38. Hurvich CM, Tsai CL. Regression and time series model selection in small samples. Biometrika. 1989;76(2):297–307.
39. Sugiura N. Further analysis of the data by Akaike's information criterion and the finite corrections. Commun Stat Theory Methods. 1978;7(1):13–26.
40. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–464.
41. Neter J, Wasserman W, Kutner MH. Applied Linear Statistical Models, Vol. 4. Chicago: Irwin; 1996.

Manuscript received Apr. 23, 2013, and revision received Sept. 26, 2013.
