Bioprocess Design of Experiments (DoE)

In this article we delve into bioprocess Design of Experiments, or “DoE”, a core technology for organizing bioprocess development. Arguably, DoE is one of the most valuable tools we have for process development and for manufacturing problem solving. DoE has shaped the way we think about experiments as scientists and engineers. In bioprocess development, we always have limited resources to investigate, explore or optimize bioprocesses, and DoE is one of our primary tools for using those resources efficiently. It brings structure and efficiency to process development, which saves companies a lot of money. Interestingly, DoE has also brought significant organizational change to the companies that adopted it.

Read here about how biotech organizations adopt DoE in various stages of the bioprocess lifecycle, and about how DoE triggers organizational change and a rethinking of our bioprocess development practices. But adopting DoE is not always straightforward: be aware of common pitfalls and follow DoE best practices for bioprocess development.

What is design of experiments (DoE)?

Bioprocess development is all about investigating, characterizing and optimizing bioprocesses. Data analytics covers all development stages and unit operations. We need to find out how process parameters impact quality and performance, and we want to find optimal operating points. But how can we do this most efficiently? Random, unstructured experimentation is not an efficient way to investigate and optimize bioprocesses. We need a structured and organized methodology based on sound statistics. The solution is Design of Experiments, or DoE. Based on a predefined objective, DoE uses a statistical algorithm to design the most efficient set of experiments. The underlying statistics have been well understood for years and are rock solid. Using DoE to optimize, investigate and characterize a bioprocess is scientifically sound and the method of choice for regulators in biopharma. Some key terminology you need to know:

Factors. The parameters you want to investigate, for example temperature, pH, flow rates or media composition. In biopharmaceutical process development, factors are often potential critical process parameters (pCPPs).

Responses. The process outcomes you are interested in, for example product amount, glycosylation pattern, quality attributes or time-space yield. In biopharmaceutical process development, responses are often critical quality attributes (CQAs).

Today, we use DoE in almost every stage of the bioprocess lifecycle.

Bioprocess Design of Experiments (DoE) for Early Bioprocess Development

Biotech companies often ask themselves whether to adopt DoE early in process development or only in later stages. Our experience is that your bioprocess development will benefit from DoE from early on. Choose screening designs that let you investigate many parameters. Make sure to adopt best practices for bioprocess DoEs and read more here; otherwise experimental plans can fail and you end up in resource-intensive iteration cycles.

Bioprocess Design of Experiments (DoE) for Process Optimization

Clearly, optimization is the specialty of DoE. Your objective is clear: optimization. You select your factors and ranges, go for response surface modelling and find your process optima. Simple, right? Be careful: there are pitfalls in DoE for process optimization. How do you prevent and handle edge-of-failure events in DoEs? Which is the most efficient design for your purpose? Do you properly consider the variance in your responses? Make sure to adopt best practices for bioprocess DoEs and read more here; otherwise experimental plans can fail and you end up in resource-intensive iteration cycles.

Bioprocess Design of Experiments (DoE) for Process Characterization Studies

Arguably the most important application of DoE, and arguably the most challenging one: process characterization, part of stage 1 process validation. It is a must-have exercise to bring your biopharmaceutical product to the market. You run DoEs on almost every unit operation in order to calculate normal operating ranges (NORs) and proven acceptable ranges (PARs). Your NORs and PARs go directly into the batch records for your future manufacturing. The way you run your process characterization studies and calculate NORs and PARs has a direct impact on the number of out-of-specification events and deviations in future commercial manufacturing. High stakes!

Make sure you only start with good, well-qualified scale-down models. Read more about scale-down model qualification here.

You do DoEs not for optimization; you do DoEs to characterize your bioprocesses. Make sure you use appropriate designs. Read more about best practices for bioprocess DoEs here.

You want to prove that a parameter has no effect, and you want to evaluate its criticality. Therefore, your data analysis procedure for these DoEs is different. Read more about criticality analytics in a recent article by Exputec and Boehringer Ingelheim here.

Design of Experiments (DoE) creates structured experimental data – and paves the way for digital twins

Digital twins are “digital replica of physical assets (physical twin), processes and systems that can be used for various purposes.[1]” Recently, digital twins have also found their way into bioprocessing. The foundation for creating digital twins of your bioprocesses is good experimental data (from process development) and a handful of manufacturing runs. Read more about a case study on a biopharma digital twin that was used for process validation here.

Bioprocess Design of Experiments – Checklist

DoE is easy, right? You choose your objective. You select your factors and ranges and go for a good design. Simple, right? Take care! There are pitfalls. How do you prevent and handle edge-of-failure events in DoEs? Which is the most efficient design for your purpose? Do you properly consider the variance in your responses? Make sure to adopt best practices for bioprocess DoEs and read more here. To help you, here is our checklist:

What do you want to learn?

  • Do we want to detect only main effects (if they are present)?
  • Do we want to know also interaction and quadratic effects?
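  • Do we want to obtain a design that has the minimum (generalized) variance in our effect estimates? (so-called D-optimality)
  • Do we want to obtain a design that has minimal aliasing of interesting model terms with other potentially occurring terms? (so-called alias optimality; not to be confused with the classical A-optimality criterion, which minimizes the average variance of the estimates)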
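
To make the optimality criteria in the list above more tangible, here is a minimal sketch that compares two candidate designs. It assumes coded two-level factors (-1/+1) and a model with intercept and main effects only; the function names are illustrative and not taken from any DoE package. A larger determinant of X'X corresponds to a better D-criterion, and a smaller trace of the inverse of X'X corresponds to a lower average variance of the estimates (the classical A-criterion).

```python
from itertools import product

def information_matrix(runs):
    """X'X for a model with intercept and main effects (coded -1/+1 factors)."""
    X = [[1.0, *run] for run in runs]
    p = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]

def det_and_inverse(M):
    """Gauss-Jordan elimination with partial pivoting; returns (determinant, inverse)."""
    n = len(M)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    det = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[piv][c]) < 1e-12:
            raise ValueError("singular information matrix: model is not estimable")
        if piv != c:
            aug[c], aug[piv] = aug[piv], aug[c]
            det = -det                      # a row swap flips the sign of the determinant
        d = aug[c][c]
        det *= d
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return det, [row[n:] for row in aug]

def design_criteria(runs):
    """D-criterion (det of X'X, larger is better) and A-criterion (trace of inverse, smaller is better)."""
    det, inv = det_and_inverse(information_matrix(runs))
    return {"det": det, "trace_inv": sum(inv[i][i] for i in range(len(inv)))}

full = list(product([-1, 1], repeat=3))                # 2^3 full factorial, 8 runs
half = [r for r in full if r[0] * r[1] * r[2] == 1]    # half fraction, 4 runs

print("full factorial:", design_criteria(full))
print("half fraction: ", design_criteria(half))
```

In this toy comparison the full factorial scores better on both criteria, simply because it contains twice as many runs; the more interesting use is comparing candidate designs with the same number of experiments.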

What is the objective of your study?

  • Understand effects that may impact CQAs to an extent that is harmful to the patient (mitigate critical effects) and establish NORs.
  • Process development: optimization of a response (mathematically, reduce the variance of all effect estimates, “D-optimality”).

What are the resources that I can attribute to the study?

  • How many experiments can you perform?
  • Timeframe?
  • What is your failed experiments rate?
  • Do you think you will have to repeat experiments?

Which responses do you want to investigate?

  • Which responses do you want to study? (usually known)
  • What is the expected variance on the responses?

Do you know everything you need about the process parameters you want to investigate?

  • Which process parameters do you want to study?
  • Did you perform a risk assessment to identify them?
  • Can you control all parameters you want to investigate independently?
  • Do you encounter hard to change factors?

Ranges of process parameters?

  • How did you define the proposed process parameter ranges for investigation?
  • Did you consider investigated range in your risk assessment?
  • Do you know about an “edge of failure” at certain process parameter conditions?


  • Can you preclude certain effects (interaction or quadratic)?
  • Is prior data available from development or similar/platform process?

Select Design

Depending on the application, different designs might be chosen. Designs can be grouped by the following general engineering problems:

  • Screening: Understand the process and rank studied factors from important to unimportant factors. In terms of process validation this can be applied to criticality assessment and classification of process parameters into critical and non-critical process parameters
  • Modelling: Obtain a highly accurate prediction model (model with lowest prediction error). Usually those designs require more experiments than screening designs.
  • Optimization: Identify optimal conditions to run the process

Screening designs are usually standard designs (full factorial, fractional factorial, Box-Behnken, Definitive Screening Designs). Those designs have limitations and are sometimes not optimal for a given number of experiments. In such cases, optimal designs (e.g. D-optimal) can be chosen instead.
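
As a minimal sketch of how two of the standard designs named above can be generated, the following uses only the Python standard library. The factor names and ranges (temperature, pH, feed_rate) are purely illustrative assumptions, and the half fraction uses the common generator "last factor = product of all other factors".

```python
from itertools import product

def full_factorial(factors):
    """All combinations of the given levels. factors: dict name -> (low, high)."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in product(*(factors[n] for n in names))]

def half_fraction(k):
    """2^(k-1) fractional factorial in coded units (-1/+1).

    The last factor is set to the product of the other k-1 factors,
    so it is deliberately aliased with their highest-order interaction.
    """
    runs = []
    for base in product([-1, 1], repeat=k - 1):
        last = 1
        for v in base:
            last *= v
        runs.append(base + (last,))
    return runs

def decode(coded, low, high):
    """Translate a coded level (-1..+1) back into engineering units."""
    center, half_range = (high + low) / 2, (high - low) / 2
    return center + coded * half_range

# Hypothetical factors and ranges, for illustration only.
factors = {"temperature": (30.0, 37.0), "pH": (6.8, 7.2), "feed_rate": (1.0, 2.0)}

for run in full_factorial(factors):
    print(run)                       # 8 runs for 3 two-level factors

for coded in half_fraction(4):
    print(coded)                     # 8 runs instead of 16 for 4 factors
```

The saving in runs from the half fraction is paid for with aliasing, which is why a design should be evaluated before it is executed.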

Design Evaluation

To evaluate the quality of a design, check the following:

  • Confounding, by correlation and alias analysis
  • Utility
  • A priori power
  • Number of experiments

Check confounding as follows:

  • Check the correlation of factors with their interaction and quadratic terms. Use the correlation matrix for that analysis.
  • Check the alias structure of the model. Effects that cannot be modelled but may nevertheless exist (e.g. two-factor interactions, quartic effects or even three-factor interactions) might bias the estimates of the terms that can be modelled (e.g. main effects). This can be investigated using the alias matrix. Ideally, the alias matrix has only “0”s as entries.
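
The alias check described above can be sketched in a few lines. This is an illustration under stated assumptions, not a definitive implementation: X1 holds the terms in the model (intercept and main effects), X2 holds terms that are left out but might exist (here the two-factor interactions), and the alias matrix is computed with the usual formula A = (X1'X1)^-1 X1'X2. The example design is a 2^(3-1) half fraction with C = AB, chosen because its aliasing is easy to see.

```python
from itertools import product

def cross(A, B):
    """Return A'B for two matrices stored as lists of rows."""
    return [[sum(a[i] * b[j] for a, b in zip(A, B)) for j in range(len(B[0]))]
            for i in range(len(A[0]))]

def solve(M, R):
    """Solve M Z = R by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in M[i]] + [float(v) for v in R[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[piv][c]) < 1e-12:
            raise ValueError("modelled terms are fully confounded with each other")
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def alias_matrix(X1, X2):
    """Rows: modelled terms. Columns: unmodelled terms. Entry: bias per unit of unmodelled effect."""
    return solve(cross(X1, X1), cross(X1, X2))

def correlation(u, v):
    """Pearson correlation between two design columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5

# Half fraction of a 2^3 design: factor C is generated as A*B.
runs = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]
X1 = [[1, a, b, c] for a, b, c in runs]             # intercept, A, B, C
X2 = [[a * b, a * c, b * c] for a, b, c in runs]    # AB, AC, BC (not in the model)

alias = alias_matrix(X1, X2)
for term, row in zip(["intercept", "A", "B", "C"], alias):
    print(term, row)

corr_C_AB = correlation([r[2] for r in runs], [r[0] * r[1] for r in runs])
print("correlation of C with AB:", corr_C_AB)
```

Non-zero entries flag trouble: in this design each main effect is completely aliased with one two-factor interaction, so a main-effect estimate cannot be separated from that interaction without additional runs.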

Conduct Experiments

Consider the following issues when conducting the experiments:

  • Randomization
  • Record the actual factor values, not only the design values
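
Both points can be supported by a simple run sheet. The sketch below is one possible way to do it, with hypothetical field names (run_order, design_id, setpoint, actual): the run order is shuffled with a fixed seed so that the plan is reproducible, and each run carries an empty slot for the values that were actually achieved.

```python
import random

def randomized_run_sheet(design, seed):
    """Return the design runs in random execution order.

    design: list of dicts (factor name -> planned setpoint).
    Each entry keeps the original design row number and an empty
    'actual' dict to be filled in with measured factor values.
    """
    rng = random.Random(seed)            # fixed seed makes the plan reproducible
    order = list(range(len(design)))
    rng.shuffle(order)
    return [
        {"run_order": position + 1,
         "design_id": index + 1,
         "setpoint": dict(design[index]),
         "actual": {}}
        for position, index in enumerate(order)
    ]

# Hypothetical 2x2 design, for illustration only.
design = [
    {"temperature": 30.0, "pH": 6.8},
    {"temperature": 30.0, "pH": 7.2},
    {"temperature": 37.0, "pH": 6.8},
    {"temperature": 37.0, "pH": 7.2},
]

sheet = randomized_run_sheet(design, seed=2024)
sheet[0]["actual"] = {"temperature": 30.2, "pH": 6.79}   # example of a recorded value
for row in sheet:
    print(row)
```

Analyzing the recorded actual values instead of the setpoints keeps the model honest when a run drifts away from its planned conditions.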

DoE Analysis

Analysis of the conducted experiments is required to identify critical and non-critical process parameters.

Analysis of statistically impacting factors

Variable selection is performed to identify the variables that significantly impact the response (i.e. those factors for which the null hypothesis that their effect equals 0 can be rejected). Usually, this is done by stepwise variable selection using forward selection and backward elimination, or by all-subset selection techniques, in which models of all possible variable combinations are analyzed. The criteria for selecting models need to take the tradeoff between model fit (usually R²) and model complexity into account. After a statistically “optimal” model has been identified, its practical significance must be challenged by process experts. Variables contained in a statistically and practically relevant model can then be classified into critical and non-critical parameters using their effect size (e.g. >20% of the overall tolerable effect).
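
The following sketch illustrates all-subset selection followed by an effect-size classification. Several things here are assumptions made for illustration: the data are synthetic, the factor names A, B and C are placeholders, the Bayesian information criterion (BIC) stands in for the fit-versus-complexity tradeoff, the effect size is taken as twice the coded coefficient (the change over the full investigated range), and the tolerable effect of 4.0 response units is a made-up number.

```python
from itertools import combinations, product
from math import log

def ols(X, y):
    """Least squares via the normal equations (adequate for small, well-conditioned designs)."""
    p = len(X[0])
    aug = [[sum(r[i] * r[j] for r in X) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    beta = [row[p] for row in aug]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    return beta, rss

def best_subset(columns, y):
    """Fit every subset of candidate terms and keep the model with the lowest BIC."""
    n, names, best = len(y), list(columns), None
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            X = [[1.0] + [columns[t][i] for t in subset] for i in range(n)]
            beta, rss = ols(X, y)
            bic = n * log(max(rss, 1e-12) / n) + len(beta) * log(n)
            if best is None or bic < best["bic"]:
                best = {"terms": subset, "coef": dict(zip(subset, beta[1:])), "bic": bic}
    return best

def classify(coef, tolerable_effect, fraction=0.20):
    """Critical if the effect over the full coded range exceeds a fraction of the tolerable effect."""
    return {t: "critical" if abs(2 * b) > fraction * tolerable_effect else "non-critical"
            for t, b in coef.items()}

# Synthetic example: A has a large effect, C a small one, B none.
runs = list(product([-1, 1], repeat=3))
y = [10 + 2.0 * a + 0.2 * c - 0.1 * a * b * c for a, b, c in runs]   # last term mimics noise
columns = {name: [r[i] for r in runs] for i, name in enumerate("ABC")}

best = best_subset(columns, y)
print(best["terms"], best["coef"])
print(classify(best["coef"], tolerable_effect=4.0))
```

Exhaustive search is only feasible for a handful of candidate terms, since the number of models doubles with every term added. Whatever the criterion selects, the resulting model still needs the expert review described above.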

Analysis of potentially overlooked factors

Retrospective power analysis can be conducted to identify whether a critical effect may have been overlooked. If low power values are obtained for a parameter (i.e. a critical effect of this parameter is likely to have been overlooked), additional experiments or an adaptation can be suggested.
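
A rough power calculation can be sketched as follows. This is a simplification under stated assumptions: it uses a normal approximation instead of the exact t-based calculation (so it is slightly optimistic for small studies), it assumes an orthogonal two-level design in coded units where the standard error of a coefficient is sigma / sqrt(n), and all function and parameter names are illustrative. For a retrospective analysis, sigma would be the residual standard deviation of the fitted model and effect the smallest coefficient considered critical.

```python
from statistics import NormalDist

def approx_power(effect, sigma, n_runs, alpha=0.05):
    """Approximate probability of detecting a coded coefficient of size `effect`."""
    std_error = sigma / n_runs ** 0.5            # SE of a coefficient in an orthogonal +/-1 design
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shift = abs(effect) / std_error
    # Two-sided test: probability that the estimate falls in either rejection region.
    return NormalDist().cdf(shift - z_crit) + NormalDist().cdf(-shift - z_crit)

def runs_for_power(effect, sigma, target=0.8, alpha=0.05, max_runs=512):
    """Smallest number of runs whose approximate power reaches the target, or None."""
    for n in range(2, max_runs + 1):
        if approx_power(effect, sigma, n, alpha) >= target:
            return n
    return None

# Hypothetical numbers: critical coefficient of 1.0 response units, noise SD of 1.0.
for n in (4, 8, 16):
    print(n, "runs -> power", round(approx_power(1.0, 1.0, n), 3))
print("runs needed for 80% power:", runs_for_power(1.0, 1.0))
```

The same function serves the a priori check during design evaluation: if the planned number of runs gives low power for a critical effect, the design is too small for its purpose.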