# Statistical and mechanistic bioprocess model?

#### Highlights

• What is a mechanistic bioprocess model?
• How do mechanistic bioprocess models differ from statistical (DoE) models?
• What are industrially relevant applications of mechanistic bioprocess models?
• Which computational environments can you use to develop and run mechanistic and statistical bioprocess models?

As a bioprocess professional, you frequently hear and read about mechanistic bioprocess models, and you keep wondering how relevant they are for your bioprocess development and manufacturing activities. This article covers:

• What is a mechanistic bioprocess model and how does it differ from a statistical (DoE) model?
• Applications of mechanistic models in bioprocessing
• Computational environments to develop and run mechanistic bioprocess models

#### Statistical (DoE) models and mechanistic bioprocess models?

Process development and process characterization are all about investigating the relationship between (critical) process parameters (CPPs) on the one hand and critical quality attributes (CQAs) and key performance indicators on the other. You do this to develop process understanding, which you build up during your process characterization studies and must demonstrate for stage 1 process validation. And, of course, you need it to run your future manufacturing process at optimal and robust conditions.

It is industrial best practice for process development to use a toolset of smart experimentation, bioprocess data analytics and mathematical modelling. There are two fundamental approaches to mathematical modelling: statistical (DoE-based) modelling and mechanistic modelling. Both are important and have their areas of application, so you need to select the right modelling approach for your specific process development and manufacturing challenge.

#### What is the fundamental difference between mechanistic and statistical (DoE) models?

But first, let us take a look at the fundamental differences between statistical and mechanistic modelling. In general, modelling is all about describing the relationship between process parameters and quality and performance attributes. Typical statistical (DoE-based) models describe these relationships in the following way:

Equation 1: Simple multilinear regression model

CQA = K1*CPP1 + K2*CPP2 + α

• CQA is a critical quality attribute
• CPP is a critical process parameter
• K1 and K2 are coefficients of the multilinear regression model
• α is the intercept

The model here is a multilinear regression (MLR) model. MLR models are the typical choice when you follow the DoE approach.

Note, the coefficients K1, K2 have no biological or technical meaning. Statistical models aim at finding a model that best describes the data.
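
As a minimal sketch of how the coefficients in Equation 1 are obtained, the following Python snippet fits K1, K2 and α by ordinary least squares with NumPy. The data are synthetic and purely illustrative: the coded factor levels, the "true" coefficients and the helper `predict_cqa` are assumptions made for this example, not values from a real process.

```python
import numpy as np

# Hypothetical DoE plan: two process parameters in coded units
# (full factorial corners plus two center points). Values are made up.
cpp1 = np.array([-1.0, -1.0, 1.0, 1.0, 0.0, 0.0])
cpp2 = np.array([-1.0, 1.0, -1.0, 1.0, 0.0, 0.0])

# Synthetic response generated from known coefficients so the fit can be checked.
true_k1, true_k2, true_alpha = 2.0, -0.5, 10.0
cqa = true_k1 * cpp1 + true_k2 * cpp2 + true_alpha

# Design matrix: one column per CPP plus a column of ones for the intercept.
design = np.column_stack([cpp1, cpp2, np.ones_like(cpp1)])

# Ordinary least squares estimate of K1, K2 and the intercept (Equation 1).
coeffs, _, _, _ = np.linalg.lstsq(design, cqa, rcond=None)
k1, k2, alpha = coeffs


def predict_cqa(c1, c2):
    """Predict the CQA for new parameter settings using the fitted model."""
    return k1 * c1 + k2 * c2 + alpha
```

The fitted numbers describe the data well, but nothing in them tells you why the CQA responds to the parameters the way it does.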

In contrast, mechanistic models use mathematical expressions that best describe the biological- or technical processes that take place. For example:

Equation 2: Differential equation describing microbial growth in batch cultures.

dX/dt = µ *X

• X is the cell mass concentration (g / L)
• t is the time (h)
• µ is the specific growth rate of the microorganisms (h⁻¹)

Equation 3: Monod equation, which relates the microbial growth rate in an aqueous environment to the concentration of a limiting nutrient.

µ = µmax*S/(Ks+S)

• µ is the specific growth rate of the microorganisms
• µmax is the maximum specific growth rate of the microorganisms
• S is the concentration of the limiting substrate for growth
• Ks is the “half-velocity constant”, i.e. the value of S at which µ/µmax = 0.5

Here, all expressions in the model have a biological meaning and can be cross-checked against the literature.
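
To illustrate how Equations 2 and 3 work together, here is a minimal Python sketch of a batch culture simulation. Note that the substrate balance dS/dt = -µ*X/Y_XS and the yield coefficient `y_xs` are not part of the equations above; they are added as an assumption so that the substrate is actually consumed. All parameter values are hypothetical, and the simple explicit Euler integration is chosen for readability rather than accuracy.

```python
def monod_mu(s, mu_max, ks):
    """Specific growth rate (1/h) from the Monod equation (Equation 3)."""
    return mu_max * s / (ks + s)


def simulate_batch(x0, s0, mu_max, ks, y_xs, t_end, dt=0.001):
    """Integrate dX/dt = mu * X (Equation 2) with explicit Euler steps.

    Assumption (not in the article): substrate is consumed according to
    dS/dt = -mu * X / y_xs, with y_xs in g biomass per g substrate.
    Returns the final biomass and substrate concentrations (g/L).
    """
    x, s = x0, s0
    for _ in range(int(round(t_end / dt))):
        mu = monod_mu(s, mu_max, ks)
        growth = mu * x * dt
        # Never convert more substrate than is still available.
        growth = min(growth, s * y_xs)
        x += growth
        s = max(s - growth / y_xs, 0.0)
    return x, s


# Hypothetical parameter set for a microbial batch culture.
x_final, s_final = simulate_batch(
    x0=0.1, s0=10.0, mu_max=0.5, ks=0.1, y_xs=0.5, t_end=24.0
)
```

Because every parameter has a physical meaning, a value such as `mu_max` can be compared against literature data or an independent shake flask experiment.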

To sum up, the fundamental difference between statistical and mechanistic models is the following:

Statistical models use mathematical expressions to best describe the data. They show coefficients without technical meaning. Mechanistic models use mathematical expressions that best describe the physical or biological process. Coefficients have a technical meaning and can even be checked (or taken) from literature (or independent measurements).

#### Mechanistic models have such nice properties (in theory). Why don’t we use them all the time in industrial practice?

In bioprocess development of biologics, one of the most important tasks is to describe the relationship between (critical) process parameters and (critical) quality and performance attributes. For biologics processes, you need to find out how your process parameters affect CQAs such as glycosylation, relative potency, impurities, aggregates and many more.

Imagine you investigate how cultivation temperature affects the glycosylation of your drug product. A change in temperature affects thousands of reactions inside the cell, and some of them change the glycosylation of your product. How can you approach this in industrial practice using mechanistic models? Unfortunately, in practice you cannot (yet): the system is too complex for our current understanding. Because we do not understand the system well enough, we have to use statistical models instead. This is why the DoE approach followed by statistical modelling is so widely applied in industry for investigating CPP/CQA relationships. That’s also why statistics plays such an important role in bioprocess characterization.

So, it is challenging to predict changes of CQAs using mechanistic models. But where can mechanistic models be used for bioprocess development and manufacturing?

#### Soft sensors for real-time bioprocess monitoring

Mechanistic models are the basis for industrial bioprocess soft sensors. Soft sensors are used to predict process variables (like biomass concentration, viable cell density, glucose, lactate concentration etc.) in real time using a mathematical model. And in most cases, this model is a mechanistic model. Soft sensors for microbial and cell culture processes are available. Soft sensors are used in industry to substitute (expensive) hardware sensors. You can read more on soft sensors in our article “What is a soft sensor?”.
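
As a rough sketch of the idea, the following Python function estimates biomass from a measured oxygen uptake rate (OUR) signal. This is only one simple, illustrative form of a soft sensor: the choice of OUR as the input, the constant yield coefficient `y_xo` (g biomass per g O2) and all numbers are assumptions made for this example, and the function does not represent any particular commercial implementation.

```python
def estimate_biomass(x0, our_samples, dt, y_xo):
    """Running biomass estimate (g/L) from oxygen uptake rate samples.

    our_samples: measured OUR values in g O2/(L h), one every dt hours.
    y_xo: assumed constant biomass yield on oxygen (g biomass / g O2).
    The consumed oxygen is integrated with the trapezoidal rule and
    converted to biomass via the yield coefficient.
    """
    estimates = [x0]
    x = x0
    for prev, curr in zip(our_samples, our_samples[1:]):
        x += y_xo * 0.5 * (prev + curr) * dt
        estimates.append(x)
    return estimates


# Hypothetical off-gas derived OUR readings taken every 0.5 h.
biomass_trace = estimate_biomass(
    x0=2.0, our_samples=[1.0, 1.0, 1.0], dt=0.5, y_xo=1.2
)
```

Each new measurement updates the estimate, so the biomass value is available in real time without a dedicated biomass probe.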

#### Simulation for Bioprocess Design

You can apply mechanistic models to the design and optimization of bioprocesses. Product formation, growth, viability and oxygen transfer rates can be simulated well using mechanistic models. You can simulate in silico how feeding profiles would affect oxygen demand, when you would run into oxygen limitation and which feeding profile maximizes productivity. You can also simulate how your process would perform in single-use equipment, where mass transfer limitations are tighter than in stainless steel. This approach complements DoE-based process design:

• Initial process design is conducted using simulations (mechanistic models)
• DoE methodology is used for refinement and investigation of CPP/CQA relationships
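
The kind of in silico check described above can be sketched in a few lines of Python. The example compares a constant and an exponential feed profile and reports when oxygen demand would exceed the maximum oxygen transfer rate. It rests on strong simplifying assumptions that are not from this article: fed substrate is consumed immediately, yields are constant, and all names and parameter values (`y_xs`, `y_xo`, `otr_max`, the feed functions) are hypothetical.

```python
import math


def time_to_oxygen_limitation(feed_rate, v0=1.0, s_feed=400.0, y_xs=0.5,
                              y_xo=1.0, otr_max=5.0, t_end=30.0, dt=0.01):
    """Return the first time (h) at which oxygen demand exceeds otr_max.

    feed_rate: function t -> volumetric feed rate (L/h).
    Assumption: fed substrate (s_feed, g/L) is converted to biomass
    immediately with yield y_xs, and oxygen demand follows from the
    biomass yield on oxygen y_xo. Returns None if no limitation occurs.
    """
    volume, t = v0, 0.0
    while t < t_end:
        f = feed_rate(t)
        growth = y_xs * f * s_feed        # g biomass formed per hour
        our = growth / (y_xo * volume)    # oxygen demand in g O2/(L h)
        if our > otr_max:
            return t
        volume += f * dt                  # the feed dilutes the broth
        t += dt
    return None


def constant_feed(t):
    return 0.01


def exponential_feed(t):
    return 0.01 * math.exp(0.2 * t)


t_limit_constant = time_to_oxygen_limitation(constant_feed)
t_limit_exponential = time_to_oxygen_limitation(exponential_feed)
```

With these made-up numbers the constant feed never reaches the transfer limit, while the exponential feed does so after roughly five hours. Lowering `otr_max` is a simple way to mimic equipment with tighter mass transfer limits.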

#### Process Control for Optimization

You can run the model in real time to control any variable the model estimates. This is used in industrial biotechnology. It also means that you can control towards an objective function, such as maximum titer or maximum space-time yield. For example, suppose you want to maximize the product concentration. The model estimates this concentration even though you are not actually measuring it. You can then, for example, adjust the feed rate of a limiting substrate as part of the optimization procedure to maximize product concentration.
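
A minimal sketch of controlling on an estimated variable: the Python code below sets the feed rate so that the culture grows at a chosen specific growth rate, using a biomass estimate in place of a measurement. It covers only the simplest case of holding a setpoint, not a full optimization of an objective function. The substrate mass balance, the neglect of maintenance and all names and values are assumptions for illustration.

```python
import math


def feed_rate_for_growth_setpoint(mu_set, x_est, volume, y_xs, s_feed):
    """Feed rate (L/h) that supplies the substrate needed to grow at mu_set.

    Substrate demand mu_set * x_est * volume / y_xs (g/h) is matched by
    feed_rate * s_feed. Maintenance demand is neglected (assumption).
    """
    return mu_set * x_est * volume / (y_xs * s_feed)


def run_closed_loop(mu_set=0.1, x0=10.0, v0=1.0, y_xs=0.5, s_feed=400.0,
                    t_end=10.0, dt=0.01):
    """Simulate the controller acting on a model-estimated biomass value."""
    biomass = x0 * v0    # total biomass in the reactor (g)
    volume = v0
    for _ in range(int(round(t_end / dt))):
        x_est = biomass / volume  # stands in for the real-time model estimate
        feed = feed_rate_for_growth_setpoint(mu_set, x_est, volume, y_xs, s_feed)
        biomass += y_xs * feed * s_feed * dt  # fed substrate becomes biomass
        volume += feed * dt
    return biomass, volume


final_biomass, final_volume = run_closed_loop()
# Ideal exponential growth at the setpoint, for comparison.
expected_biomass = 10.0 * 1.0 * math.exp(0.1 * 10.0)
```

In this idealized loop the total biomass follows exponential growth at the setpoint rate. In practice the estimate would come from the soft sensor running on live process data.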

#### Computational environments for bioprocesses to develop and run mathematical models?

Figure 2: inCyght® environment for bioprocess data management and analytics. Data from bioprocess devices, external databases (LIMS/ELNs, historians) and semi-structured spreadsheets is aligned in the inCyght® database. Easy-to-use graphical data analytics tools and powerful Python- and R-based computational modelling environments let you perform complex (real-time) operations on your bioprocess data.

To run bioprocess models in real time, you need a real-time capable computing environment. Using inCyght® bioprocess software, bioprocess laboratories and manufacturers develop cutting-edge mechanistic and statistical models and predictive control algorithms.

The inCyght® real-time environment follows a client/server architecture. inCyght® runs on a central server for data management and real-time computation, so data management, analytics and real-time computing can draw on all historical and real-time data. Connections to lab equipment are established via OPC and ODBC interfaces. Scientists access the system through a web browser to monitor processes and to develop new models in the web-based R and Python IDE. Operators access it remotely or in the lab for bioprocess monitoring and real-time analytics.

In case you are interested in realizing real-time computational modelling environments for bioprocesses: Take a look at inCyght® bioprocess software or directly contact us at contact@exputec.com.