Bioprocess data creates value for biotech companies. No doubt. Accelerated process development. Investigation and troubleshooting. Operational excellence. Assessment of optimization potential. Just to name a few use cases. Industry 4.0, the digitalization of the processing environment, new sensor technologies and data integrity requirements are disrupting how process data is managed and analyzed. As a result, the requirements for data management and bioprocess data analytics are changing dramatically. Today, data sources are very complex and multidimensional: spectra from spectroscopic sensors such as NIR and Raman, images from flow cytometry and in-line microscopes, mass spectrometry chromatograms, time series sensor data, quality data and software sensors, from both upstream and downstream, just to name a few. New solutions capable of managing and analyzing all process-relevant data types are necessary.
Process data is now multidimensional and of multiple origins
Industry 4.0 and process digitalization gave birth to new process analytical technologies. Processes are better monitored than ever before. The technologies for mining chemical, physical and biological data from manufacturing are cheaper and more accurate than ever before. Large quantities of process data are collected on a routine basis. But data size is not the issue. Data sources are now multidimensional and need to be aligned holistically and contextualized before they can support value-adding bioprocess data analytics. This is the issue. For integrated bioprocess data analytics, we need to wrangle all relevant data sources. The images below give a brief overview of data types you encounter in bioprocessing.
The new data sources and novel sensor technologies require new ways to analyze data. So far, there has been no practical way to analyze spectral data, sensor data and quality data simultaneously. Data is spread across multiple database systems and needs to be combined and restructured for analysis. This is data wrangling.
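To make the idea of combining data from several systems concrete, here is a minimal sketch of wrangling in plain Python. The three sources, their field names (`mean_pH`, `titer_g_per_L`, `scale_L`) and the batch IDs are hypothetical and only serve as illustration; real historian, LIMS and MES exports will look different.

```python
from collections import defaultdict

# Hypothetical exports from three separate systems, each keyed by batch ID.
historian = [
    {"batch": "B001", "mean_pH": 7.02},
    {"batch": "B002", "mean_pH": 6.95},
]
lims = [
    {"batch": "B001", "titer_g_per_L": 3.1},
    {"batch": "B002", "titer_g_per_L": 2.8},
]
mes = [
    {"batch": "B001", "scale_L": 10},
    {"batch": "B003", "scale_L": 200},
]


def wrangle(*sources):
    """Merge records from all sources into one row per batch ID."""
    combined = defaultdict(dict)
    for source in sources:
        for record in source:
            # Copy every field except the key itself into the batch's row.
            fields = {k: v for k, v in record.items() if k != "batch"}
            combined[record["batch"]].update(fields)
    return dict(combined)


table = wrangle(historian, lims, mes)
for batch_id in sorted(table):
    print(batch_id, table[batch_id])
```

Batches that are missing from one source simply end up with fewer fields, which makes gaps in the combined data set visible instead of hiding them.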
The most common way to wrangle data today is in spreadsheets. Data is imported from various historians, time series databases, LIMS or MES systems. Then an engineer or process data analyst sits for weeks wrangling the data. The result is a nice spreadsheet. A spreadsheet that nobody else can reuse. No data integrity is given. No traceability of the process is provided. Significant data sets, such as 2D and 3D data, are still left out. Today, engineers and process analysts spend most of their time wrangling data in spreadsheets. For more complex data sources, using spreadsheets is not even possible.
“Era of Industry 4.0. Analytics of process data using spreadsheets is not feasible anymore”
But how do we deal with the new wave of complex data sources? How do we wrangle spectral, time series, image and document information? How do we mine information and bring this data together in a combined form for dashboarding, analytics and reporting?
Analyze spectral, image and time series process data in one step
The solution is integrated bioprocess data analytics. Integrated bioprocess data analytics takes data from time series databases, spectral databases, MES systems and LIMS systems. The data is wrangled so that it is available for bioprocess data analytics. Unique to integrated bioprocess data analytics is the capability to mine features (so-called Process Performance Indicators, or PPIs) from the data sources. For example, time series data is condensed into single, meaningful PPI values such as the mean, median and standard deviation.
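As an illustration of condensing a time series into PPIs, the following minimal sketch uses only the Python standard library. The function name `time_series_ppis` and the pH values are made up for this example; it is not the feature mining engine of any particular product.

```python
import statistics


def time_series_ppis(values):
    """Condense one sensor trace into a few single-value PPIs."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(values),
    }


# Hypothetical pH readings from one batch.
ph_trace = [7.0, 7.1, 6.9, 7.0, 7.2, 6.8]
ppis = time_series_ppis(ph_trace)
print(ppis)
```

Each batch then contributes one row of PPIs, which is much easier to compare across batches than the raw traces.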
New way for bioprocess data analytics – integrated data analytics
We do the same for document and image information. Algorithms mine PPIs from images. We extract features such as roughness, compactness and saturation. The same happens for spectral or chromatography data, from which not only peak heights but also 1st and 2nd derivatives and tailing features are extracted. Algorithms mine PPIs from spectral data. This happens in a feature mining engine dedicated to mining PPIs from complex datasets. As a result, you have process performance indicators from your datasets.
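A minimal sketch of mining PPIs from a single spectrum could look as follows. It assumes a simple finite-difference scheme for the derivatives and a toy five-point spectrum; the function names and the chosen features are illustrative only, and tailing features are not covered here.

```python
def finite_difference(y, x):
    """First derivative: central differences inside, one-sided at the ends."""
    n = len(y)
    derivative = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        derivative.append((y[hi] - y[lo]) / (x[hi] - x[lo]))
    return derivative


def spectral_ppis(wavenumbers, intensities):
    """Extract peak and derivative features from one spectrum."""
    first = finite_difference(intensities, wavenumbers)
    # Second derivative approximated by differentiating the first derivative.
    second = finite_difference(first, wavenumbers)
    peak_index = max(range(len(intensities)), key=intensities.__getitem__)
    return {
        "peak_position": wavenumbers[peak_index],
        "peak_height": intensities[peak_index],
        "max_first_derivative": max(first),
        "min_second_derivative": min(second),
    }


# Toy spectrum with a single peak (hypothetical values).
wavenumbers = [1000.0, 1010.0, 1020.0, 1030.0, 1040.0]
intensities = [0.0, 1.0, 4.0, 1.0, 0.0]
features = spectral_ppis(wavenumbers, intensities)
print(features)
```

Real spectra are noisy, so in practice the derivatives are usually computed after smoothing; this sketch skips that step to stay short.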
Based on process performance indicators, bioprocess data analytics is easy. You can dashboard your PPIs to perform technology transfers, scale-up, process characterization and continued process verification. You can provide visual analytics for engineers and scientists or let data scientists access this information to run predictive and machine learning algorithms.
inCyght software for integrated bioprocess data analytics
inCyght software is the first software platform dedicated to integrated process analytics based on holistic PPI analysis. Wrangle and analyze all process-relevant data sources. The unique feature mining engine turns complex data into easily accessible PPIs. Dashboard and multivariate analytics functionality. Analyze all your data. Fast. Intuitive. In real time.