Batch operations benefit from process analytical technology

Batch analytics was successfully applied to online operation, resulting in a solution validated on a specialty chemicals batch process. The discussion illustrates the basics of batch analytics operation, including access through a Web interface.

By Robert Wojewodka, Terry Blevins, and Willy Wojsznis May 22, 2011

Process analytical technology (PAT), viewed broadly, includes chemical, physical, microbiological, statistical, and risk analysis conducted in an integrated manner, and it is widely used in batch manufacturing. In particular, multivariate statistical modeling (data analytics) is an excellent tool for batch process analysis. Online implementation of analytic technology comprises fault detection and end-of-batch quality prediction. The approach offers significant economic benefits; however, implementations face many challenges, and few online applications have been documented so far.

For online batch analytics to be successful, the following key areas should be addressed:

  • Process holdups: Operator- and event-initiated processing halts and restarts. Sometimes halts and restarts are part of the batch process design, such as when a special ingredient is added. At other times, progression of a batch may be delayed by the need to wait for shared equipment to become available. Holdup data must be accounted for during analytic model development and in the online application of analytics.
  • Access to lab data: Due to the nature of batch processing, online measurement of quality parameters may not be technically feasible or economically justified. Thus, it is common that at various points in the batch a grab sample is taken and analyzed in the lab. To implement online analytics, it is necessary that lab results be available for model development and validation.
  • Variations in feedstock: The charge to a batch may come from storage tanks that are periodically refilled by upstream processes, or by truck or rail shipment from outside suppliers. Changes in incoming raw material properties directly impact batch operation and quality parameters, so raw material property data should be available to online analytic tools.
  • Varying operating conditions: The processing conditions may vary significantly within each batch operation. The batch process should be split into stages (operations performed under similar conditions), and an analytic model should be developed for every stage.
  • Concurrent batches: Multiple batches of the same product may be executing at various stages of completion.
  • Assembly and organization of the data: One of the limiting factors that often prevents detailed analysis of batch processes is the inability to access, correctly sequence, and organize a data set containing all necessary data. This requirement must be fulfilled to analyze the process and to move the results of that analysis online.
  • Data alignment from different batches: Batch durations are not equal. For developing an analytics model, data from the various batches should be aligned, forming data with an equal number of data samples for every batch.

An effective solution to meet batch challenges can be found by applying current developments and research in batch modeling, process control systems, and Web technology, and then by considering an implementation strategy based on close cooperation of analytics system developers and end users.

Basics of analytics

Analytics may be effectively built on multivariate statistical methods, which have been known since the beginning of the previous century. Personal computers made practical use of those methods possible in many applications, and in batch analysis in particular. Several accompanying techniques, primarily for data unfolding and data alignment, have also been developed for batch analysis implementation. A summary of batch analytics techniques is presented below.

Data alignment

Time required to complete one or more operations associated with a batch may vary because of process holdups or processing conditions. (See Figure 1.) However, the batch data used in model development must be somehow aligned in order to facilitate data analysis.

To achieve uniform batch length, the data could simply be chopped off at a certain time in the batch, or compressed or expanded in some fashion to give the same number of time increments. Better results may be achieved with a newer technique known as dynamic time warping (DTW). DTW aligns batch data with reference trajectories by minimizing the total distance between them. The batch with the median duration can be used as the initial DTW reference batch. The DTW principle for aligning one trajectory is illustrated in Figure 2. To minimize the distance between trajectories, two or more points on the red trajectory (A) can be merged into one point on the green aligned trajectory, or one point can be stretched over several points (B).
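The alignment idea can be sketched in a few lines of Python. This is a minimal illustration of textbook DTW on a single variable, not the implementation used in the application described here. The function name `dtw_align` and the choice to average the batch samples that map to one reference sample are assumptions made for the example.

```python
def dtw_align(reference, batch):
    """Warp `batch` onto the time axis of `reference` (one variable).

    Returns len(reference) values: for each reference sample, the mean of
    the batch samples that the minimum-distance warping path assigns to it.
    """
    n, m = len(reference), len(batch)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning the first i
    # reference samples with the first j batch samples.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - batch[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Walk the optimal path backwards from the end of both trajectories.
    matched = [[] for _ in range(n)]
    i, j = n, m
    while i > 0 and j > 0:
        matched[i - 1].append(batch[j - 1])
        steps = {
            (i - 1, j - 1): cost[i - 1][j - 1],  # advance both trajectories
            (i - 1, j): cost[i - 1][j],          # one batch point stretched
            (i, j - 1): cost[i][j - 1],          # several batch points merged
        }
        i, j = min(steps, key=steps.get)
    return [sum(values) / len(values) for values in matched]


reference = [0, 1, 2, 3, 2, 1, 0]          # e.g., the median-duration batch
slow_batch = [0, 0, 1, 2, 2, 3, 2, 1, 0]   # same shape with two holdups
fast_batch = [0, 2, 3, 1, 0]               # shorter batch

aligned_slow = dtw_align(reference, slow_batch)
aligned_fast = dtw_align(reference, fast_batch)
print(len(aligned_slow), len(aligned_fast))
```

Both aligned trajectories come out with the same number of samples as the reference, which is what the model development step needs. A production implementation would normally handle several variables at once and constrain how far the warping path may stray.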

Data unfolding

The aligned model data file is a three-dimensional array: I batches, J variables, and K scan periods. (See Figure 3.) Prior to model development, the data file is unfolded into two dimensions, IK × J. Hybrid unfolding is a recent improvement over the commonly used batch-wise and variable-wise unfolding. With hybrid unfolding, mean values and variances are calculated for every time period from the batch-wise unfolded data. (See Figure 3, arrow A.) The data is then rearranged as variable-wise unfolded data (see Figure 3, arrow B); therefore, there is no need to assume arbitrary trajectories from the current time until the end of the stage, as with the original batch-wise unfolding.
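The two steps of hybrid unfolding can be sketched with NumPy as follows. This is one reading of the description above, under the assumption that the array is stored with shape (I, J, K). The function name `hybrid_unfold` and the guard against zero variance are illustrative choices, not part of the original method description.

```python
import numpy as np


def hybrid_unfold(X):
    """Unfold an aligned batch array X of shape (I, J, K).

    Step A: center and scale every variable at every scan period using the
            mean and standard deviation taken across the I batches.
    Step B: rearrange the scaled data variable-wise into an (I*K, J) matrix.
    Returns the unfolded matrix and the (J, K) mean and standard deviation
    trajectories needed to scale a new batch online.
    """
    I, J, K = X.shape
    mean = X.mean(axis=0)                  # (J, K) average trajectory
    std = X.std(axis=0, ddof=1)            # (J, K) batch-to-batch spread
    std = np.where(std == 0, 1.0, std)     # assumption: leave constant signals unscaled
    Z = (X - mean) / std                   # step A
    unfolded = Z.transpose(0, 2, 1).reshape(I * K, J)   # step B
    return unfolded, mean, std


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3, 4))             # 5 batches, 3 variables, 4 scans
unfolded, mean, std = hybrid_unfold(X)
print(unfolded.shape)                      # (I*K, J)
```

Row `i * K + k` of the result holds batch `i` at scan `k`. Because scaling uses only the current scan period, a running batch can be scaled and scored scan by scan without guessing its future trajectory.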

Multivariate statistical methods

Processes with correlated relationships require use of multivariate statistical techniques. Figure 4 shows that while trends for both correlated parameters are within their respective process control limits, the process operation is still deemed faulty (i.e., the relationship between parameters is broken) for the identified observation, and the fault is detected only by the multivariate statistics.
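A small numerical sketch shows the effect. The data is synthetic and the ±3 standard deviation limits are an assumption for illustration only. The multivariate limit is the 99% point of a chi-square distribution with two degrees of freedom, a common large-sample approximation.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, 500)
x2 = 0.95 * x1 + rng.normal(0.0, 0.2, 500)    # strongly correlated with x1
normal_data = np.column_stack([x1, x2])

mean = normal_data.mean(axis=0)
std = normal_data.std(axis=0, ddof=1)
cov_inv = np.linalg.inv(np.cov(normal_data, rowvar=False))

# 99% chi-square limit for 2 degrees of freedom, equal to -2*ln(0.01).
T2_LIMIT = -2.0 * math.log(0.01)


def univariate_ok(x):
    """True if every variable is inside its own +/- 3 sigma limits."""
    return bool(np.all(np.abs(x - mean) <= 3.0 * std))


def hotelling_t2(x):
    """Squared Mahalanobis distance of one observation from the mean."""
    d = x - mean
    return float(d @ cov_inv @ d)


# Each value is unremarkable on its own, but the pair breaks the correlation.
suspect = np.array([1.5, -1.5])
print("within univariate limits:", univariate_ok(suspect))
print("multivariate alarm:", hotelling_t2(suspect) > T2_LIMIT)
```

The suspect observation passes both single-variable checks yet lies far outside the region occupied by normal data once the correlation is taken into account.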

A primary multivariate statistical method often used is principal component analysis (PCA). At the heart of PCA technology is the concept that a time-based profile for measurement values may be established using a variety of batches that produced good quality product and had no abnormal processing upsets. The model may be used to develop a better understanding of how multivariate parameters relate to one another and how all factors can impact batch-to-batch costs.

The model structure takes into account that many of the measurements used in the batch operation are related to each other and respond in a similar manner to a process input change. In other words, they are collinear. For such conditions, all process variations can be modeled by the primary principal components. (Matrix T in Figure 3.) The PCA model may be used to identify process and measurement faults that may impact product quality.

The modeled part of the process variation is captured by Hotelling’s T² statistic. Uncorrelated variations not included in the principal components are not modeled. They are represented by what is known as the Q statistic. In this way, all process variations can be reflected by two indicators. (See Figure 5.) Two bar plots at the bottom of the display show how particular process parameters contribute to the process variations. To diagnose a selected parameter, operators may look at the parameter trend plotted along with a reference trend and the acceptable band of variation for that parameter. (See Figure 6.)
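How the two indicators split the variation can be sketched with NumPy. This is a minimal illustration on synthetic data with one latent driver, and the function names `fit_pca_monitor` and `t2_and_q` are hypothetical. Control limits and contribution plots are omitted.

```python
import numpy as np


def fit_pca_monitor(X, n_components):
    """Fit a PCA model on normal-operation data X (rows = observations)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T                       # (J, A)
    score_var = s[:n_components] ** 2 / (len(X) - 1)     # variance of each score
    return {"mean": mean, "std": std, "P": loadings, "score_var": score_var}


def t2_and_q(model, x):
    """Return (T2, Q) for one observation x."""
    z = (x - model["mean"]) / model["std"]
    t = z @ model["P"]                       # scores: the modeled variation
    residual = z - t @ model["P"].T          # what the model does not explain
    t2 = float(np.sum(t ** 2 / model["score_var"]))
    q = float(residual @ residual)
    return t2, q


rng = np.random.default_rng(1)
direction = np.array([1.0, 0.8, -0.6, 0.9])              # four collinear measurements
latent = rng.normal(size=200)
X = np.outer(latent, direction) + 0.1 * rng.normal(size=(200, 4))
model = fit_pca_monitor(X, n_components=1)

normal_obs = 0.5 * direction
large_swing = 6.0 * direction                 # correlation intact, unusually large
sensor_fault = 0.5 * direction + np.array([0.0, 2.0, 0.0, 0.0])  # correlation broken

for name, obs in [("normal", normal_obs), ("swing", large_swing), ("sensor", sensor_fault)]:
    print(name, t2_and_q(model, obs))
```

In this sketch the large but consistent swing shows up mainly in T², while the sensor offset that breaks the relationship between measurements shows up mainly in Q.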

Through the use of these two statistics, it is possible to determine fault conditions in the batch sooner and thus allow investigations and corrections to be made to counter the impact of the fault.

Projection to latent structures (PLS, also known as partial least squares) is applied to analyze the impact of processing conditions on final-product quality, and can provide operators with continuous predictions of end-of-batch quality parameters. For batches where the objective is to classify the operation results into discrete categories (e.g., fault category or grade), another multivariate statistical method known as PLS discriminant analysis (PLS-DA) would be used in conjunction with PCA and PLS.
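A quality prediction model of this kind can be sketched with a textbook single-response PLS fit (often called PLS1) written in NumPy. The data is synthetic, with each row standing for one batch and each column for one collinear process measurement. The function names and the choice of two latent components are assumptions for the example, not details of the application described here.

```python
import numpy as np


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model by successive deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean            # centered X and y residuals
    W, P, c = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w = w / np.linalg.norm(w)            # direction most covariant with y
        t = E @ w                            # latent score of every batch
        tt = t @ t
        p = E.T @ t / tt                     # X loading for this component
        c_a = f @ t / tt                     # regression of y on the score
        E = E - np.outer(t, p)               # remove the explained part of X
        f = f - c_a * t                      # remove the explained part of y
        W.append(w)
        P.append(p)
        c.append(c_a)
    W, P, c = np.array(W).T, np.array(P).T, np.array(c)
    coef = W @ np.linalg.solve(P.T @ W, c)   # coefficients in original X space
    return {"x_mean": x_mean, "y_mean": y_mean, "coef": coef}


def predict_quality(model, X):
    return (X - model["x_mean"]) @ model["coef"] + model["y_mean"]


rng = np.random.default_rng(42)
latent = rng.normal(size=(60, 2))            # two underlying process drivers
mixing = np.array([[1.0, 0.8, 0.5, 0.0, -0.3, 0.2],
                   [0.0, 0.3, -0.6, 1.0, 0.7, -0.9]])
X = latent @ mixing + 0.05 * rng.normal(size=(60, 6))
y = latent @ np.array([2.0, -1.0]) + 0.05 * rng.normal(size=60)

model = fit_pls1(X[:40], y[:40], n_components=2)     # train on 40 batches
predicted = predict_quality(model, X[40:])           # predict the other 20
rmse = float(np.sqrt(np.mean((predicted - y[40:]) ** 2)))
print("prediction RMSE:", rmse)
```

Because the six measurements are driven by only two underlying factors, two latent components are enough here. In practice the number of components is chosen by validation on held-out batches.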

Online analytics allows for the operator to monitor a batch operation simply by using a plot of the PCA statistics and the PLS estimated end of batch quality, as illustrated in Figure 6.

Once PCA and PLS models have been developed using data from normal batches, their performance in detecting faults and predicting variations in end-of-batch quality parameters may be tested by replaying data collected from abnormal batches.
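A replay test of this kind can be reduced to a simple check: feed the recorded statistic through the alarm logic and note when, if ever, an alarm is raised. The sketch below is hypothetical. The function name `first_alarm`, the rule of three consecutive violations, the limit, and the replayed values are all invented for illustration.

```python
def first_alarm(statistic, limit, consecutive=3):
    """Return the scan index at which an alarm is raised, or None.

    An alarm is raised once the statistic has exceeded the limit for
    `consecutive` scans in a row (an assumed rule to suppress spikes).
    """
    run = 0
    for scan, value in enumerate(statistic):
        run = run + 1 if value > limit else 0
        if run >= consecutive:
            return scan
    return None


LIMIT = 9.0                                          # illustrative control limit
normal_replay = [1.2, 0.8, 9.9, 1.1, 0.7, 1.5]       # single spike only
abnormal_replay = [1.0, 1.3, 9.5, 10.2, 11.8, 12.4]  # sustained excursion

print(first_alarm(normal_replay, LIMIT))
print(first_alarm(abnormal_replay, LIMIT))
```

Replaying a set of known good and known abnormal batches this way gives a rough count of missed faults and false alarms, and shows how many scans into the batch each fault is caught.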

Analytics in specialty chemicals batch application

An architecture was developed to apply the advanced statistical analysis methods outlined in this article to batch processing in a way that integrates with both an enterprise resource planning (ERP) system and a process control system. (See Figure 8.) This application in the specialty chemical industry contains many of the batch components commonly found in chemical processing companies. However, as with any engineering endeavor, the success of the project depends greatly on the steps taken in applying this analytic technology. To address this application, a multidiscipline team was formed that included the toolset provider as well as expertise from Lubrizol’s plant operations, statistics, MIS/IT, and engineering staff.

The major steps of a successful project may be summarized as follows.

  • Collecting process information: When applying data analytics to a batch process, it is important to have a good understanding of the process, the products produced, and the organization of the batch control. Thus, a multidiscipline team should be created for a project like this. A list of the process measurements, lab analysis, and truck data for raw material shipment should be created, forming what may be called an input-process-output data matrix.
  • Instrumentation and control survey: A basic assumption in the analytics application to a batch process is that the process operation is repeatable. An instrumentation and control survey, and good loop tuning are important factors in satisfying this requirement.
  • Integration of lab data: Grab samples should be taken at the plant and analyzed in the lab to obtain the key quality parameters associated with the batch operation. The lab analysis results should then be entered into the data system, as should the property analyses for raw material shipments. To allow this data to be used in online analytics, an interface is needed between the ERP system and the process control system.
  • Historian collection: Modeling and test data should be in an uncompressed format.
  • Model development: The tools for model development must make it easy to select data from the data historian and to organize the subset of data associated with the parameters used for model development. The model must be validated against several model quality indexes and additionally by fast playback of test data.
  • Training: Operator and plant engineering training is a vital part of commissioning any analytics application.
  • Evaluation: User feedback and data collected on improvements in process operation are valuable to evaluate the analytics application.

Basic findings from using this form of process analytics approach have been positive and include:

  • The engagement of operators and engineers who provide positive feedback on the analytics used and accept this new tool for fault detection and quality prediction;
  • The accumulation of learning as use of the installation continues after the preliminary field trials;
  • The ability to exploit new functionality that can detect and diagnose process, instrumentation, and operational problems;
  • The importance of using stages in analytic modeling;
  • The advantages of Web-based online user interface; and
  • The usefulness of Web-based process simulation for operator training.


Using multivariate process analytics motivates people to think in entirely new ways and to address process improvement and operations with a better understanding of the process. It allows operations personnel to identify and make better-informed corrections before the end of the batch, and it plays a major role in ensuring that batches repeatedly hit predefined end-of-batch targets. Additionally, engineers and other operations personnel gain further insight into the relationships between process variables and their effect on product quality.

Robert Wojewodka is process improvement team leader and statistician for the Lubrizol Corporation. Terry Blevins is principal technologist, future architecture, and Willy Wojsznis is senior technologist for Emerson Process Management.
