Statistical computing in manufacturing through historians

Statistical software packages allow high-level analysis of production data by connecting through historians.

By Dr. Holger Amort, Maverick Technologies, June 7, 2016

Manufacturing facilities generate large amounts of real-time data that are often historized for visualization, analytics, and reporting. The volume of data is staggering, and some companies devote considerable resources to maintaining the software and hardware and to securing the data. The driving force behind all these efforts is the common belief that data is valuable and holds the information needed for future process improvements.

The problem has been that advanced data analytics requires tools that go beyond the capabilities of spreadsheet programs such as Microsoft Excel, which is still the preferred option for calculations in manufacturing. The alternatives are software packages that specialize in statistical computation; they differ in price, capability, and steepness of the learning curve.

One of these solutions, R, has become increasingly popular and is supported by Microsoft [1][2][3][4]. R has been open source from the beginning and is freely available, which has drawn a wide community to adopt it as its primary statistical tool. There are other reasons why it is also likely to be very successful in manufacturing data analytics:

  1. R works with .NET: Two projects, R.NET [5] and rClr [6], provide interoperability between R and .NET.
  2. R provides a huge number of packages (6,789 on June 18th, 2015), which are function libraries with a specific focus. The package ‘qcc’ [7], for example, is an excellent library for univariate and multivariate process control.
  3. According to the 2015 Rexer Data Miner Survey [8], 76% of analytic professionals use R and 36% use it as their primary analysis tool, which makes R by far the most used analytical tool.
  4. Visual Studio now supports R, including debugging and IntelliSense. Visual Studio is a very popular Integrated Development Environment (IDE) among .NET programmers, which will make it easier for developers to start programming in R.
  5. R’s large user base helps to review and validate packages.
  6. The large number of users in academia leads to the fast release of cutting-edge algorithms.

Below are two examples of using R analysis in combination with the OSIsoft PI historian (+ Asset and Event Framework).

Example 1: Process Capabilities 
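
A process capability study compares the spread of a measured variable with its specification limits. As a minimal sketch of the underlying arithmetic, the snippet below computes the Cp and Cpk indices from a handful of readings. It is written in Python purely for illustration and is not the R analysis of PI data described in this article; the readings, the specification limits, and the function name process_capability are all hypothetical.

```python
from statistics import mean, stdev


def process_capability(values, lsl, usl):
    """Return (Cp, Cpk) for a sample and its lower/upper specification limits."""
    mu = mean(values)
    sigma = stdev(values)  # overall sample standard deviation (n - 1 denominator)
    cp = (usl - lsl) / (6 * sigma)               # potential capability (spread only)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)  # also penalizes an off-center mean
    return cp, cpk


# Hypothetical readings, standing in for values retrieved from a historian tag.
readings = [99.8, 100.2, 100.1, 99.9, 100.0, 100.3, 99.7, 100.0]
cp, cpk = process_capability(readings, lsl=99.0, usl=101.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

The sketch estimates sigma from the overall sample standard deviation. Dedicated packages often estimate it from within-subgroup variation instead, which can give different index values for the same data.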

Example 2: Principal Component Analysis of Batch Temperature Profiles 
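
One common way to set up a principal component analysis of batch data is to treat each batch's temperature profile as one observation (one row) and each time point as one variable (one column). The first principal component then captures the dominant pattern of batch-to-batch variation, and the score of each batch on that component shows how strongly it follows the pattern. The sketch below illustrates the idea in plain Python with power iteration; it is an assumption-laden illustration, not the R analysis from this article, and the four profiles and the function name first_principal_component are made up.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loading vector, scores) of the first principal component.

    Assumes the data has nonzero variance and that the all-ones start vector
    is not orthogonal to the dominant eigenvector.
    """
    n, p = len(rows), len(rows[0])
    # Mean-center each column (each time point of the profile).
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the dominant eigenvector of cov.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered batch onto the loading vector to get its score.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Synthetic temperature profiles: one row per batch, one column per time point.
batches = [
    [20.0, 40.0, 60.0, 80.0],
    [21.0, 41.0, 61.0, 81.0],
    [19.0, 39.0, 59.0, 79.0],
    [20.0, 42.0, 66.0, 90.0],  # batch that heats up faster than the others
]
loadings, scores = first_principal_component(batches)
for i, s in enumerate(scores):
    print(f"batch {i}: score on PC1 = {s:.2f}")
```

In this synthetic data the fast-heating batch receives the score with the largest magnitude, which is how a deviating batch would stand out in a score plot. A production analysis would normally rely on a tested linear algebra routine rather than hand-written power iteration.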

The results of the R analysis can also be used in real time for process analysis. In general, the process of model development and deployment is structured as follows:

In the model development phase, models such as SPC, MPC, PCA or PLS are developed, validated and finally stored in a data file. During the real-time application phase, also called model deployment, new data are sent to R and the same model is used for prediction.
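
The two phases can be sketched with a deliberately simple model: 3-sigma control limits fitted to historical data, written to a file, and later reloaded to judge new values. The Python code below is only an illustration of that develop, store, and reuse pattern under stated assumptions; the data, the file name spc_model.json, and the helper names are hypothetical. In R, a fitted model object would typically be stored with saveRDS and restored with readRDS.

```python
import json
import os
import tempfile
from statistics import mean, stdev


def develop_model(history, path):
    """Development phase: fit 3-sigma control limits and store them in a file."""
    center, sigma = mean(history), stdev(history)
    model = {"center": center, "lcl": center - 3 * sigma, "ucl": center + 3 * sigma}
    with open(path, "w") as f:
        json.dump(model, f)
    return model


def load_model(path):
    """Deployment phase: read the stored model back from disk."""
    with open(path) as f:
        return json.load(f)


def in_control(model, value):
    """Return True if a new value lies within the stored control limits."""
    return model["lcl"] <= value <= model["ucl"]


# Hypothetical historical readings used for model development.
history = [99.8, 100.2, 100.1, 99.9, 100.0, 100.3, 99.7, 100.0]
model_path = os.path.join(tempfile.gettempdir(), "spc_model.json")
develop_model(history, model_path)

# Later, possibly in a different process: reuse the same model on new data.
deployed = load_model(model_path)
for value in (100.1, 101.5):
    status = "in control" if in_control(deployed, value) else "out of control"
    print(value, status)
```

Keeping the fitted model in a file separates the occasional, validated development step from the frequent scoring of new data, so the real-time path only needs to load the model and evaluate it.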

There is an increasing gap in manufacturing between the amount of data stored and the level of analysis being performed. The R statistical software package can close that gap by providing high-level analysis of production data supplied by historians such as OSIsoft PI. R provides a rich library of statistical packages that perform univariate and multivariate analysis, and it allows real-time analytics.

This post was written by Dr. Holger Amort. Holger is a senior consultant at Maverick Technologies, a leading automation solutions provider offering industrial automation, strategic manufacturing, and enterprise integration services for the process industries. Maverick delivers expertise and consulting in a wide variety of areas including industrial automation controls, distributed control systems, manufacturing execution systems, operational strategy, business process optimization and more.

Maverick Technologies is a CSIA member as of 6/7/2016.