Benefits of using a process historian

Process historians are complex pieces of software used to store and analyze vital process and industrial data. They offer several benefits, such as prebuilt analysis equations and compatibility with other industrial software packages.

By Mina Andrawos February 19, 2016

Process historians fall under a category of their own in the world of industrial software due to the critical role they play in the success of analysis and decision making. Process historians are complex pieces of software that are used to store and analyze vital process and industrial data.

For example, if equipment on the factory floor is running hotter than usual, the user will need to store the equipment’s temperature readings. This allows the user to investigate whether the temperature is rising over time and by how much. With this kind of visibility, the user can replace equipment right on time, before it fails. A process historian is ideal for this kind of situation. It is designed to handle data storage, analysis, data visualization, exposure of the data through an application programming interface (API), and even alarm notifications if the product is configured with the appropriate license.

Process historian data storage

Process historians are a specialized type of database called a time series database. A time series database attaches a timestamp to every new piece of data it receives and then stores the pieces of data in the order they were received. Time series databases typically don’t need to form complex relations between different data points when they store them. In other words, a time series database is optimal for retrieving a piece of data that changes over a period of time. Process historians fall under the NoSQL database category because they are not relational by nature.
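The append-and-timestamp model described above can be sketched in a few lines of Python. This is a hypothetical, minimal illustration (the class and method names are invented for this example, not taken from any real historian product):

```python
import bisect
import time


class TimeSeriesStore:
    """Minimal sketch of a time series store: every incoming value gets a
    timestamp and is kept in arrival order, with no tables or relations."""

    def __init__(self):
        self._timestamps = []  # stays sorted because data arrives in time order
        self._values = []

    def record(self, value, timestamp=None):
        # Attach a timestamp to the incoming value and append it.
        self._timestamps.append(timestamp if timestamp is not None else time.time())
        self._values.append(value)

    def query(self, start, end):
        # Retrieve all (timestamp, value) pairs in the range [start, end].
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))


store = TimeSeriesStore()
store.record(21.5, timestamp=100.0)  # e.g. equipment temperature readings
store.record(22.1, timestamp=101.0)
store.record(23.8, timestamp=102.0)
print(store.query(100.5, 102.0))  # -> [(101.0, 22.1), (102.0, 23.8)]
```

Because the data is already in time order, a range query is just two binary searches and a slice, which is exactly the access pattern that makes this model efficient for trend analysis.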

A relational database, on the other hand, which is the most common database engine, stores data in tables with rows and columns. The user defines what the columns and rows should be. No timestamps are involved unless the user explicitly adds them. The user defines the complex relationships between these tables, how a change in one table can affect other tables, and so on.

All of this is unnecessary overhead for a time series database. Most of the heavy algorithms that go into relational databases to make rows, columns, and table relations efficient become a burden if all the user needs to do is store a piece of data with a timestamp. Typical examples of relational databases are Microsoft SQL Server and MySQL.

Process historian features

Process historians have several features that can benefit users including:

  1. Process historians have prebuilt analysis equations, data visualization icons, and data sheets that are like figures from a process engineering textbook. They have a variety of options for efficiency equations, power equations, steam table charts, industrial equipment icons ready for use, and other features that are very relevant in the industrial world.
  2. They are highly compatible with industrial software packages typically used in process control, such as human-machine interfaces (HMIs) and distributed control systems (DCSs), as well as various drivers and controllers.
  3. Process historians use specialized algorithms to compress data and save disk space. For example, if the user has a value of 1 at time t1 and a value of 1.0001 at time t2, in most cases the 1.0001 doesn’t need to be stored because it won’t affect the analysis much. Over time, that saves a lot of disk space and resources, and the compression can be disabled if it isn’t needed.
  4. Process historians often come prepackaged with "interfaces," which are separate pieces of software that can be deployed to the field to closely monitor the small sensors and controllers while the historian sits at the data center or the cloud. It isn’t practical to install the process historian at each sensor since it is a very heavy piece of software. Instead, the user should install the interface, which is light and can communicate with the sensor or the controller before relaying the data to the central historian.
  5. Store and forward is vital for process historians because a missing piece of data can result in an incorrect analysis, which can lead to a wrong decision with dire consequences. What store and forward guarantees is that data will not be lost even if the central historian loses its connection with the remote interface. The remote interface will detect that the historian is not receiving data and will start storing the data it collects in an internal local buffer. Once the connection to the historian is restored, the interface will forward this data up to the historian.
  6. Process historians usually cache recent data directly in computer memory before it’s permanently stored on the hard drive. This is very efficient for analysis and calculations performed on newer data, which is usually used to detect any sudden surprises in production before they become big problems. 
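The compression idea in point 3 can be illustrated with a simple deadband filter: a new value is stored only when it moves more than a configurable tolerance away from the last stored value. This is a sketch of the general concept, not any vendor’s actual algorithm (real historians use more elaborate schemes, such as swinging-door compression):

```python
def deadband_compress(samples, tolerance):
    """Keep a (timestamp, value) sample only if its value differs from the
    last stored value by more than `tolerance`. Near-duplicate readings,
    like 1.0001 following 1, are dropped to save disk space."""
    stored = []
    for ts, value in samples:
        if not stored or abs(value - stored[-1][1]) > tolerance:
            stored.append((ts, value))
    return stored


raw = [(1, 1.0), (2, 1.0001), (3, 1.0002), (4, 1.5), (5, 1.5001)]
print(deadband_compress(raw, tolerance=0.01))
# -> [(1, 1.0), (4, 1.5)]
```

Setting `tolerance=0` effectively disables the compression, matching the note above that compression can be turned off when every raw reading matters.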
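The store-and-forward behavior in point 5 can be sketched as a local buffer on the interface side. The class names here are hypothetical and the connection state is reduced to a simple flag; real interface software is far more involved:

```python
from collections import deque


class Historian:
    """Stand-in for the central historian at the data center or cloud."""

    def __init__(self):
        self.online = True
        self.data = []

    def receive(self, sample):
        self.data.append(sample)


class Interface:
    """Sketch of store and forward: if the historian is unreachable, buffer
    data locally and forward it once the connection is restored."""

    def __init__(self, historian):
        self.historian = historian
        self._buffer = deque()  # local buffer for data collected during outages

    def send(self, sample):
        if self.historian.online:
            # Flush anything buffered during the outage first, in order.
            while self._buffer:
                self.historian.receive(self._buffer.popleft())
            self.historian.receive(sample)
        else:
            self._buffer.append(sample)  # historian unreachable: store locally


historian = Historian()
interface = Interface(historian)
interface.send((1, 20.0))
historian.online = False    # connection to the historian drops
interface.send((2, 20.5))   # buffered locally, not lost
interface.send((3, 21.0))
historian.online = True     # connection restored
interface.send((4, 21.5))   # buffered data is forwarded first, in order
print(historian.data)       # -> [(1, 20.0), (2, 20.5), (3, 21.0), (4, 21.5)]
```

The key guarantee is visible in the output: nothing collected during the outage is lost, and the historian receives everything in the original order.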

Process historians and open source software

Process historians currently reside very comfortably in the closed source world with relatively high price tags. They rely very heavily on the Microsoft stack; clouds are in Azure; scripts are in PowerShell; web portals are in Silverlight; and software development kits (SDKs) are VC++ or .NET. They are starting to move towards HTML5, but it is still in the early stages.

The open source world, however, currently offers multiple options for time series databases that could be developed into process historians with the right investment and energy. The barrier isn’t trivial, though, and so far the industrial world hasn’t invested much in that direction.

Two open-source features that can offer great value to process historians are sharding and distributed data processing.

Sharding is the process of distributing the data load across multiple server nodes while keeping track of where the data went. Sharding relies on specialized algorithms that determine which node hosts a requested piece of data and route the client’s request to it. Sharding is essential for enormous data loads to ensure the load will not overwhelm a single server. Without sharding, scalability becomes a big pain as the organization using the data grows and expands.
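The routing idea can be sketched with hash-based sharding, one of several possible schemes. The node names and tag names below are hypothetical:

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical server nodes


def shard_for(tag):
    """Map a tag name (e.g. a sensor ID) to the node that hosts its data.
    The same tag always hashes to the same node, so writes and later
    client reads for that tag are routed to the same place."""
    digest = hashlib.sha256(tag.encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


# Every request for a given tag deterministically lands on one node.
print({tag: shard_for(tag) for tag in ["boiler.temp", "pump.flow", "tank.level"]})
```

A production system would typically use consistent hashing instead of a plain modulo, so that adding or removing a node remaps only a fraction of the data, but the routing principle is the same.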

Distributed data processing is a must-have feature for organizations that are extreme data crunchers, such as Google or Amazon. The principle of distributed data processing is to divide very heavy calculations into smaller calculations, execute them on distributed server nodes, and then join the results together via a lighter calculation. This technique makes the power of an analytical engine virtually limitless.
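The divide/execute/combine pattern can be sketched in a few lines. In this toy illustration, worker threads stand in for remote server nodes; a real distributed engine would dispatch the partial calculations over the network:

```python
from concurrent.futures import ThreadPoolExecutor


def heavy_partial(chunk):
    # The heavy calculation, run independently on each node's chunk of data.
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(values, nodes=4):
    # 1. Divide the heavy calculation into smaller pieces, one per node.
    chunks = [values[i::nodes] for i in range(nodes)]
    # 2. Execute the partial calculations on the distributed nodes
    #    (threads stand in for remote servers in this sketch).
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(heavy_partial, chunks))
    # 3. Join the partial results via a lighter calculation.
    return sum(partials)


print(distributed_sum_of_squares(range(1000)))  # same as sum(x*x for x in range(1000))
```

The final `sum(partials)` is deliberately cheap compared with the per-chunk work; that asymmetry is what lets the scheme scale by simply adding more nodes.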

Process historians, if properly understood and utilized, can empower a company’s success and longevity, regardless of the industry.

Mina Andrawos is a staff engineer with Bloom Energy’s SCADA team. Andrawos’s work involves writing software that tightly integrates with process historians, HMIs, SCADA security, device drivers, and SCADA software backends. Edited by Chris Vavra, production editor, CFE Media, Control Engineering.
