How to enhance data acquisition best practices

As technology evolves and becomes more sophisticated, data acquisition models need to change, as well.

By Brian E. Bolton July 28, 2020

The future of data availability is coming into focus for many industrial facilities. More companies are leveraging smart manufacturing tools and technologies, such as the Industrial Internet of Things (IIoT) and edge computing, to digitally transform operations and make real-time, data-driven decisions. Adoption of these technologies is growing much faster than anticipated, to the point where cloud-connected data sharing is becoming common. Data can now be transmitted to or received from vendors, distributors, suppliers, customers and more.

To keep pace in this ever-expanding Big Data environment, establishing data acquisition best practices is extremely important, and so is continuously updating and improving the data models. Because data streams change over time, they must be checked regularly for anything new that has been added. This quality control approach ensures facility personnel identify, capture and analyze the right data to improve operational productivity, efficiency, agility and flexibility.

Since the 2019 Applied Automation article “Eight data acquisition best practices” was published, the ways in which businesses acquire data have become more creative. That article covers how data comes from multiple sources and identifies the essential best practices required to manage it. Today, the number of data acquisition software programs continues to grow as businesses use programming languages (e.g., Python) to develop their own software for capturing mission-critical data. As an update to that article, the key data acquisition system best practices are reviewed here, along with additional concepts on how to enhance best practices already in place and manage real-time data to stay competitive.

Data at the Edge

Edge-computing technology is gaining traction in industry as it becomes more affordable. Obtaining data from the edge is becoming increasingly important, making it a key data acquisition best practice. As an update, the following data acquisition system components now include edge data:

  • Sensors: to convert physical parameters to electrical signals
  • Signal conditioning circuitry: to convert sensor signals into a form that can be converted to digital values
  • Analog-to-digital converters: to convert conditioned sensor signals to digital values
  • Edge data: data delivered as a message via a REST API or data storage devices. Data is created in a messaging format and transferred via cloud services or a web application programming interface (API).
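
As a rough illustration of the last two components, the sketch below scales a raw analog-to-digital converter reading to engineering units and wraps it in a message body. It is a minimal sketch under stated assumptions, not a vendor implementation: the 12-bit converter, the 0 to 100 range, the JSON format, the field names and the tag name TT-101 are all hypothetical.

```python
import json
from datetime import datetime, timezone


def counts_to_engineering(counts, bits=12, lo=0.0, hi=100.0):
    """Linearly scale raw ADC counts to an engineering-unit range."""
    full_scale = (1 << bits) - 1          # 4095 for an assumed 12-bit converter
    if not 0 <= counts <= full_scale:
        raise ValueError(f"counts {counts} outside 0..{full_scale}")
    return lo + (hi - lo) * counts / full_scale


def build_edge_message(tag, counts, timestamp, units="degC"):
    """Wrap one converted reading in a JSON message body (assumed format)."""
    return json.dumps({
        "tag": tag,
        "value": round(counts_to_engineering(counts), 2),
        "units": units,
        "timestamp": timestamp.isoformat(),
    })


if __name__ == "__main__":
    # Hypothetical tag name and reading, for illustration only.
    ts = datetime(2020, 7, 28, 12, 0, tzinfo=timezone.utc)
    print(build_edge_message("TT-101", 2048, ts))
```

The resulting string could then be posted to whatever REST endpoint or cloud service the facility uses; that transport step is left out because it depends on the platform.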

Some applications need computing power and access to data immediately. Edge computing streamlines the flow of traffic from IIoT devices for real-time data analysis. Data from sensors in the field is written to edge devices and then to the edge infrastructure. From the edge infrastructure, the data is replicated to the centralized data center (typically in the cloud) with low round-trip latencies of 5 to 10 milliseconds.

Figure 1: A simple data acquisition system network connection. Courtesy: MAVERICK Technologies

The advantage of collecting data from the edge is that it brings information from remote areas of the business to the heart of the data collection system at nearly real-time speed. Having as much data as possible available for decision making helps keep businesses competitive.

Automation’s core

In addition to edge technology, more devices with embedded historians are being implemented. For example, while skid-based assets are in use, they can collect data and send it to the primary historian. Data historians (e.g., OSIsoft PI, AspenTech’s Aspen InfoPlus.21 and Rockwell Automation’s FactoryTalk Historian) are used to acquire and store selected data from instrumentation and control system sources that are at the heart of automation processes.

For this reason, understanding and keeping up to date on these core instrumentation and control sources is an important data acquisition best practice, especially as newer technologies such as edge devices evolve. Depending on a facility’s data acquisition requirements, these sources can capture, generate, organize and manage data that will be valuable to the business using data analytic tools. Currently, the most common industrial instrumentation and control systems, platforms and devices include:

  • Supervisory control and data acquisition (SCADA): Used to view, monitor and control process variable data while providing a graphical representation of the process via human-machine interface (HMI) displays
  • Programmable logic controllers (PLCs): Handle data up to about 3,000 input/output (I/O) points
  • Distributed control systems (DCSs): Handle data when the I/O point count is greater than 3,000
  • Manufacturing execution systems (MESs)/manufacturing operations management (MOM): Help control warehouse inventory, including packaged raw materials, packaging material and parts
  • Enterprise resource planning (ERP) systems: Capture administrative data such as load times, equipment utilization, personnel availability, orders and raw material availability
  • Edge devices: Query and store remote data such as lighting, weather sensors, pump and motor details and electrical energy usage.

Interfaces and cloud connectors

Several different types of interfaces and cloud services are available to collect and store data. Understanding the various interface nodes is the next data acquisition best practice. To ensure process data is obtained from the data acquisition control systems and written to the data historians, the most commonly used standard interface types include:

  • OLE for Process Control (OPC): a software interface standard that allows Windows programs to communicate with industrial hardware devices
  • OLE for Process Control Data Access (OPC-DA): eliminates the need for custom drivers/connectors to communicate with the various sources
  • OLE for Process Control Historical Data Access (OPC-HDA): used to retrieve and analyze historical process data for purposes such as optimization, inventory control and regulatory compliance
  • Universal File and Stream Loading (UFL): reads ASCII data sources and writes the data to the PI data historian.
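
To make the idea of loading ASCII data concrete, here is a minimal sketch of the parsing step such an interface performs before values are written to a historian. It does not use or imitate the actual UFL configuration. The "tag,timestamp,value" line layout, the function name and the sample tags are assumptions chosen for illustration.

```python
import csv
import io
from datetime import datetime


def parse_ascii_records(text):
    """Parse 'tag,timestamp,value' lines into (tag, datetime, float) tuples.

    Blank lines and lines starting with '#' are skipped. Malformed lines are
    reported by line number instead of aborting the whole load.
    """
    records, rejected = [], []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row or row[0].lstrip().startswith("#"):
            continue
        try:
            tag, stamp, value = (field.strip() for field in row)
            records.append((tag, datetime.fromisoformat(stamp), float(value)))
        except ValueError:
            # Wrong field count, bad timestamp or non-numeric value.
            rejected.append(line_no)
    return records, rejected


if __name__ == "__main__":
    # Hypothetical file contents.
    sample = "\n".join([
        "# tag,timestamp,value",
        "FT-201,2020-07-28T12:00:00,41.7",
        "LT-305,2020-07-28T12:00:05,bad",
        "PT-110,2020-07-28T12:00:10,3.2",
    ])
    good, bad_lines = parse_ascii_records(sample)
    print(len(good), "records parsed; rejected lines:", bad_lines)
```

Collecting rejected line numbers rather than raising on the first error is a design choice: one corrupt row in a file should not block the rest of the data from reaching the historian.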

In addition to these standard connections and interfaces, industries are also working with three different types of cloud service models:

  • Software as a Service (SaaS): a software distribution model where third-party providers host applications and make them available to customers over the internet. Examples include Google Apps, Salesforce, Dropbox, DocuSign and Slack.
  • Platform as a Service (PaaS) or application platform as a service (aPaaS): a type of cloud-computing offering where service providers deliver a platform that enables clients to develop, run and manage business applications without having to maintain the infrastructure such software development processes normally require. Examples of PaaS are AWS Elastic Beanstalk, Windows Azure, Apache Stratos, Force.com (Salesforce) and Google App Engine.
  • Infrastructure as a Service (IaaS): a service model that delivers computer infrastructure on an outsourced basis to support enterprise operations. IaaS provides hardware, storage, servers and data center space or network components. Examples of IaaS are DigitalOcean, Microsoft Azure, Amazon Web Services (AWS), Rackspace and Google Compute Engine (GCE).

These services are commonly referred to as the “cloud computing stack.” IaaS is on the bottom of the stack; PaaS is in the middle and SaaS is on top. Data collected via cloud services can be securely transferred from one data source to another via cloud connectors. This works very well when multiple locations need to collect data on their own servers and share across the enterprise.

Secure data systems

With today’s need for increased data security, a high availability system provides as much redundancy and data loss protection as possible. Figure 1 is an example of how a simple data acquisition system network connection is designed. In comparison, Figure 2 illustrates how a more complex high availability data acquisition system is designed.

Figure 2: A complex, high availability data acquisition system network connection. Courtesy: MAVERICK Technologies

The high availability system is set up to provide failover protection. If the primary server fails, the secondary server continues to collect data, and notifications inform the right personnel that there is a problem. Once the problem is resolved, the restored server is ready to take over in the event of another issue. This setup is especially well suited to facilities that rarely have downtime, because routine software updates or even version upgrades can be done without upsetting production.
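
The failover behavior described above can be sketched in a few lines. This is an illustrative model only, not how any particular historian implements high availability. The class names, the health flag and the notification callback are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True


class HighAvailabilityPair:
    """Collect on the active server and fail over to the standby when it fails."""

    def __init__(self, primary, secondary, notify):
        self.primary = primary
        self.secondary = secondary
        self.notify = notify          # callable that takes a message string
        self.active = primary

    def select(self):
        """Return the server that should collect data right now."""
        if not self.active.healthy:
            standby = self.secondary if self.active is self.primary else self.primary
            if not standby.healthy:
                raise RuntimeError("no healthy server available")
            self.notify(f"{self.active.name} failed; {standby.name} is now collecting")
            self.active = standby
        return self.active


if __name__ == "__main__":
    pair = HighAvailabilityPair(Server("primary"), Server("secondary"), print)
    pair.primary.healthy = False      # simulate a failure or a planned upgrade
    print("collecting on:", pair.select().name)
```

Note that a repaired server does not automatically reclaim the active role in this sketch; it simply becomes the standby, ready for the next failure or maintenance window.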

Buffer, backup/archive and scan

As buffering, data backup/archiving and scan class are each part of the eight data acquisition best practices, it’s important to review and understand them:

  • Buffering is an interface node’s ability to access and temporarily store the collected interface data and forward it to the appropriate historian. To effectively perform data acquisition, it is recommended that buffering be enabled on the interface nodes. Otherwise, if the interface node stops communicating with the historian, the collected data is lost. Some industries have strict guidelines regarding data integrity. Implementing buffering capabilities at the PLC or DCS level may be required to eliminate or minimize data loss due to network connectivity failures. Improvements to prevent data loss in events such as power failures are currently underway. For example, administrative controls can be implemented to work within data collection system limitations and ensure process data integrity requirements are met.
  • Data backups are used to restore data in case it is lost, corrupted or destroyed. Backup strategies are key for protecting current/immediate data. Protocol documentation is critical to backing up and restoring data when things do not go as planned. Data archives protect older/historical information that is not needed for everyday business operations but is occasionally needed for various business decisions. Data archiving is the practice of moving data that is no longer being used to a separate storage device. Data archives are indexed and have search capabilities to aid in locating and retrieving files.
  • Historian interfaces use a code called a “scan class” to scan tags at different time intervals and schedule data collection. Scan classes determine a period of time in hours, minutes and seconds that tells the historian how often to collect the data. Knowing the data to be collected is essential to setting up the scan class. Data for things like temperature, level, pressure and flow will need a faster scan rate. Data for starting a pump or opening a valve may only need to be written when the state changes. Properly setting up the scan classes will ensure systems run as efficiently as possible.
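
The buffering and scan class ideas above can be illustrated together. The sketch below is a simplified model, assuming a hypothetical interface node with a caller-supplied `send` function that raises `ConnectionError` when the historian is unreachable. It is not the API of any historian product, and the tag names and scan periods are made up.

```python
from collections import deque


class BufferedInterface:
    """Schedule tag collection and buffer records while the historian is down."""

    def __init__(self, send, max_buffered=10_000):
        self.send = send                # callable(record); raises ConnectionError
        # A bounded deque discards the oldest record once max_buffered is reached.
        self.buffer = deque(maxlen=max_buffered)
        self.last_scan = {}             # tag -> time of last periodic collection
        self.last_value = {}            # tag -> last recorded value (on-change tags)

    def due(self, tag, now, period_s):
        """True when a periodically scanned tag is due (its scan class period)."""
        last = self.last_scan.get(tag)
        if last is None or now - last >= period_s:
            self.last_scan[tag] = now
            return True
        return False

    def changed(self, tag, value):
        """True when an on-change tag differs from its last recorded value."""
        if tag in self.last_value and self.last_value[tag] == value:
            return False
        self.last_value[tag] = value
        return True

    def record(self, tag, now, value):
        """Queue one record, then try to forward everything queued."""
        self.buffer.append((tag, now, value))
        return self.flush()

    def flush(self):
        """Forward buffered records oldest first; stop at the first failure."""
        while self.buffer:
            try:
                self.send(self.buffer[0])
            except ConnectionError:
                return False            # keep the data and retry on the next call
            self.buffer.popleft()
        return True
```

In use, a fast-changing measurement such as a temperature would be gated by `due(tag, now, 5)` for a five-second scan class, while a pump run status would be gated by `changed(tag, value)` so it is written only when the state changes. Records are removed from the buffer only after `send` succeeds, so a network outage delays data rather than losing it, up to the buffer limit.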

Data organization and metadata

Today, several companies are finding ways to logically organize data as part of their data acquisition best practices. The most used component of OSIsoft’s PI Server, for example, is Asset Framework (AF), which makes organizing and sharing data much easier. It integrates, contextualizes, refines, references and further analyzes data from multiple sources, including external relational databases. AF allows the user to create a hierarchy of elements/assets and all their attributes, including metadata.

Using visualization tools and AF, the end user can now view data from sources other than the historian. Element-relative templates can be used to significantly reduce the number of displays needed for similar assets. For example, tanks, pumps, motors, agitators or generators can each have a single graphic/display template. The placeholders for specific data related to the asset are populated based on the asset selected. Figure 3 is an example of a template used for facility generators. Note the data that comes directly from the historian, as well as metadata that comes from maintenance tracking sources or platforms like MES or ERP.

Figure 3: An example of a template used for facility generators. Courtesy: MAVERICK Technologies

“Metadata” is a set of data that describes and gives information about other data. Using software-coded connectors, access to data from all types of data sources is possible. Having the ability to link metadata to assets provides some unique ways to collect, analyze, visualize and report on process conditions.
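
The template and metadata concepts can be modeled in a few lines of code. The sketch below is a generic illustration and does not use the PI AF SDK. The class names, the `{asset}` placeholder syntax, the tag naming scheme and the sample values are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ElementTemplate:
    """Attribute names mapped to tag patterns containing an {asset} placeholder."""
    name: str
    attributes: dict


@dataclass
class Element:
    name: str
    template: ElementTemplate
    # Static context, e.g. from maintenance tracking, MES or ERP sources.
    metadata: dict = field(default_factory=dict)

    def resolve(self):
        """Fill every placeholder with this element's name."""
        return {attr: pattern.format(asset=self.name)
                for attr, pattern in self.template.attributes.items()}

    def display_context(self, read_tag):
        """Combine live historian values with metadata for one display."""
        live = {attr: read_tag(tag) for attr, tag in self.resolve().items()}
        return {"asset": self.name, **live, **self.metadata}


if __name__ == "__main__":
    generator = ElementTemplate("Generator", {
        "output_kw": "{asset}.OutputKW",
        "run_hours": "{asset}.RunHours",
    })
    gen1 = Element("GEN-01", generator, {"last_service": "2020-05-14"})
    # Stand-in for a historian read; values are invented.
    fake_historian = {"GEN-01.OutputKW": 480.0, "GEN-01.RunHours": 1250.5}
    print(gen1.display_context(fake_historian.__getitem__))
```

Because every generator shares one template, adding a second unit only requires a new `Element` with its own name and metadata; the display logic is unchanged.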

Looking ahead

As technology evolves and becomes more sophisticated, data acquisition models need to change as well. Whether you have been capturing automation data for a long time or are just starting out, trying to make sense out of the data acquisition best practices can be a challenge. In instances where resource bandwidth is an issue, consider consulting a third-party automation solutions provider or system integrator to assist in designing, building, sustaining or improving your next data acquisition project. As the data availability path becomes clearer to manufacturers, leveraging new technology such as edge computing and cloud-based services will help enhance data acquisition best practices to gain a competitive advantage.

This article appears in the Applied Automation supplement for Control Engineering and Plant Engineering.

Maverick Technologies is a certified member of the Control System Integrators Association (CSIA) and a CFE Media content partner.


Author Bio: Brian E. Bolton (brian.bolton@mavtechglobal.com) is a consultant for MAVERICK Technologies, a CFE Media content partner. He has more than 35 years of experience in chemical manufacturing, including more than 20 years involved with the OSIsoft PI Suite of applications, quality assurance, continuous improvement and data analysis.