Five data acquisition strategies for success

Did you just see something extraordinary in the recorded data or was it a data anomaly in measurement, communications, or data aggregation? Data collection should help us learn, not confuse. Heed these five strategies for better industrial data.

By Allen Tubbs and Benjamin Menz, Bosch Rexroth Corp. October 5, 2018

What did I just see? Everyone has had that thought when seeing something extraordinary or unbelievable. It might be a video online, a sports replay, or something in real life that makes us say, "Wait… What?!"

That is not the question to be asking about collected data or about the results that algorithms produce from it. Data collection systems should help with learning, not amaze and confuse.

Some simple practices are needed to make Industrial Internet of Things (IIoT) systems offer something valuable. There is a persistent myth that if enough data is collected and fed to a system smart enough to process it all, usable results will follow. But is that really the case? If terabytes of historical data points and a machine-learning (ML) algorithm are deployed, will all the answers we've been looking for appear? Probably not.

Correlation vs. causation

A correlation between two data sets does not establish that one causes the other. Actionable data is the goal. Data that improves the return on investment (ROI) will pay for the data collection system many times over. Thankfully, collecting actionable data isn't hard. Some common-sense practices will suffice.

1. Frequency of data

Data collection rates correlate to dollars; there isn't much debate about that. The cost of sending data to the cloud is easy to understand: bandwidth (Mbps), data message counts, and terabytes stored all cost dollars. Even with an on-premises solution, a network that can support the data collection, along with storage for data processing, is still needed. Both of these translate to cost.
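As a rough illustration of how collection rate turns into data volume, the Python sketch below estimates the raw payload generated per day. The channel count, reading size, and sample rates are hypothetical figures chosen for the example, and protocol overhead and compression are ignored.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(sample_rate_hz, bytes_per_sample, channels):
    """Raw payload per day, ignoring protocol overhead and compression."""
    return sample_rate_hz * bytes_per_sample * channels * SECONDS_PER_DAY


# Hypothetical machine: 8 channels, each reading stored in 4 bytes.
slow = bytes_per_day(1, 4, 8)        # one reading per second
fast = bytes_per_day(10_000, 4, 8)   # 10,000 readings per second

print(f"1 Hz:   {slow / 1e6:.1f} MB/day")
print(f"10 kHz: {fast / 1e9:.1f} GB/day")
```

Under these assumptions, the same eight channels produce about 2.8 MB per day at one reading per second and about 27.6 GB per day at 10 kHz, which is why the update rate is worth questioning before anything is deployed.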

It makes sense to ask "How often do I need an update?" Recording ambient environmental conditions every millisecond is probably overkill, but collecting vibration data once a second won't give usable data either. Data must be collected at a rate that translates to representative data sets. Specific application requirements will point to the correct sampling rate.

Bearing manufacturers, for example, will publish specific frequencies to monitor based on the physical properties of the bearing itself. 
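Sampling theory gives a floor for that rate: a signal must be sampled at more than twice its highest frequency of interest, or it shows up in the data at a false, lower frequency (aliasing). The Python sketch below illustrates both points. The 160 Hz figure is a hypothetical stand-in for a published bearing frequency, and the 2.56 margin is an assumed practical allowance above the theoretical factor of 2, not a value from any manufacturer.

```python
def min_sample_rate(max_freq_hz, margin=2.56):
    """Suggested sampling rate for a given highest frequency of interest.

    The margin must exceed 2 (the theoretical limit); 2.56 is an assumption.
    """
    if margin <= 2:
        raise ValueError("margin must be greater than 2")
    return margin * max_freq_hz


def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling at sample_rate_hz."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


bearing_hz = 160  # hypothetical frequency to monitor

print("suggested rate (Hz):", min_sample_rate(bearing_hz))
print("apparent frequency at 1 Hz sampling:", alias_frequency(bearing_hz, 1))
print("apparent frequency at 100 Hz sampling:", alias_frequency(bearing_hz, 100))
```

In this example, a 160 Hz vibration sampled once a second appears as a constant value (0 Hz), and sampled at 100 Hz it appears as a 40 Hz signal that does not exist on the machine.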

2. Accuracy of data

Sensor accuracy is important to collecting data that represents real-world conditions. Is it necessary to know the ambient humidity to 1/100th of a percent or the actuator position to 1 micron? If a temperature sensor has an accuracy of ±5°C, is that good enough? The data collected needs to represent the application correctly to draw accurate conclusions.

When measuring part dimensions to check quality, the sensor's accuracy needs to be finer than the tolerance of the part being measured. If ambient temperature may affect the process, an accuracy in proportion to the expected temperature fluctuation may be needed to know whether it has an influence. Some knowledge of an application will help in selecting sensors that are accurate and economical for the data collection system.
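One way to make the part-measurement check concrete is to compare the part tolerance to the sensor accuracy as a ratio. The Python sketch below does that. The required ratio of 10:1 is a common metrology rule of thumb assumed here for illustration (4:1 is also widely quoted); it is not a figure from a specific standard, and the tolerance and sensor values are hypothetical.

```python
def accuracy_ratio(tolerance, sensor_accuracy):
    """Ratio of part tolerance (±) to sensor accuracy (±), in the same units."""
    if sensor_accuracy <= 0:
        raise ValueError("sensor_accuracy must be positive")
    return tolerance / sensor_accuracy


def is_adequate(tolerance, sensor_accuracy, required_ratio=10.0):
    """True if the sensor is accurate enough under the assumed rule of thumb."""
    return accuracy_ratio(tolerance, sensor_accuracy) >= required_ratio


part_tolerance_mm = 0.05  # hypothetical part tolerance, ±0.05 mm

for sensor_mm in (0.002, 0.02):  # two hypothetical sensors
    verdict = "adequate" if is_adequate(part_tolerance_mm, sensor_mm) else "too coarse"
    print(f"sensor ±{sensor_mm} mm: {verdict}")
```

The same comparison works in the other direction: a sensor far more accurate than the ratio requires may be costing more than the application justifies.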

3. Resolution

Closely related to accuracy, data resolution describes how finely a recording device can read the sensor data. Data point size, for example, can make the difference between usable data and junk. The sensor might be accurately detecting the data, but if the controller can't read it at that accuracy over the entire fluctuation range, it doesn't matter. The controller needs to read data at the accuracy needed over the full range of anticipated values. Application knowledge will answer this and help find the controller resolution necessary to provide useful data.
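For an analog input, the arithmetic behind resolution is short. The Python sketch below assumes an ideal converter that divides the input span into 2^n equal steps, where n is the number of bits; real hardware adds noise and nonlinearity on top of this. The 0 to 200°C span and the 0.1°C target are hypothetical.

```python
import math


def step_size(span, bits):
    """Smallest change an ideal converter with `bits` bits resolves over `span`."""
    return span / (2 ** bits)


def bits_needed(span, required_step):
    """Fewest bits whose step size is no larger than required_step."""
    return math.ceil(math.log2(span / required_step))


span_degc = 200  # hypothetical input range: 0 to 200°C

print("8-bit step (°C):", step_size(span_degc, 8))
print("12-bit step (°C):", step_size(span_degc, 12))
print("bits needed for a 0.1°C step:", bits_needed(span_degc, 0.1))
```

With these numbers, an 8-bit reading moves in steps of about 0.78°C, so a sensor accurate to 0.1°C would be wasted on it; at least 11 bits are needed to preserve that accuracy across the whole span.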

4. Synchronized data

Some data points collected might need to be tightly synchronized to other data points. This can be especially important in high-speed data collection. Consider a monitored vibration value that corresponds to the position of an actuator. It might be good to know the position of the actuator that corresponds to a specific out-of-tolerance vibration reading during a machine cycle.

For this measurement to be accurate, the position and vibration data must be collected so they correspond to each other in a way that reflects the actual behavior in time. This could be accomplished by having one controller read both values at the same time, or by time-syncing two controllers so the time-stamps of both sets of data are synchronized.

In this case, if data sets cannot accurately be correlated to one another in a repeatable fashion, an accurate measurement of system behavior won’t be possible. 
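When two streams share a synchronized clock, matching them up is straightforward. The Python sketch below finds the actuator position at the moment of each out-of-tolerance vibration reading by interpolating between time-stamped position samples. The sample data, the millisecond timestamps, the vibration limit, and the function name position_at are all hypothetical, and the approach only holds if both sets of time-stamps really are synchronized.

```python
from bisect import bisect_left


def position_at(t, pos_times, pos_values):
    """Linearly interpolate actuator position at time t.

    pos_times must be sorted ascending and share a clock with t.
    """
    if not pos_times[0] <= t <= pos_times[-1]:
        raise ValueError("timestamp outside the recorded position range")
    i = bisect_left(pos_times, t)
    if pos_times[i] == t:
        return pos_values[i]
    t0, t1 = pos_times[i - 1], pos_times[i]
    v0, v1 = pos_values[i - 1], pos_values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical position samples: timestamp (ms), position (mm).
pos_times_ms = [0, 10, 20, 30]
pos_values_mm = [0.0, 5.0, 10.0, 15.0]

# Hypothetical vibration samples: timestamp (ms), amplitude (arbitrary units).
vibration = [(4, 0.8), (15, 3.9), (26, 1.1)]
VIBRATION_LIMIT = 3.0  # assumed out-of-tolerance threshold

flagged = [
    (t, amp, position_at(t, pos_times_ms, pos_values_mm))
    for t, amp in vibration
    if amp > VIBRATION_LIMIT
]
print(flagged)
```

Here the single reading above the limit, taken at 15 ms, maps to an actuator position of 7.5 mm. If the two controllers' clocks drifted by even a few milliseconds, the reported position would shift accordingly, which is the repeatability problem described above.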

5. Application knowledge

There’s a theme in the points above. Application knowledge is the key to collecting the right data and turning it into actionable data. From the examples above, it’s easy to imagine different ways to collect data that can lead to misunderstandings. But even if data is accurate and represents the real-world conditions, is it relevant? 

Consult with experts

It would be very rare for one person or group to understand the inner workings of every component of a machine from a design perspective. More likely, technologies such as hydraulics, pneumatics, electric motors, and actuators are combined to create one machine. Simply applying sensors to collect historical data on the machine may not produce results that make sense. Consulting with experts on those components can lead to an answer faster and help identify what data to look for and how to look for it. With a little thought, some simple data collection practices, and some application knowledge, the data system can start producing usable insights.

Allen Tubbs is product manager and Benjamin Menz is data scientist, both with Bosch Rexroth Corp. Edited by Mark T. Hoske, content manager, Control Engineering, CFE Media.

KEYWORDS: Data acquisition, data quality

Data quality relies on frequency, accuracy, and resolution.

Data synchronization helps with data quality.

Application knowledge can help data quality.


Is your "Aha" moment based on actual or erroneous data?