Four ways spreadsheets limit data analytics

Tools needed for data cleansing, visualization, contextualization, and modeling.

04/30/2018


Figure 1: Time is a key element when evaluating process data, so signal readings must often be reformatted to align information for use in spreadsheets. Courtesy: Seeq Corp.

Process industry firms have collected manufacturing data for decades. With each step-change advance in hardware and software, organizations generate and collect more data characterizing process conditions, supply-chain metrics, and other production aspects.

Nevertheless, companies struggle to convert the data they collect into useful information and insights that improve the reliability, safety, and profitability of process units, plants, and businesses. As data volumes grow, the challenges intensify.

An industrial revolution, driven by the Industrial Internet of Things (IIoT), is unfolding based on advanced computerization, sensor proliferation, and wireless technologies. It is dramatically expanding the types and volumes of data to store and analyze, and it requires a better analytics approach.

Historically, process manufacturers have used spreadsheets to organize data collected in tabular form. Originally meant for accounting and finance, spreadsheets were never a great match for large volumes of time-series data. They did, however, allow software-enabled formula building, as well as calculations across multiple sheets.

Engineers therefore adopted spreadsheets for data analytics projects, which made those projects labor- and time-intensive. Sharing results and collaborating with others was also difficult with spreadsheets. As companies amassed more data, they struggled to find efficient ways to share data-driven insights within the organization.

Advanced analytics software is the means to overcome these challenges and barriers. To better understand these advances, let's look at four spreadsheet limitations, as well as how each is addressed by analytics solutions. 

Volumes of data

Process manufacturing and monitoring systems produce massive amounts of data that collectively characterize process conditions, operation/product flows, and equipment condition. Data related to control systems are generated in varying forms. The general approach is to assemble all data related to an investigation into a spreadsheet, and then do the analysis. The sheer volume of data collected from multiple sources quickly erodes the ability to conduct effective analyses.

Before doing analytics, data must be sorted and cleansed, and the number of data points in the spreadsheet reduced. Instrumentation signals are reformatted to fit the spreadsheet column/row paradigm, as illustrated in Figure 1. The stated limit for a Microsoft Excel worksheet is about one million rows (1,048,576). A common process-system sensor sampling frequency is once per minute, which equates to roughly a half million rows (525,600) in Excel per year for a single signal. If the sampling frequency is once every 30 seconds, or if the user wants to review two years of data, the row count exceeds the limit, and it is impossible to look at all the data at full resolution in one sheet.
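The arithmetic behind these figures can be checked with a short sketch. It assumes a single signal, a 365-day year, and one row per sample; the function name `rows_needed` is only illustrative.

```python
# Worksheet row limit documented for current Excel versions.
EXCEL_MAX_ROWS = 1_048_576


def rows_needed(sample_interval_s: int, years: float) -> int:
    """Rows required for one signal sampled at a fixed interval.

    Assumes 365-day years and one spreadsheet row per sample.
    """
    seconds = int(years * 365 * 24 * 3600)
    return seconds // sample_interval_s


# The three cases discussed in the text: (interval in seconds, years).
for interval_s, years in [(60, 1), (30, 1), (60, 2)]:
    rows = rows_needed(interval_s, years)
    verdict = "fits" if rows <= EXCEL_MAX_ROWS else "exceeds the limit"
    print(f"{interval_s} s sampling over {years} yr: {rows:,} rows ({verdict})")
```

One year at one-minute sampling uses about half the available rows; halving the interval or doubling the time span pushes the count just past the limit.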

In addition, files that stretch the limits of spreadsheet capacity will experience performance issues. Layering in multiple sets of data and calculations, having numerous large files open at once, and linking to other applications and macros all hinder spreadsheet usability, yet an engineer's or scientist's process data workflow commonly requires every one of them. With spreadsheets, users must make concessions on the type and sampling of data segments.

Data isolation

Figure 2: Identifying and sharing data-driven insights derived from spreadsheet analysis is a labor- and time-intensive process. Courtesy: Seeq Corp.

While related to volume limits, data isolation is a separate issue. For example, each time a team member accesses process data, they first download it into a separate and duplicative file. This is a one-time snapshot extraction. If the data changes or updates, then the query must be redone. This can have ramifications for subsequent calculations, cleansing, and insights. Large files are difficult to share across an organization and keep in sync, especially if multiple users are viewing the same data sets and sources.
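A minimal sketch of why a one-time snapshot goes out of date follows. It assumes each extract records the timestamp of its newest sample and that the source system can report its own newest timestamp; the function `is_stale` and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def is_stale(snapshot_latest: datetime,
             source_latest: datetime,
             tolerance: timedelta = timedelta(0)) -> bool:
    """Return True when the source holds data newer than the snapshot.

    `tolerance` is an illustrative allowance for acceptable lag.
    """
    return source_latest - snapshot_latest > tolerance


# Hypothetical timestamps: the extract was taken at 08:00,
# and the source has since recorded data up to 12:30.
snapshot_latest = datetime(2018, 4, 1, 8, 0)
source_latest = datetime(2018, 4, 1, 12, 30)

print(is_stale(snapshot_latest, source_latest))                      # True
print(is_stale(snapshot_latest, source_latest, timedelta(hours=6)))  # False
```

A check like this only catches newly appended data. It does not detect historical values that were corrected in the source after the extract, which is another way duplicated files drift apart.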

Given IIoT and the cloud, creation of more and larger databases is a continuing trend. In addition, not all of the data, databases, and users are in one location. Remote databases and users further complicate the task of getting the proper data to users.

Once the relevant data is assembled in a spreadsheet, how do users find data-driven insights? Engineers are most interested in how data behaves over time and in relation to other system elements. For example, temperature, pressure, feedstock quality, and conversion rate all traverse time and have processing relationships.
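Relating signals to one another usually means first placing them on a common time base, since sensors rarely sample at the same instants. The sketch below shows one simple way to do that, carrying each signal's last known value forward onto a shared grid. The signal values, the grid, and the function `align` are illustrative assumptions, not a method prescribed by the article.

```python
from bisect import bisect_right


def align(signal, grid):
    """Last known value of `signal` at each grid time.

    signal: list of (time_in_seconds, value) pairs sorted by time.
    Returns None for grid times before the first sample.
    """
    times = [t for t, _ in signal]
    aligned = []
    for g in grid:
        i = bisect_right(times, g)  # count of samples at or before g
        aligned.append(signal[i - 1][1] if i else None)
    return aligned


# Hypothetical readings sampled at different rates.
temperature = [(0, 150.0), (60, 151.2), (120, 152.1)]
pressure = [(0, 10.0), (30, 10.4), (90, 10.1), (120, 10.3)]

grid = [0, 60, 120]  # common one-minute time base
rows = list(zip(grid, align(temperature, grid), align(pressure, grid)))
for row in rows:
    print(row)
```

Each resulting row is what a spreadsheet would hold: one timestamp and one column per signal. Note that the pressure reading at 90 seconds never appears on the one-minute grid, which is the kind of sampling concession described earlier.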

As in any analysis, the user must first identify the process points of high interest, such as optimum steady-state conditions, critical-equipment vibration trends, shutdowns, emissions events, and other parameters. Time is a factor for each. Engineers analyze data aggregated across shifts, weeks, months, or years to identify trends and root causes.
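Aggregating over a time period can be sketched in a few lines. The example below averages readings by ISO calendar week; the readings and the function `weekly_means` are hypothetical, and the same pattern would apply to shifts, months, or years by changing the bucket key.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def weekly_means(readings):
    """Average value per (ISO year, ISO week) bucket.

    readings: iterable of (datetime, value) pairs.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        iso = ts.isocalendar()
        buckets[(iso[0], iso[1])].append(value)
    return {week: mean(values) for week, values in sorted(buckets.items())}


# Hypothetical vibration readings.
readings = [
    (datetime(2018, 4, 2, 6, 0), 2.0),  # Monday, ISO week 14
    (datetime(2018, 4, 5, 6, 0), 4.0),  # Thursday, same week
    (datetime(2018, 4, 9, 6, 0), 9.0),  # following Monday, ISO week 15
]

print(weekly_means(readings))
```

Comparing one bucket with the next is a simple way to surface a trend, such as the jump between the two weeks in this made-up data.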

To do this in a spreadsheet, users sort columns and rows to identify data points to consider. This sorting/cleansing is done systematically with spreadsheet functions, but seven of the 10 most-used functions Microsoft lists for Excel are for data wrangling rather than data analytics, which is where value is delivered.

Data manipulation accounts for 50% to 90% of the time spent developing spreadsheet applications, as illustrated in Figure 2. Spreadsheet algorithms can sort and slice data, but the manipulation and calculation steps are not transparent, and they can be difficult to remember and share with colleagues.

For instance, in a monthly unit report or a quarterly emissions assessment, the data must be re-queried, and any manual elements must be reproduced or automated with macros. If the analysis is done infrequently, or by a different person, then it can take significant time to learn or re-learn the spreadsheet data machinations. Some teams have separate documentation to describe workflows, but the lack of transparency in developing macros hinders reproduction of any analysis. 


