Managing manufacturing information with data cleansing

Implementing data science and analytics techniques by using data cleansing can help companies in areas such as quality, efficiency and cost reduction.

By Matt Coleman August 22, 2022
Image courtesy: Brett Sayles

Analytics Insights

  • Data cleansing transforms data that already exists into a more consumable and approachable format that doesn’t overwhelm the user.
  • This is a critical step in shaping and molding the data so it can be used in modeling for everything from analytics to predictive maintenance and a lot more.

The manufacturing process generates so much data that much of it is overlooked. When we think of manufacturing, we often think about the tangible production of a product as the primary goal. While this is certainly true, many steps of the production process generate valuable data as a byproduct. Areas within manufacturing like procurement, operations, and engineering all generate an abundance of consumable data waiting to be used. Implementing data science and analytics techniques with data cleansing can help companies in areas like quality, efficiency and cost reduction.

Most data science and analytics implementations follow the same road map, starting with identifying a problem that needs to be solved. For example, do you have issues with failing equipment on your line, and do you want to predict the failure before it happens? Is your process producing more scrap than you want? Do you need a real-time dashboard to track overall equipment effectiveness (OEE)?

Once a potential improvement is identified, the next step is to collect relevant raw data, meaning data that has not yet been processed for use. There are times when this data is readily available, but there are also situations where we must track the information down or start collecting it from scratch. Some examples of raw data sources within manufacturing would be a data historian or a repository of data from an enterprise resource planning (ERP) system or manufacturing execution system (MES).

Benefits of data cleansing

While raw data can undoubtedly be helpful, there’s often an extra processing step called data cleansing that transforms the data into a more consumable and approachable format. Some data cleansing steps include aggregation, transformation, and removal of missing data. The main goal of data cleansing is to prepare the data so it can be easily consumed by a predictive model or analytics package. Another advantage of data cleansing is size reduction, which speeds up computation and lowers data storage costs.
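The three cleansing steps named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the readings, the Fahrenheit-to-Celsius conversion, and the hourly averaging window are hypothetical and do not come from any particular historian, ERP, or MES.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical raw historian rows: (ISO timestamp, temperature in deg F).
# None marks a reading that was never recorded.
RAW_READINGS = [
    ("2022-08-22T08:05:00", 212.0),
    ("2022-08-22T08:35:00", None),
    ("2022-08-22T08:50:00", 230.0),
    ("2022-08-22T09:10:00", 248.0),
]


def cleanse(rows):
    """Drop missing readings, convert deg F to deg C, and average per hour."""
    by_hour = defaultdict(list)
    for stamp, temp_f in rows:
        if temp_f is None:
            continue  # removal of missing data
        temp_c = (temp_f - 32.0) * 5.0 / 9.0  # transformation to a common unit
        hour = datetime.fromisoformat(stamp).replace(minute=0, second=0)
        by_hour[hour].append(temp_c)
    # Aggregation: one averaged value per hour instead of many raw rows.
    return {hour: round(mean(vals), 2) for hour, vals in sorted(by_hour.items())}


hourly = cleanse(RAW_READINGS)
for hour, avg_c in hourly.items():
    print(hour.isoformat(), avg_c)
```

Four raw rows collapse into two hourly values here, which is the size reduction described above on a very small scale.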

The data cleansing process can be time-consuming, but it’s a critical step in preparing the data for modeling. The modeling step returns to the original question of what problem we are trying to solve. Many different predictive modeling techniques can be used depending on the outcome you want. For example, identifying the probability of a failure is a binary classification problem. Using image classification to look for defects in a product might call for a machine learning algorithm like a convolutional neural network (CNN). The modeling step of the data science process is where the magic happens.
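To make the binary classification example concrete, here is a minimal sketch of one possible technique, logistic regression fitted by gradient descent. The vibration values, failure labels, learning rate, and epoch count are all hypothetical, and a real project would more likely use an established machine learning library than hand-written training code.

```python
import math

# Hypothetical cleansed history: (vibration in mm/s, 1 if a failure followed, else 0).
HISTORY = [
    (1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0),
    (4.0, 1), (4.5, 1), (5.0, 1), (5.5, 1),
]


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, learning_rate=0.1, epochs=5000):
    """Fit a weight and bias by batch gradient descent on the log loss."""
    weight, bias = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in rows:
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def failure_probability(x, weight, bias):
    """Estimated probability that a machine with reading x is about to fail."""
    return sigmoid(weight * x + bias)


weight, bias = fit_logistic(HISTORY)
low_risk = failure_probability(1.5, weight, bias)
high_risk = failure_probability(5.0, weight, bias)
print(f"P(failure | 1.5 mm/s) = {low_risk:.3f}")
print(f"P(failure | 5.0 mm/s) = {high_risk:.3f}")
```

The model returns a probability rather than a yes or no answer, so the threshold for raising a maintenance alert remains a business decision.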

The final stage of a manufacturing data science implementation is to present findings. Utilizing business intelligence (BI) dashboards, which consolidate results into a single concise view of helpful information, is the best way to execute this. These dashboards can be descriptive, meaning they look at historical data, but they can also look at data in real time. The information displayed on the BI dashboard often focuses on specific metrics related to the problem we are trying to solve or the improvement we want to make. The main advantage of real-time dashboarding is the ability to react to an event as it occurs, often resulting in cost savings and quality improvements.
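As an example of a dashboard metric, OEE is commonly calculated as availability multiplied by performance multiplied by quality. The sketch below assumes that common definition; the function name and the shift figures are hypothetical.

```python
def oee(planned_minutes, downtime_minutes, ideal_cycle_seconds, total_count, good_count):
    """Return (availability, performance, quality, oee) as fractions of 1."""
    run_minutes = planned_minutes - downtime_minutes
    # Availability: share of planned time the line was actually running.
    availability = run_minutes / planned_minutes
    # Performance: actual output compared with the ideal output for the run time.
    performance = (ideal_cycle_seconds * total_count) / (run_minutes * 60)
    # Quality: share of produced parts that were good.
    quality = good_count / total_count
    return availability, performance, quality, availability * performance * quality


# Hypothetical shift: 480 planned minutes, 60 minutes down,
# 1-second ideal cycle, 20,160 parts made, 19,152 of them good.
availability, performance, quality, score = oee(480, 60, 1.0, 20160, 19152)
print(f"Availability {availability:.1%}, performance {performance:.1%}, "
      f"quality {quality:.1%}, OEE {score:.1%}")
```

Showing the three factors next to the combined score helps a dashboard user see whether a drop comes from downtime, slow cycles, or scrap.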

Those are the steps we take to use analytics to unlock the volumes of manufacturing data. It is a structured process, but the resulting insight and actionable ideas are well worth the effort.

– This originally appeared on Avanceon’s website. Avanceon is a CFE Media and Technology content partner. Edited by Chris Vavra, web content manager, Control Engineering, CFE Media and Technology.

Original content can be found at Avanceon.

Author Bio: Matt Coleman, analytics practice lead, Avanceon