Providing ‘One Version of Truth’ Improves User Confidence


By Bradley Klenz, SAS Institute January 1, 2001
  • Software and information integration

  • Data acquisition

  • Data warehousing

  • Enterprise resource planning

  • Information systems

  • Manufacturing execution systems

  • Quality assurance


Automation makes it easy to collect data, but is it the right data? Have key metrics been defined, and are those metrics being monitored? Is reducing process variability the goal? Is data actually analyzed to gain information, or is it simply taking up a hefty chunk of storage space?

Data accessibility is a major obstacle that must be overcome if the continuous improvement journey to achieve Six Sigma quality across the enterprise is to be successful. If data is to be converted into the information on which sound business decisions are made, then data collection and exploitation strategies must be responsive, reliable, and efficient.

A major part of overcoming the accessibility obstacle is understanding the functional requirements: how, when, where, and by whom the data will be used. Without that understanding, the physical data warehouse is unlikely to meet user needs. For production operations, establishing a quality data warehouse (QDW) is essential to delivering Six Sigma products and services.

Working together, process engineers and IT representatives define the functional requirements of quality-related data to ensure the physical implementation of QDW supports defined business objectives. (See related articles in this issue.)

Implementation of a quality data warehouse requires three levels of capability:

  • Networks, hardware, and software to connect, collect, validate, and store data;

  • Production and operational transactional systems to generate data; and

  • Decision support systems comprised of analytical and knowledge management applications to aid making timely business decisions.

Technology isn’t the challenge

At the decision support level, process engineers apply their experience and judgment to reduce process variability using quantitative information. But the decision support level only performs when the first two levels are firmly in place.

Inefficient or insufficient data collection, data stored but not used, data not collected or collected using the wrong time frequency, and data existing in disparate systems are all examples of how costs can be added without improving the return on the investment. For example, maintenance and repair service databases have been around much longer than supply chain management databases and thus may not be easily compiled at the decision support level.

The networks, hardware, and software must connect production and operation transactional systems to support accurate and timely information for making knowledgeable business decisions.

The level within the organization where business decisions are made also affects the definition of a “real-time” data warehouse. For example, production decisions made by operators that affect product quality require data to be collected, analyzed, and presented in the millisecond-to-minute time domain. An operator at the final distribution point of the enterprise requires quality-related data collected, analyzed, and presented at a completely different aggregation and time domain.
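
As an illustrative sketch (names and numbers are invented, not from the article), the same stream of quality readings can be served in both time domains by changing only the bucketing rule:

```python
# Hypothetical sketch: one stream of quality measurements aggregated two
# ways -- per-minute buckets for a line operator, daily rollups for a
# distribution-level user.
from collections import defaultdict
from datetime import datetime
from statistics import mean

def aggregate(readings, bucket):
    """Group (timestamp, value) pairs by a bucketing key and average each bucket."""
    groups = defaultdict(list)
    for ts, value in readings:
        groups[bucket(ts)].append(value)
    return {key: mean(vals) for key, vals in groups.items()}

readings = [
    (datetime(2001, 1, 1, 8, 0, 12), 10.2),
    (datetime(2001, 1, 1, 8, 0, 45), 10.6),
    (datetime(2001, 1, 1, 8, 1, 30), 9.8),
]

# Operator view: minute-level averages
by_minute = aggregate(readings, lambda t: t.replace(second=0, microsecond=0))
# Distribution view: one number per day
by_day = aggregate(readings, lambda t: t.date())
```

The same raw data supports both views; only the aggregation key differs, which is exactly the requirement the QDW design must capture.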

What this means to the IT representatives developing the QDW is a need to understand the requirements of all data users in terms of:

  • How complete are the connections between departments and domains?

  • What disconnects might interrupt the flow of data into informational systems?

  • What aggregation of data is required? and

  • What is the impact on data integrity when data moves across department and division boundaries that use different data aggregations and time domains?

Collecting and accessing data represents its own process and requires having metrics in place to quantify the quality of data access and collection activities. Simply stated, a data warehouse provides:

  • Means to access metrics;

  • Data structure designed for analysis; and

  • Links to the many operational and production systems within an organization.

Operational systems will not…

Much of the data needed for quality analysis resides in the many operational systems a company uses. For example, determining the quantities of and reasons why raw materials were returned to a supplier requires accessing the materials management section of an enterprise resource planning (ERP) system.

To learn what tests were conducted and what parameters failed, causing the raw materials to be returned to the supplier, might require access to the laboratory information management system (LIMS). Identifying which quality engineer conducted the test and decided not to use the material may require accessing the quality management section of the ERP system. This example illustrates how decision making requires collecting subject-related data from different sources, then presenting that data in an appropriate format (See Relating Architectural Structures diagram).
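
A minimal sketch of that kind of subject-oriented join, using invented table and column names to stand in for the ERP and LIMS sources:

```python
# Illustrative only: joining the materials-management view of an ERP with
# LIMS test results to answer "why was this lot returned?". The schema is
# hypothetical; real systems would be accessed through their own interfaces.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE erp_returns (lot_id TEXT, supplier TEXT, qty_returned INTEGER);
    CREATE TABLE lims_tests (lot_id TEXT, test_name TEXT, result TEXT);
    INSERT INTO erp_returns VALUES ('L-1001', 'Acme', 40);
    INSERT INTO lims_tests  VALUES ('L-1001', 'moisture', 'FAIL');
    INSERT INTO lims_tests  VALUES ('L-1001', 'viscosity', 'PASS');
""")

# One subject-oriented question answered from two operational systems
rows = con.execute("""
    SELECT r.lot_id, r.supplier, r.qty_returned, t.test_name
    FROM erp_returns r
    JOIN lims_tests t ON t.lot_id = r.lot_id
    WHERE t.result = 'FAIL'
""").fetchall()
```

In a warehouse, this join is done once during loading, so analysts query one consistent structure instead of two live systems.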

Operational systems handle the day-to-day workings of the business. They provide current snapshots of what is happening: what materials are available and where, what products are currently in production, what products customers have ordered, and the production schedule needed to fill those orders.

What operational systems will not do is analyze data to allow higher-level decisions to be made. For example, supplier quality analysis requires determining:

  • Has the supplier’s variability changed significantly in the last six months?

  • Have changes affected production schedules?

  • Does this supplier’s product quality differ significantly from that of other suppliers? and

  • How are suppliers affecting cost of poor quality metrics?
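
The first question above can be sketched as a comparison of sample standard deviations over two periods. The data and threshold here are invented, and a real analysis would use a proper statistical test (an F-test, for example):

```python
# Hypothetical sketch: has a supplier's variability changed between two
# six-month periods? We compare sample standard deviations directly.
from statistics import stdev

first_half  = [10.1, 10.0, 9.9, 10.2, 10.0, 9.8]
second_half = [10.5, 9.4, 10.9, 9.1, 10.8, 9.3]

ratio = stdev(second_half) / stdev(first_half)
flagged = ratio > 2.0   # illustrative threshold, not a statistical test
```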

Divide and conquer

Operational systems are designed and optimized to handle the transactions of running the business. For example, when a production run ends and a new product is to begin, the operator needs instant access to the production schedule and the required settings for the new setup. This need differs from that of the quality engineer who wants to determine how materials from various suppliers have performed using the initial setup parameters.

Such an analysis draws data from a longer time span than the daily production schedule and accesses subject-oriented data from a variety of locations within the organization. Since operational systems are not optimized for these query types, getting the results would be difficult. Because different departments or divisions making the same products may use different operational systems, data from different sources must somehow be combined. Constant querying of the operational systems can also cause performance degradation and adversely affect production. A data warehouse methodology recognizes the need to approach operational needs and decision support differently.

Because the data warehouse exists separately from the operational systems, a process must be created to populate it. Failing to automate this data flow is a common cause of failure. Successful data warehouses use a process-based administrative system to move data from the operational systems to the data warehouse, allowing the load process to be scheduled to run when the operational systems can accommodate it.

Data warehousing provides several benefits to an organization’s continuous improvement process. First, metadata (data about data) are stored in the warehouse, enabling improvement project leaders to know what data has been collected.
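
That load process might be sketched as an extract/transform/load pipeline. The record layout and validation rule below are assumptions for illustration; in practice the run would be scheduled off-peak by the administrative system:

```python
# Hedged sketch of a warehouse load: extract from an operational source,
# validate and clean, then append to the warehouse. All names are invented.

def extract(source):
    # In practice this would query the operational system; here it is a list.
    return list(source)

def transform(records):
    # Validation + cleaning: drop rows missing a reading, normalize types.
    clean = []
    for rec in records:
        if rec.get("value") is None:
            continue
        clean.append({"lot": rec["lot"], "value": float(rec["value"])})
    return clean

def load(warehouse, records):
    warehouse.extend(records)

warehouse = []
source = [{"lot": "A", "value": "10.5"}, {"lot": "B", "value": None}]
load(warehouse, transform(extract(source)))   # scheduled off-peak in practice
```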

Second, data within the warehouse undergoes the extraction, transformation, and loading process assuring that clean data is ready for analytical use by those needing data. Business rules can also be associated with the data providing such benefits as automatic subgrouping of data and classification of assignable causes. The result of process analysis projects can be added to the warehouse to allow process improvements to become part of standard operating procedures.

Continuous improvement inputs come from a wide variety of sources, including suppliers. Tightly integrating a supplier requires understanding the full impact of that supplier’s data on your own systems, and just as with getting data into the decision support systems, the tighter the integration, the greater the complexity.

Information delivery

A data warehouse delivers information across the enterprise with “one version of the truth.” This allows meaningful comparisons among plants, production lines, and products. Data is transformed into information meaningful for all decision-making levels in the company. For the IT staff, data exists in a clean, consistent, and documented format. For the process engineer, data is convenient, in a common format, and if desired, exportable to other common formats.

With data from each stage of the production process readily available, it becomes possible to explore relationships up and down production lines, such as determining how variation in earlier stages of the process affects later stages. This can lead to scrapping or reworking materials much earlier in the production process and may identify changes that avoid the need to rework. It also becomes easier to identify the influence of critical performance factors.
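
With stage data in one place, such an upstream/downstream relationship can be checked directly. This sketch computes a Pearson correlation between an invented upstream measurement and a downstream defect count:

```python
# Illustrative sketch: does upstream moisture track downstream defects?
# Stage names and numbers are hypothetical.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

upstream_moisture  = [12.1, 11.8, 12.5, 13.0, 12.2, 11.9]
downstream_defects = [3, 2, 5, 7, 4, 2]

r = pearson(upstream_moisture, downstream_defects)
```

A strong correlation flags a candidate relationship to investigate; it would not by itself establish cause.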

The data warehouse enables guided and informed decisions to be made by production line workers and supervisors. Historical data can provide a short list of previous problems that are consistent with current operating conditions. Not only are problems known, but also process engineers and line operators can view corrective actions previously taken and results obtained.

Data and information marts

The data warehouse is the foundation of analytical decision support. From the data warehouse, a number of data and information marts can be created and populated to support specialized analytical needs.

A data mart contains data files of clean data in an efficient, ready-for-analysis format. Any business rules, such as how to assign materials to production batches, have been applied in creating the data file.
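
One such business rule, assigning raw-material lots to the production batch that consumed them, could be sketched like this (the time-window rule and field names are hypothetical):

```python
# Sketch of a business rule applied while building a data mart: a lot
# belongs to the batch that was running when it was issued to the line.
from datetime import datetime

batches = [
    {"batch": "B1", "start": datetime(2001, 1, 1, 8),  "end": datetime(2001, 1, 1, 12)},
    {"batch": "B2", "start": datetime(2001, 1, 1, 12), "end": datetime(2001, 1, 1, 16)},
]

def assign_batch(issue_time):
    """Apply the time-window rule; return None if no batch was running."""
    for b in batches:
        if b["start"] <= issue_time < b["end"]:
            return b["batch"]
    return None

mart_row = {"lot": "L-7", "batch": assign_batch(datetime(2001, 1, 1, 13, 30))}
```

Applying the rule once, during mart creation, keeps every analyst's batch assignments consistent.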

An information mart contains optimized, ready for analysis data, reports, charts, and user interfaces developed specifically for the mart’s user community. In some cases, a custom “fat client” interface is written to navigate the information mart and enable users to act on their decisions from within the interface.

To better understand data and information marts, consider:

Comparing results among different plants: A food processing company has multiple production lines that allow products from several different recipes to be produced, based on current demand. Variations in ingredients for different recipes affect settings, such as flow rate and dryer temperature. Using a web browser interface, an information mart could allow shop-floor personnel, at any plant, to review past production runs with similar characteristics.

Electrical test data analysis: In semiconductor and electrical component production, electrical testers produce a large volume of data for each component tested. Engineers need a central database containing the test procedures used, test results, and tag data that identifies lot, wafer, and die information: a perfect data mart application, where the user interface allows subset creation for desired tag values.

Support for Six Sigma projects: To prevent duplication of effort, Six Sigma project leaders need to know what relevant improvement data is being collected, how clean the data is, and how it is formatted. Data warehouse data has already been cleaned and appropriate business rules, such as grouping of assignable cause terms, have been applied. The data warehouse enables the project leader to export data to whatever analysis environment is best suited. The results of Six Sigma projects can be added to the data warehouse to allow process improvements to become part of standard operating procedures.

Other data and information mart possibilities include enterprise-level SPC (statistical process control), cost of poor quality analysis, total cost of manufacturing, and yield analysis.
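
As a sketch of the enterprise-level SPC possibility, three-sigma X-bar control limits can be computed from subgroup data using the textbook A2 constant for subgroups of five (the measurements are invented):

```python
# Hypothetical sketch: X-bar chart center line and control limits from the
# subgroup-range method, with the standard A2 constant for subgroup size 5.
from statistics import mean

subgroups = [
    [10.1, 10.0, 9.9, 10.2, 9.8],
    [10.0, 10.1, 10.0, 9.9, 10.0],
    [9.9, 10.2, 10.1, 10.0, 9.8],
]

A2 = 0.577  # standard control-chart constant for subgroups of n = 5
xbars  = [mean(s) for s in subgroups]          # subgroup means
ranges = [max(s) - min(s) for s in subgroups]  # subgroup ranges

center = mean(xbars)
ucl = center + A2 * mean(ranges)   # upper control limit
lcl = center - A2 * mean(ranges)   # lower control limit
```

Computed once in the warehouse, the same limits can be applied consistently across plants and production lines.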

Sources of data

Transactional systems provide the majority of data for a process analysis data warehouse. The formats of data in the respective systems are (and should be) tailored to the transactional needs addressed by each particular system. Example sources for a process analysis data warehouse include:

  • Process measurement data from statistical process control systems, including out-of-control cause indications and corrective actions taken;

  • Production scheduling data from material or enterprise resource planning systems;

  • Material data from supply chain systems, including vendor-supplied quality characteristics;

  • Production execution data from manufacturing execution systems;

  • Quality assurance lab data from laboratory information management systems;

  • Customer data from call centers; and

  • Warranty related data.

In striving for continuous process improvement and Six Sigma quality, there is an easier way for process engineers: partner with IT to jointly develop a subject-oriented process analysis data warehouse that serves the decision-making needs of process engineers.


Author Information
Bradley W. Klenz is practice manager at SAS Institute. His responsibilities include the development of methodologies and integration of technology for business process management solutions.

Forewarned is forearmed!

Plenty has been written about the benefits of building data warehouses, but it’s not as easy as it sounds. Here are a few of the more common “gotchas.”

Extracting, cleaning, and loading data can take as much as 80% of construction time.

Building the data warehouse will reveal undetected problems in transaction processing systems. For example, the data warehouse could be set up to collect increasing levels of product-oriented data that is not consistently included in data received from the transactional system.

Data will be required that doesn’t exist. For example, a sales reporting data warehouse could require off-invoice adjustments not recorded in the order entry system.

Data inconsistencies will be much greater than anticipated. For example, a customer names database may contain GE and General Electric.
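
One common way to tame such inconsistencies is a synonym table applied during cleaning. The mapping below is illustrative; in practice it must come from the business, not from code:

```python
# Sketch: map known name variants onto a canonical customer name during
# the cleaning step. The synonym table here is invented for illustration.

CANONICAL = {
    "ge": "General Electric",
    "general electric": "General Electric",
    "general electric co.": "General Electric",
}

def canonical_name(raw):
    """Return the canonical form if known; otherwise the trimmed input."""
    key = raw.strip().lower()
    return CANONICAL.get(key, raw.strip())

names = ["GE", "General Electric", "Acme Corp"]
cleaned = [canonical_name(n) for n in names]
```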

Many users will be trained in using the data warehouse, but few will apply what they learn.

Users will develop conflicting business rules. For example, if you summarize beverage sales by flavor and the flavor category includes both cherry and cola, a cherry cola brand could end up classified in different categories.

Data warehouse accessibility is a paradox: the more accessible the data, the more users will want to use it, and the harder it is to enforce data security.

Excerpted from material located at