How to use human and artificial intelligence with digital twins

Industrial Internet of Things (IIoT), artificial intelligence (AI) and user interface technologies such as augmented reality and virtual reality can help the form and function of digital twins improve training, operations and outcomes.

By Michael Thomas, Brad Klenz and Prairie Rose Goodwin October 13, 2020


Learning Objectives

  • Digital twins get help from augmented reality and artificial intelligence. 
  • Data visualization tools date back to the 1700s; AR and VR are modern extensions.
  • Visualization and reporting can be enhanced with digital twins. 

Human intelligence has been creating and maintaining complex systems since the beginnings of civilization. In modern times, digital twins have emerged to aid operations of complex systems, as well as improve design and production. Artificial intelligence (AI) and extended reality (XR) – including augmented reality (AR) and virtual reality (VR) – have emerged as tools that can help manage operations for complex systems. Digital twins can be enhanced with AI, and emerging user interface (UI) technologies like XR can improve people’s abilities to manage complex systems via digital twins.

Digital twins can marry human intelligence and AI to produce something far greater by creating a usable representation of complex systems. End users do not need to worry about the formulas that go into machine learning (ML), predictive modeling and artificially intelligent systems, but can still capitalize on their power as an extension of their own knowledge and abilities. Digital twins combined with AR, VR and related technologies provide a framework to overlay intelligent decision making onto day-to-day operations, as shown in Figure 1.

What’s needed to form and feed a digital twin?

The operations of a physical twin can be digitized by sensors, cameras and other such devices, but those digital streams are not the only sources of data that can feed the digital twin. In addition to streaming data, accumulated historical data can inform a digital twin. Relevant data could include data not generated from the asset itself, such as weather and business cycle data. Also, computer-aided design (CAD) drawings and other documentation can help the digital twin provide context. AI and other analytical models can take raw data and process it into forms that help humans understand the system.

AI also can make intelligent choices of content on the user’s behalf. Such guidance could be very welcome to users because user input facilities are very different from the typical keyboard and mouse. As displayed in the upper right corner of Figure 1, humans can perceive the system as an intelligent reality – a technologically enhanced reality that can aid their cognition and judgement.

With the blueprint in Figure 1 as a basis, it’s possible to create digital twins that use AI and reality technologies to achieve operational benefits. Any number of operations could be enhanced with the techniques described here.

For example, the paper “Augmented Reality (AR) Predictive Maintenance System with Artificial Intelligence (AI) for Industrial Mobile Robot” details how a machine learning model can be used to classify the state of a robot motor, which can then be presented to factory personnel with AR. This article applies the blueprint concepts to facilities management after first exploring each concept in depth. While the various data streams reach their conclusions in human perception, the starting point of a digital twin for a user is how it is perceived. Thus, the starting point for this exploration is user interfaces for digital twins, followed by a discussion of AI.

Human reality of digital twins

Humans have a long history of interfacing with data and data visualization, starting with William Playfair’s inventions of line, bar and pie charts in the late 1700s. Digital twins can present data in such familiar forms, but the traditions of the late eighteenth century should not restrain the digital twin’s power.

When using mobile technologies such as tablets, smart phones and AR headsets, the digital reality is overlaid on the physical reality in one view, as shown in Figure 2. AR headsets may be the obvious choice for this use case, but they are not the only option. Traditional interfaces rendering 3D models also allow workers to take advantage of digital twins.

The first step in considering the creation of intelligent realities for digital twins is understanding data visualization options across the user interface (UI) spectrum. Next, a reporting integration approach is considered which can operationalize analytics and AI without requiring a new hardware paradigm, like an AR headset. AR headsets have the potential to benefit operations, but only if applications are successfully designed for usability, which is the next consideration. An outline follows of how to build a digital twin interface for remote experts.

Visualizing digital twin output across the UI spectrum

In Capgemini’s “Augmented and Virtual Reality in Operations” report, Jan Pflueger from Audi’s Center of Competence for AR/VR encouraged a business-first approach for reality projects. “First, focus on your use case and not on the technology itself. After you identify your use case, focus on your information handling and data so you can deliver the right information to the technology.”

Consider five technological approaches for rendering digital twins and their respective capabilities. These are traditional desktop; smart phone or tablet; monocle AR; stereoscopic AR, including mixed reality (MR) devices; and immersive VR. See the table in Figure 3 for a comparison.

Within each class of device, capabilities vary, and the variance may affect a product’s viability for different use cases. This is especially true for AR headsets. Display resolution, field-of-view and computational power differ from product to product. In addition, design decisions about whether to put battery and compute units on the headset or on a separate tethered module can affect comfort and practicality. One practical concern for AR headsets is how they integrate with work clothing and uniforms such as those required for clean room and food processing operations.

Reporting with a digital twin context

Given an interactive visual analytics application, intelligent reality reports can be created with integrated 3D models like the one shown in Figure 4. The digital twin presents a custom visualization that can interact with other objects in the report, including showing data in a table or graph.

This visualization approach adheres to long standing data presentation traditions without requiring new hardware beyond a regular desktop setup. The user interface is presented on a typical computer with a mouse and keyboard. Users need little additional training to use the power of the digital twin.

Usability and augmented reality

When moving beyond the desktop into AR headsets, application designers face a new set of usability challenges. Usability is the cornerstone for any technology to be a tool rather than a hindrance. While AR is a new interaction paradigm, the long-standing standards of usability still apply. These should guide efforts to integrate AR with a digital twin. A good interface is efficient, learnable, memorable, error-infrequent and pleasant to use. Combining domain knowledge with the advantages of AR and digital twins leaves this space well situated to maximize usability in many of these categories.

Learnability of augmented reality

Users on AR platforms show significant improvements with minimal instruction over a short period of time. Some users find AR initially difficult because they cannot rely on their intrinsic knowledge to operate the system. This setback is temporary, however, and users often improve. Learnability varies significantly based on the target audience. Tools made for experts have a steeper learning curve but are more powerful overall, and expert efficiency should justify the extended training period.

Efficiency of augmented reality

AR achieves the effect of “embodiment” when the technology fades away and becomes an extension of our senses. When this happens, the technology extends our sensory, cognitive and motor capabilities, so we spend fewer cognitive resources thinking about the interface and more on the task at hand. The information in the world combined with information in the program improves the efficiency of a task through embodied cognition.

Learning advantages of augmented reality

Users are more engaged with the content when accessing it in an AR system because of the novelty. Engagement is a key factor in thinking critically and remembering details. When a user comes back to the interface after a period of inactivity, they will be more likely to remember the content and the actions.

Low error rate of augmented reality

As with any application, the error rate is often affected by the interface design. A good interface designer will be able to create a user experience well within the limits of human factors, and this applies to AR. While the interaction paradigm is different from point and click, system designers have considered the kinds of inputs that can be recognized and have limited the number of irrecoverable errors during use.

Satisfaction of augmented reality use

Multiple studies have indicated users prefer AR over traditional interaction paradigms. Satisfaction is the culmination of learnability, efficiency, memorability and the ability to use the system without catastrophic errors. When the other categories of usability are well-balanced, the user will be satisfied with their experience.

Remote experts and digital twins

An AR device can help a field worker by overlaying their reality with digital twin output. The same AR device can also serve as a platform for remote expert assistance. The simplest way is to transmit the video feed from the AR device’s video cameras to the remote expert; however, the video feed alone would not be able to go beyond what the field worker can physically access and does not contain live Internet of Things (IoT) sensor data.

Instead of relying on video, the remote expert could view the system as a virtual world. A VR or MR headset could be used, but a traditional flatscreen would also work well. The picture below shows a digital twin created with a gaming engine.

In this kind of VR system, the expert can change to a vantage point the field worker cannot see. For example, they could view the system from any angle, go through locked doors or even go inside of components.

Unlike the field worker’s digital reality, the expert’s reality must be created. It could be created by a 3D artist or with digital artifacts such as CAD drawings – or some combination of the two. An artist would have full control, but using CAD drawings would be more scalable.

While it is technically possible to convert CAD drawings directly for use in gaming engines, CAD drawings tend to be too detailed for real-time rendering in a gaming engine. CAD is purposed towards creating models that can be handed to manufacturing or building, while gaming engines pursue photo realism, believable lighting and low latency response to changes in camera position.

Tools exist to optimize CAD drawings for virtual engines.

From IoT sensors to artificial intelligence

With IoT, data is collected from sensors on a device, on neighboring devices, the environment around a device and whatever interacts with the device. The speed is real time, and connectivity often allows us to span distances instantly. Advances in streaming analytics now enable us to process this real-time data using machine learning and artificial intelligence.

While very simple systems can be twinned from raw data readings, AI and other analytical techniques are necessary to make a human-consumable digital twin of complex systems. When considering a vehicle only as an object on a map, then a digital twin can be very simple and easy to digest. There are only two variables, latitude and longitude, and the variables are easily understood by humans.

But twinning the operation of the vehicle requires hundreds of megabytes of data per second and thousands of variables. While all that data is important for the operations of the vehicle, that much raw data would overwhelm the ability of a human to make sense of it. AI synthesizes the data so the digital twin can present it in a human-consumable format. AI also enhances the digital twin experience by providing information about the environment not otherwise available to the user.

Underneath the umbrella term of AI are several specific categories of machine learning. Consider the next section a digital twin toolbox. First, the general architectural practices of AI are presented, and then specific deep learning techniques are reviewed.

Common practices of creating artificial intelligence

AI must become intelligent somewhere, and it’s usually not “on the job.” Deep learning models are trained on large databases, and the training is almost always done offline. It is not unusual to take hours or days to train a model. Once the model is trained, applying it through inferencing is less compute-intensive but still requires more compute resources than is typical for digital-twin applications.

For some applications, near real-time or slightly delayed results are sufficient. For example, in the computer vision defect detection described below, it might be acceptable to hold a production batch while the defect detection is performed. In other cases, real-time inferencing is needed. Inferencing can be done in the cloud or data center where sufficient resources are available. For edge inferencing, edge gateways are becoming available with sufficient compute power, but that specialized need requires planning.

Recurrent neural networks

Recurrent neural networks (RNNs) are a special class of deep learning neural networks designed for sequence or temporal data. Within IoT and digital twins, there are many examples of such sequence and temporal data. Many sensors are collecting data over time. The sequence or pattern of the measurements over time can be used to understand interesting characteristics of the digital twin asset. One example is measuring energy circuits in a smart building or power grid. The pattern of the energy use on a circuit can capture the start or end of an asset operation such as a motor start, which signals an operation change in the digital twin asset. Another use of RNNs is for forecasting unusual time series data. An example is forecasting the energy output from a solar farm, shown in Figure 6.

In this case, there is a cyclical component that could be forecasted using traditional methods, but there is a less well modeled component of weather and cloud cover. With the large amount of data available from the solar farm and nearby solar farms, a deep learning RNN can capture the more sporadic aspects of the energy output.

How to train a recurrent neural network

The process for training an RNN is different when working with sequence data versus working with temporal data. The process for training the RNN with sequence data is as follows:

  • Break the data into segments of sequential measurements. The length of the segment is determined by the time interval of the data and the expected duration of the precursor to an event. For the energy circuit example in smart buildings, the data is collected at 5-second intervals, and we use the previous minute of data.
  • Create a target variable for the events of interest and use it to label the sequences where the event occurs. For our example, we are using motor starts and identifying weak motor starts indicating capacitor failure.
  • Train the RNN. Bidirectional model fitting is not needed in this case because measurement data is always moving forward in time.
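The first two steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' pipeline: the function name `make_sequences`, the synthetic load values and the rule that a window is labeled 1 when a flagged event falls anywhere inside it are all assumptions made for the example.

```python
import numpy as np

SAMPLE_SECONDS = 5                  # measurement interval from the text
WINDOW = 60 // SAMPLE_SECONDS       # one minute of history = 12 samples

def make_sequences(readings, event_flags, window=WINDOW):
    """Slice a 1-D series into overlapping windows and label each window.

    Returns X with shape (n_windows, window, 1) and y with shape (n_windows,).
    Assumption: a window is labeled 1 if any flagged event falls inside it.
    """
    readings = np.asarray(readings, dtype=float)
    event_flags = np.asarray(event_flags, dtype=int)
    n_windows = len(readings) - window + 1
    X = np.stack([readings[i:i + window] for i in range(n_windows)])
    y = np.array([event_flags[i:i + window].max() for i in range(n_windows)])
    return X[..., np.newaxis], y

# Synthetic circuit load: a flat baseline with one spike standing in for a motor start.
power = np.full(30, 1.0)
power[20] = 5.0
flags = np.zeros(30, dtype=int)
flags[20] = 1

X, y = make_sequences(power, flags)
print(X.shape, int(y.sum()))
```

The resulting arrays have the (samples, time steps, features) layout that most RNN libraries expect as input.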

The trained model can then be deployed for inferencing. In most cases, the model inferencing function will be sufficiently fast to be used on the real-time measurement stream, either in the cloud, server or edge device.

Recurrent neural network forecasts

The second type of RNN is used to forecast. The example in this case is to forecast the energy output of a solar farm for short time periods in the future (1 hour). The key in this case is to create a set of lagged variables for the predictors and the response variable. The response variable is the energy produced.

To train this RNN, take the historical input database and create lagged variables for the predictors and response variable. The number of lags is determined by the time interval of the measurement data and the expected correlation of previous measurements on the forecast time horizon.
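A minimal sketch of building lagged variables, assuming the history is held in NumPy arrays. The function name `make_lagged`, the single predictor and the toy values are hypothetical; real data would have many predictors and far more rows.

```python
import numpy as np

def make_lagged(predictors, response, n_lags, horizon=1):
    """Build lagged inputs and aligned targets for forecasting.

    predictors: array of shape (T, p); response: array of shape (T,).
    Each sample holds the last n_lags rows of [predictors, response];
    its target is the response `horizon` steps after the last lagged row.
    """
    predictors = np.asarray(predictors, dtype=float)
    response = np.asarray(response, dtype=float)
    data = np.column_stack([predictors, response])      # shape (T, p + 1)
    samples, targets = [], []
    for t in range(n_lags - 1, len(response) - horizon):
        samples.append(data[t - n_lags + 1:t + 1])      # shape (n_lags, p + 1)
        targets.append(response[t + horizon])
    return np.array(samples), np.array(targets)

# Toy hourly data: one predictor (say, irradiance) and the energy produced.
irradiance = np.arange(10, dtype=float).reshape(-1, 1)
energy = np.arange(10, dtype=float) * 10.0

X, y = make_lagged(irradiance, energy, n_lags=3, horizon=1)
print(X.shape, y[:3])
```

With hourly data, `horizon=1` corresponds to the one-hour-ahead forecast and `n_lags` to the number of past hours fed to the model.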

For the solar farm example, we are producing one-hour-ahead forecasts, and the data over the last few hours is sufficient to capture the primary effects for the forecast. Even though the forecast horizon is short, a large variety of conditions is possible throughout the year, including the previously observed weather. Since we have a large amount of historical data covering the various conditions, using an RNN is appropriate for this particular problem.

Since training and evaluating the RNN model is dependent on the sequence, partitioning the data requires more care than typical random partitioning. In this case, we need to preserve the sequence of the data for use in the model creation steps (training, validation, test). The easiest way to do this is to partition the data based on the time variable. Use the earliest historical data for the training data set. Then use the next time partition for the validation data set.

Finally, use the most recent data for the testing data set. This is sufficient if the performance of the asset has been consistent over the historical data sample. If there have been periods of degraded performance, it is best to eliminate that data from the data sets used to create the model.
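The time-based partitioning described here can be sketched as follows. The 70/15/15 proportions and the function name `chronological_split` are assumptions for illustration; the text does not specify split sizes.

```python
import numpy as np

def chronological_split(timestamps, train_frac=0.70, valid_frac=0.15):
    """Return index arrays for train, validation and test in time order.

    The earliest rows go to training, the next block to validation and
    the most recent rows to testing, so no future data leaks backward.
    """
    order = np.argsort(np.asarray(timestamps), kind="stable")
    n_train = round(len(order) * train_frac)
    n_valid = round(len(order) * valid_frac)
    train = order[:n_train]
    valid = order[n_train:n_train + n_valid]
    test = order[n_train + n_valid:]
    return train, valid, test

# Rows may arrive out of order; the split sorts by time first.
rng = np.random.default_rng(0)
hours = rng.permutation(20)
train_idx, valid_idx, test_idx = chronological_split(hours)
print(len(train_idx), len(valid_idx), len(test_idx))
```

Periods of degraded asset performance would be filtered out before calling a function like this.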

Use RNNs for one-step-ahead forecasting where the measurement interval matches, or is less than, the desired forecast interval. This yields the most accurate forecast. In some cases, a multistep forecast may be required to project future time periods based on the near-term forecast estimates. These forecasts are often less accurate but can be tested to determine if they have sufficient accuracy.
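One common way to produce a multistep forecast from a one-step model is to feed each forecast back in as if it were an observation. The sketch below assumes that recursive approach; `linear_trend_model` is a deliberately trivial stand-in for a trained RNN, and both function names are hypothetical.

```python
def recursive_forecast(one_step_model, history, n_lags, steps):
    """Project `steps` periods ahead by reusing a one-step-ahead model.

    Each forecast is appended to the input window, so errors can compound,
    which is why multistep forecasts are often less accurate.
    """
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(steps):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        window = window[1:] + [next_value]   # slide the window forward
    return forecasts

def linear_trend_model(window):
    """Stand-in one-step model: continue the most recent change."""
    return window[-1] + (window[-1] - window[-2])

projection = recursive_forecast(linear_trend_model, [1.0, 2.0, 3.0], n_lags=2, steps=3)
print(projection)
```

Comparing such projections against held-out test data is one way to check whether the accuracy is sufficient.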

Reinforcement learning, machine learning, HVAC

Reinforcement learning (RL) is a subfield of machine learning and deals with sequential decision-making in a stochastic environment. In any RL problem, there is at least one agent and an environment. The agent observes the state of the environment, then selects and executes an action. The environment returns a reward and a new state in response to the action. With the new state, the agent selects and executes another action, the environment returns a reward and a new state, and this procedure continues iteratively. RL algorithms are designed to train an agent through this interaction with the environment, and the goal is to maximize the sum of rewards.
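The observe, act, reward loop can be illustrated with tabular Q-learning, one standard RL algorithm, on a toy problem. This is a sketch under loud assumptions: the five-state corridor is invented for the example, it is deterministic for brevity (real environments are typically stochastic), and it is not the algorithm or environment used in the HVAC study.

```python
import numpy as np

N_STATES, GOAL = 5, 4
ACTIONS = (0, 1)   # 0 = move left, 1 = move right

def step(state, action):
    """Toy environment: return (new_state, reward, done)."""
    new_state = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    done = new_state == GOAL
    return new_state, (1.0 if done else 0.0), done

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    best = np.flatnonzero(q_row == q_row.max())
    return int(rng.choice(best))

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q[state], epsilon, rng)
            new_state, reward, done = step(state, action)
            # Move the estimate toward reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * q[new_state].max())
            q[state, action] += alpha * (target - q[state, action])
            state = new_state
            if done:
                break
    return q

q_table = train()
greedy_policy = q_table[:GOAL].argmax(axis=1)   # best action in each non-goal state
print(greedy_policy)
```

After training, the greedy policy picks the action with the highest learned value in each state, which is how the agent's accumulated reward estimates turn into decisions.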

RL has received much attention due to its successes in computer games and robotic applications. Beyond simple RL applications, there are still few real-world uses of RL to increase efficiency. We extended an RL algorithm to control heating, ventilation and air conditioning (HVAC) systems. HVAC includes all the components that maintain a certain comfort level in a building.

Buildings consume 30 to 40% of all consumed energy in the world, so any improvement could result in huge savings in energy consumption and carbon dioxide release. Advances in new technologies in recent years have improved the efficiency of most components in the HVAC systems. Nevertheless, there are still several directions to reduce the energy consumption by controlling different decisions on these systems.

We considered a multizone HVAC system and selected the amount of air flow as the main control decision. Using data obtained from a SAS building in Cary, N.C., we trained an environment model and used it to train an RL algorithm for a system with 10 zones and a set-point of 72 with a ±3 allowance.

The figure below shows the results of 50 cases with different initial temperatures. The upper plot shows the temperature and the lower plot shows the actions taken over 150 minutes, with a decision made every three minutes. We compared this result to the commonly used rule-based algorithm (in which the system is turned on/off at 69/75), and RL obtained a 47% improvement on a combination of comfort and energy consumption.
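For reference, a rule-based on/off baseline of this kind can be sketched as a hysteresis controller. The text does not say whether the rule governs heating or cooling, so the sketch assumes heating (on at or below 69, off at or above 75). The temperature dynamics in `simulate` are invented purely to exercise the rule.

```python
def rule_based_control(temperature, currently_on, on_at=69.0, off_at=75.0):
    """Hysteresis rule (heating assumed): on at or below on_at, off at or above off_at."""
    if temperature <= on_at:
        return True
    if temperature >= off_at:
        return False
    return currently_on   # inside the band, keep the previous state

def simulate(start_temp, decisions=50, gain_when_on=1.5, loss_when_off=1.0):
    """Hypothetical single-zone dynamics; 50 decisions at 3-minute spacing = 150 minutes."""
    temp, on, history = start_temp, False, []
    for _ in range(decisions):
        on = rule_based_control(temp, on)
        temp += gain_when_on if on else -loss_when_off
        history.append((temp, on))
    return history

trace = simulate(72.0)
print(min(t for t, _ in trace), max(t for t, _ in trace))
```

A rule like this swings across the whole allowed band, which is the behavior a learned controller tries to improve on.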

Hyperparameter tuning for deep learning

For all deep learning methods, hyperparameter tuning is an important step. Hyperparameter settings are often dependent on the domain knowledge of the application. Research into the specific application can yield a set of parameter settings to be tested. In some cases, a set of parameter settings has been established as best practices. In other cases, research is needed to determine the best settings.

A feature in software for visual data mining and machine learning is hyperparameter autotune. This feature will take a range of potential parameter settings and perform an optimal search for the best performing settings. This will greatly help cases where research is needed on the parameter settings.
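To make the idea of searching a range of settings concrete, here is a plain random search written with the Python standard library. It is not the autotune feature mentioned above, which may use a more sophisticated search. The function names, the parameter names and the toy loss function are all hypothetical.

```python
import random

def random_search(objective, search_space, n_trials=50, seed=0):
    """Sample settings uniformly from each range and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in search_space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_validation_loss(params):
    """Stand-in for training a model and measuring its validation loss."""
    return (params["learning_rate"] - 0.01) ** 2 + (params["dropout"] - 0.3) ** 2

space = {"learning_rate": (0.0001, 0.1), "dropout": (0.0, 0.5)}
best, loss = random_search(toy_validation_loss, space)
print(best, loss)
```

In practice the objective would train and validate a real model, which is why each trial is expensive and why an automated search saves effort.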

Machine vision and digital twins

Computer vision or machine vision is a powerful tool that has caught the attention of many with its ability to recognize faces and objects within a scene. For digital twins, it can add important information about the quality of the things being monitored. A task that requires visual inspection could be enhanced with an AR interface to a digital twin. For example, computer vision can detect defects by comparing thousands of images for anomalies that may not be as detectable by a human. Moreover, specialized cameras, such as infrared, allow for even further analyses by combining multiple streams of information. A process for implementing a computer vision model is as follows.

If possible, fix the camera to a stable mount point so that all images will be taken from the same angle and with the same proportions. This simplifies the model training compared to general object recognition models, which must capture objects from many angles. The fixed camera location also simplifies the process of determining the location of defects on the piece.

Create a digital twin with machine vision

Another option is creating a model that finds easily identified features on the piece. For a power substation, it’s possible to have general instructions on how to point the camera at a transformer in the substation. An object recognition model could identify the bushings on the top of the transformer. This would provide reference points to scale the images with images captured at similar angles, similar to how facial recognition models determine various key points on a face.

The resulting images can be used to train a classification model using convolutional neural networks (CNNs). Depending on how well-labeled the data is, models can be of varying complexity.

With a collection of mostly good images, a binary classification model can be created that separates images likely to be known good from those with suspected anomalies. The power transformer is an example of this.

By having images labeled with known defect types, it is possible to create a more complex classification model that identifies the various defects. The discrete parts are an example of this. There might be previous images labeled with an incorrect bearing insertion and other images labeled with incorrect part milling.

With good location identification, it’s also possible to break down the images and find the portions of the image with defects. The semiconductor wafer is an example here. Expected yield can be quantified based on the proportion of the wafer with defects.
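A minimal sketch of turning located defects into a yield estimate, assuming the vision model has already produced a boolean defect map and that the wafer can be treated as a regular grid of square dies. Both assumptions, and the function name `die_yield`, are simplifications for illustration.

```python
import numpy as np

def die_yield(defect_map, die_size):
    """Estimate yield as the fraction of dies containing no defect pixels.

    defect_map: 2-D boolean array, True where a defect was detected.
    die_size: edge length of one square die, in pixels.
    """
    height, width = defect_map.shape
    rows, cols = height // die_size, width // die_size
    trimmed = defect_map[:rows * die_size, :cols * die_size]
    dies = trimmed.reshape(rows, die_size, cols, die_size)
    die_has_defect = dies.any(axis=(1, 3))          # shape (rows, cols)
    return 1.0 - float(die_has_defect.mean()), die_has_defect

# An 8 x 8 pixel map split into sixteen 2 x 2 dies, with two defect pixels.
defects = np.zeros((8, 8), dtype=bool)
defects[0, 0] = True
defects[5, 6] = True

expected_yield, bad_dies = die_yield(defects, die_size=2)
print(expected_yield)
```

The per-die map also shows where on the wafer the defects cluster, which can be displayed on the digital twin.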

After the model is trained, determine the inferencing latency that is required. Test whether new images must be streamed image-by-image for immediate results, or whether it is possible to capture a batch of images and process them together. Also determine whether the inferencing can be done in the cloud or on a server, or whether an edge gateway is needed.

Digital twin applications for smart facilities

Smart facilities offer a perfect example of how a digital twin can offer features that cannot be accomplished in the physical world. Although buildings are becoming smarter and smarter, walls cannot yet turn transparent on command. If a building administrator wants to look through walls, AR and digital twins provide this. In this case, the information pipeline goes through the following steps.

Raw data gathering for a digital twin

An air-handler can have hundreds of sensors monitoring things like duct pressure, valve-positions, outside air temperature and power draw. Traditionally, these systems are left alone until they break down or need maintenance; however, that approach does not consider how efficiently the system is running and offers little insight into problems or places for improvements. The data streams generated from the sensors can improve maintenance. Other sources of raw data include thermometers, motion detectors and signal monitors for Wi-Fi and other wireless networking.

Digital twin models: AI, analytical models

Models can be trained to twin the system and alert administrators when the digital model does not match the physical performance. This approach minimizes downtime and helps pinpoint issues. The end user does not need to know all the math behind the model, but the twin can be crucial to separating important information from the noise. For example, the model can detect when the power draw of the air-handler is more than expected given the outside weather.
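The power-draw example can be sketched with a simple regression twin: fit expected power as a function of outside temperature, then flag readings that exceed the expectation by more than a few standard deviations. The linear model, the three-sigma threshold, the function names and the synthetic data are assumptions; a production twin would likely use more inputs and a richer model.

```python
import numpy as np

def fit_expected_power(outside_temp, power_kw):
    """Fit a straight line for expected power and measure the typical residual."""
    outside_temp = np.asarray(outside_temp, dtype=float)
    power_kw = np.asarray(power_kw, dtype=float)
    slope, intercept = np.polyfit(outside_temp, power_kw, deg=1)
    residuals = power_kw - (slope * outside_temp + intercept)
    return slope, intercept, residuals.std()

def draws_excess_power(outside_temp, power_kw, model, n_sigma=3.0):
    """True when measured power exceeds the twin's expectation by n_sigma residuals."""
    slope, intercept, sigma = model
    expected = slope * outside_temp + intercept
    return bool(power_kw - expected > n_sigma * sigma)

# Synthetic history: power rises with outside temperature, plus sensor noise.
rng = np.random.default_rng(42)
temps = np.linspace(60.0, 100.0, 200)
power = 0.5 * temps - 20.0 + rng.normal(0.0, 0.5, size=temps.size)

model = fit_expected_power(temps, power)
print(draws_excess_power(90.0, 30.0, model), draws_excess_power(90.0, 25.0, model))
```

Only the flagged readings need to reach the administrator, which is how the twin separates important information from the noise.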

Always aware building management via digital twin, AR

With the raw data processed into intelligence, many options are possible for rendering the digital twin output. Most importantly, the physical and digital realities can be combined for a facility manager so they are situationally aware. When AR headsets are used, the digital twin can be merged into the physicality of the facility as a manager moves around the facility. Just as managers can take note of physical flaws and issues, they can use AR to see into walls, make invisible Wi-Fi coverage visible and see temperature differences throughout the facility.

While an alert-driven approach based on defined thresholds would remain important, a situationally aware approach lessens the chances alerts would be surprising to a manager. Also, a manager can use intuition and judgement to prioritize issues that may fall in blind spots of defined rules.

A situationally aware approach is possible due to the advent of lightweight AR headsets with strong battery life. Without modern AR headsets, the digital reality of the facility is only visible at a desktop or perhaps on a tablet; but even with mobile use of a tablet, the manager would still have to operate the computing device as they move about. This approach is not heads-up and hands-free. With an AR headset, the digital output is ambient as the manager moves about.

Since a digital twin of a facility can produce a lot of output, it is likely it could produce many visual representations in the same physical space. A digital twin AR application that attempts to render all possible information for that space would not be usable. An AI agent can select the most pertinent digital twin output based on several criteria such as the manager’s role, newness of information, situational urgency and the manager’s history of interest.

Digital twin, IoT, AI, AR and other user interfaces

When properly architected and integrated under the intelligent realities umbrella, IoT, AI and UI technologies can open new possibilities, and digital twin provides a usable representation to consume the massive amounts of information inherent in such an architecture. Various UI options are available for interacting with digital twins. AR and VR are included, but more traditional options like tablets and desktop computers also should be considered.

Michael Thomas is senior systems architect; Brad Klenz is distinguished systems architect; and Prairie Rose Goodwin is senior product developer with SAS Institute, an Industrial Internet Consortium (IIC) member. IIC is a Control Engineering content partner. Edited by Mark T. Hoske, content manager, Control Engineering, CFE Media.

KEYWORDS: Artificial intelligence, digital twins, virtual reality


Link to the full PDF IIC article, “Artificial and human intelligence with digital twins,” with URLs for 18 references.

Author Bio: Michael Thomas, Brad Klenz and Prairie Rose Goodwin, SAS, Industrial Internet Consortium