Achieve fault tolerance with a real-time software design

Data Distribution Service (DDS) specification from Object Management Group (OMG) is a data-centric publish/subscribe (DCPS) messaging standard for integrating distributed real-time applications. Here’s how process replication can increase a system’s fault tolerance.

04/29/2013


The most basic approach is to run multiple, identical Manager applications. Courtesy: Real-Time InnovationsData Distribution Service (DDS) specification from Object Management Group (OMG) is often used to build mission critical systems consisting of multiple components executing simultaneously and collaborating to achieve a certain task. In such systems, there are usually several components that are critically important for the overall functioning. Those components require extra attention when designing the system to ensure that the likelihood of them failing is minimized. Its focus on reliability, availability and efficiency make the DDS infrastructure particularly suitable to connect the plant floor with other areas of the enterprise.

DDS provides advanced features that help achieve such goals. While there are many methods, fault tolerance for failing publishing applications or machines can be achieved by means of active process replication.

Guiding example

For the purpose of illustrating the theory here with an example, the use case of the simple distributed finite state machine (FSM) is described. The FSM consists of a few components: a Master Participant, a Worker Participant, and a Manager. The Observer component, not essential for the functionality of the pattern, is passively observing only. This example focuses on the interaction between the Manager and the Observer.

Let’s assume that the Manager is the most critical process in the pattern, which is likely to be the case, especially if the pattern is expanded to support multiple Masters and Workers. The most obvious way to make the Manager component more robust is by running multiple processes simultaneously, all with the same responsibility. With that approach, the Manager component is defined as a conceptual item that may consist of multiple, simultaneously running application processes. For the sake of simplicity, the collection of those processes is referred to as the Manager component. Note that those could be executing in a distributed fashion on multiple machines.

Active process replication

Simultaneously running multiple Manager processes with the same purpose will decrease the likelihood that the Manager component will break down completely. DDS allows for adding and removing participants on the fly without requiring configuration changes, so the basic concept of process replication is supported. Still, there are multiple options from which to choose.

Plain process replication

To avoid disadvantages, another option is to make Manager processes aware of each other at the application level. Courtesy: Real-Time InnovationsThe most basic approach is to run multiple, identical Manager applications. Each acts the same, receiving the same transition requests and publishing the same state updates in response. As a consequence, applications observing the state machine will see multiple instances of that state machine simultaneously. This approach requires a minor adjustment to the data-model to allow multiple task instances to exist side-by-side. This is achieved by adding a key attribute that identifies the originating Manager application.

An advantage to this approach is its low complexity; just replicating everything is an approach everybody understands. There is no impact on the Manager application code, and all participants remain completely decoupled from each other. Additionally, the mechanism does not rely on any built-in replication mechanism from the middleware. This could be considered an advantage as well, because the developer now has every aspect of the replication process under control.

On the other hand, that latter argument could be seen as a disadvantage. Requiring all Observers of the state machine to be able to deal with multiple, identical versions simultaneously does have an impact on the application code and introduces an extra burden on the application developer. Also, the number of instances an instance state updates scales linearly with the number of Manager processes. For this particular state machine example that is not that big of a deal, but in the general case, it could have a significant impact on network bandwidth consumption and memory usage.


<< First < Previous Page 1 Page 2 Next > Last >>

Engineers' Choice Awards
The Engineers' Choice Awards highlight some of the best new control, instrumentation and automation products as chosen by Control Engineering subscribers. Vote now (if qualified)!
System Integrator Giants
The System Integrator Giants program lists the top 100 system integrators among companies listed in CFE Media's Global System Integrator Database.
System Integrator of the Year
Each year, a panel of Control Engineering and Plant Engineering editors and industry expert judges select the System Integrator of the Year Award winners in three categories.
How to Maximize Factory Automation Efficiency with Low Cost Machine Vision
This eGuide illustrates solutions, applications and benefits of machine vision systems.
Wireless Reliability in Harsh Environments
Learn how to increase device reliability in harsh environments and decrease unplanned system downtime.
Human Factors and the Impact on Plant Safety
This eGuide contains a series of articles and videos that considers theoretical and practical; immediate needs and a look into the future.
May 2018
Salary and Career Survey, IT and OT convergence, robotic standards and safety, secure circuit protection
April 2018
Cybersecurity best practices, artificial intelligence, robotic additive manufacturing, embedded systems, IIoT integration, energy efficiency
March 2018
Digitalization integration, process sensors, edge computing, fog computing, condition monitoring, and motors
Edge Computing
This article collection contains several articles on how today's technologies heap benefits onto an edge-computing architecture such as faster computing, better networking, more memory, smarter analytics, cloud-based intelligence, and lower costs.
IIoT: Machines, Equipment, & Asset Management
Articles in this digital report highlight technologies that enable Industrial Internet of Things, IIoT-related products and strategies.
PLCs
Programmable logic controllers (PLCs) represent the logic (decision) part of the control loop of sense, decide, and actuate. Featured articles in this digital report compare PLCs and programmable automation controllers (PACs), industrial PCs, and robotic controllers.
SIDB

Find and connect with the most suitable service provider for your unique application. Start searching the Global System Integrator Database Now!

April 2018
ROVs, rigs, and the real time; wellsite valve manifolds; AI on a chip; analytics use for pipelines
February 2018
Focus on power systems, process safety, electrical and power systems, edge computing in the oil & gas industry
December 2017
Product of the Year winners, Pattern recognition, Engineering analytics, Revitalize older pump installations
John O. Ayuk, PE, CFSE, PMP, CAP
Automation Engineer; Wood Group
Doug Baker
System Integrator; Cross Integrated Systems Group
Jose S. Vasquez, Jr.
Jose S. Vasquez, Jr.
Fire & Life Safety Engineer; Technip USA Inc.
Data Centers: Impacts of Climate and Cooling Technology
This course focuses on climate analysis, appropriateness of cooling system selection, and combining cooling systems.
Safety First: Arc Flash 101
This course will help identify and reveal electrical hazards and identify the solutions to implementing and maintaining a safe work environment.
Critical Power: Hospital Electrical Systems
This course explains how maintaining power and communication systems through emergency power-generation systems is critical.
Engineers' Choice Awards
The Engineers' Choice Awards highlight some of the best new control, instrumentation and automation products as chosen by Control Engineering subscribers. Vote now (if qualified)!
System Integrator Giants
The System Integrator Giants program lists the top 100 system integrators among companies listed in CFE Media's Global System Integrator Database.
System Integrator of the Year
Each year, a panel of Control Engineering and Plant Engineering editors and industry expert judges select the System Integrator of the Year Award winners in three categories.
How to Maximize Factory Automation Efficiency with Low Cost Machine Vision
This eGuide illustrates solutions, applications and benefits of machine vision systems.
Wireless Reliability in Harsh Environments
Learn how to increase device reliability in harsh environments and decrease unplanned system downtime.
Human Factors and the Impact on Plant Safety
This eGuide contains a series of articles and videos that considers theoretical and practical; immediate needs and a look into the future.
May 2018
Salary and Career Survey, IT and OT convergence, robotic standards and safety, secure circuit protection
April 2018
Cybersecurity best practices, artificial intelligence, robotic additive manufacturing, embedded systems, IIoT integration, energy efficiency
March 2018
Digitalization integration, process sensors, edge computing, fog computing, condition monitoring, and motors
Edge Computing
This article collection contains several articles on how today's technologies heap benefits onto an edge-computing architecture such as faster computing, better networking, more memory, smarter analytics, cloud-based intelligence, and lower costs.
IIoT: Machines, Equipment, & Asset Management
Articles in this digital report highlight technologies that enable Industrial Internet of Things, IIoT-related products and strategies.
PLCs
Programmable logic controllers (PLCs) represent the logic (decision) part of the control loop of sense, decide, and actuate. Featured articles in this digital report compare PLCs and programmable automation controllers (PACs), industrial PCs, and robotic controllers.
SIDB

Find and connect with the most suitable service provider for your unique application. Start searching the Global System Integrator Database Now!

April 2018
ROVs, rigs, and the real time; wellsite valve manifolds; AI on a chip; analytics use for pipelines
February 2018
Focus on power systems, process safety, electrical and power systems, edge computing in the oil & gas industry
December 2017
Product of the Year winners, Pattern recognition, Engineering analytics, Revitalize older pump installations
John O. Ayuk, PE, CFSE, PMP, CAP
Automation Engineer; Wood Group
Doug Baker
System Integrator; Cross Integrated Systems Group
Jose S. Vasquez, Jr.
Jose S. Vasquez, Jr.
Fire & Life Safety Engineer; Technip USA Inc.
Data Centers: Impacts of Climate and Cooling Technology
This course focuses on climate analysis, appropriateness of cooling system selection, and combining cooling systems.
Safety First: Arc Flash 101
This course will help identify and reveal electrical hazards and identify the solutions to implementing and maintaining a safe work environment.
Critical Power: Hospital Electrical Systems
This course explains how maintaining power and communication systems through emergency power-generation systems is critical.
Engineers' Choice Awards
The Engineers' Choice Awards highlight some of the best new control, instrumentation and automation products as chosen by Control Engineering subscribers. Vote now (if qualified)!
System Integrator Giants
The System Integrator Giants program lists the top 100 system integrators among companies listed in CFE Media's Global System Integrator Database.
System Integrator of the Year
Each year, a panel of Control Engineering and Plant Engineering editors and industry expert judges select the System Integrator of the Year Award winners in three categories.
How to Maximize Factory Automation Efficiency with Low Cost Machine Vision
This eGuide illustrates solutions, applications and benefits of machine vision systems.
Wireless Reliability in Harsh Environments
Learn how to increase device reliability in harsh environments and decrease unplanned system downtime.
Human Factors and the Impact on Plant Safety
This eGuide contains a series of articles and videos that considers theoretical and practical; immediate needs and a look into the future.
May 2018
Salary and Career Survey, IT and OT convergence, robotic standards and safety, secure circuit protection
April 2018
Cybersecurity best practices, artificial intelligence, robotic additive manufacturing, embedded systems, IIoT integration, energy efficiency
March 2018
Digitalization integration, process sensors, edge computing, fog computing, condition monitoring, and motors
Edge Computing
This article collection contains several articles on how today's technologies heap benefits onto an edge-computing architecture such as faster computing, better networking, more memory, smarter analytics, cloud-based intelligence, and lower costs.
IIoT: Machines, Equipment, & Asset Management
Articles in this digital report highlight technologies that enable Industrial Internet of Things, IIoT-related products and strategies.
PLCs
Programmable logic controllers (PLCs) represent the logic (decision) part of the control loop of sense, decide, and actuate. Featured articles in this digital report compare PLCs and programmable automation controllers (PACs), industrial PCs, and robotic controllers.
SIDB

Find and connect with the most suitable service provider for your unique application. Start searching the Global System Integrator Database Now!

April 2018
ROVs, rigs, and the real time; wellsite valve manifolds; AI on a chip; analytics use for pipelines
February 2018
Focus on power systems, process safety, electrical and power systems, edge computing in the oil & gas industry
December 2017
Product of the Year winners, Pattern recognition, Engineering analytics, Revitalize older pump installations
John O. Ayuk, PE, CFSE, PMP, CAP
Automation Engineer; Wood Group
Doug Baker
System Integrator; Cross Integrated Systems Group
Jose S. Vasquez, Jr.
Jose S. Vasquez, Jr.
Fire & Life Safety Engineer; Technip USA Inc.
Data Centers: Impacts of Climate and Cooling Technology
This course focuses on climate analysis, appropriateness of cooling system selection, and combining cooling systems.
Safety First: Arc Flash 101
This course will help identify and reveal electrical hazards and identify the solutions to implementing and maintaining a safe work environment.
Critical Power: Hospital Electrical Systems
This course explains how maintaining power and communication systems through emergency power-generation systems is critical.
click me