Can luck cover for a lack of alarm management?
In the industrial world, process alarm systems affect the bottom line. Well-functioning alarms can help a process run closer to its ideal operating point, enabling higher yields, lower production costs, and improved quality, all of which add up to greater profits.
At many facilities, however, there is no formal strategy for managing alarm performance. These sites are confident their alarm systems are in order and assets are fully protected, yet they fail to recognize the potential for failures.
A lack of effective alarm management can result in billions of dollars lost every year to accidents, equipment damage, unplanned plant or unit outages, off-spec production, regulatory fines, and huge intangible costs related to environmental and safety infractions.
Today’s safety challenges
Employers, irrespective of the size or nature of their business, have responsibility for the day-to-day health, safety, and welfare of employees and visitors to the workplace. This duty of care is usually set out in the occupational health and safety (OH&S) legislation of the relevant country.
Companies, as well as individuals from the supervisor to CEO level, have been prosecuted for breaches of OH&S regulations. Duty of care typically mandates that employers in automated industries provide a suitable alarm system that gives operators adequate warning of impending abnormal situations, so they have time to act before an upset or incident occurs. Duty of care also includes the provision of an appropriate control system for manufacturing facilities.
In principle, the DCS alarm system is a vital, productive tool for managing industrial processes, and it can be configured to identify and notify personnel of a wide variety of abnormal conditions in plant operation. Alarms provide a unique layer of protection against scenarios impacting safety, the environment, or financial loss. They combine the flexibility and adaptability of the plant operator with the power of technology. However, in practice, poor initial design and lack of effective alarm management often result in alarm systems that are not "fit for purpose."
Why alarms deserve attention
At many industrial facilities, alarm systems do not receive the attention and resources they deserve. This is understandable, because alarming appears to be a deceptively simple activity. Facilities often retain the alarm design philosophy developed by the engineering firm at the time of their original construction.
Justifying the cost of a comprehensive alarm management program can be difficult. Operations and engineering people realize alarm system performance is a serious issue, but may have trouble convincing senior-level plant management that the company should invest scarce resources in advanced alarm technology.
Alarm management is one of those difficult areas where financial returns aren’t immediately apparent. The return is realized when properly designed alarms help the company avoid a production loss. It’s a concept often overlooked in favor of higher-profile improvement programs. Why? Financial resources may be limited. On paper, process optimization and performance monitoring yield a better financial gain. There is also a common lack of understanding of what alarm management is.
At complex processing plants, there are many potential repercussions from disregarding alarm management. These can range from process upsets (downtime/loss of production) and plant shutdowns to loss of containment and catastrophic failure.
Recent industrial disasters
Abnormal situations cost industry billions of dollars every year. A number of plant incidents partly attributed to alarm management issues have tragically resulted in injury and death of personnel and huge financial losses.
For example, during the 2005 explosion at BP’s Texas City, Texas, refinery, key level alarms failed to notify operators of the unsafe and abnormal conditions that existed within the tower and blowdown drum. The resulting explosion and fire killed 15 people and injured 170 more.
The tank overflow and resultant fire at the Buncefield oil depot in the UK caused a £1 billion (1.6 billion USD) loss. The incident could have been prevented if the tank’s high-level safety switch, per design, had notified the operator of the unsafe tank condition or had automatically shut off the incoming flow.
At the Bayer facility in Institute, W.Va., improper procedures, worker fatigue, and lack of operator training on a new control system led to a residue treater overcharging with methomyl, resulting in an explosion and chemical release.
Applicable industry standards
Several institutions and societies have produced standards on alarm management to assist in the best practice use of alarms in industrial manufacturing systems. Among them are the UK-based Engineering Equipment and Materials Users Association (EEMUA), and the U.S.-based American National Standards Institute (ANSI), International Society of Automation (ISA) and American Petroleum Institute (API).
EEMUA Publication 191 ("Alarm Systems – A Guide to Design, Management, and Procurement") was first released in 1999 and is acknowledged as the de facto industry standard for alarm management. (The second and third editions were released in 2007 and 2013.) This standard provides a detailed description of the tools and techniques for various aspects of alarm management (e.g., rationalization, risk assessments, and graphics design).
ISA and ANSI approved ANSI/ISA-18.2-2009 ("Management of Alarm Systems for the Process Industries") in June 2009 to specify an overall lifecycle approach to alarm management. ISA-18.2 has many similarities to the safety instrumented system (SIS) standards IEC 61508 and IEC 61511.
Both of these publications have similar key performance indicators (KPIs) for alarm system performance. So how can process plants ensure their compliance with the standards and avoid the likelihood of alarm-related failures or incidents?
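As a rough illustration of what such KPIs look like in practice, the sketch below computes two measures often associated with these publications: the average alarm rate per 10 minutes and the share of 10-minute windows in flood. The threshold values, the function name `alarm_kpis`, and the sample timestamps are assumptions made for this example; they are not taken from this article, so check them against the edition of the standard your site follows.

```python
from datetime import datetime, timedelta

# Assumed targets for one operator position. These are commonly quoted
# figures, not values stated in this article; verify before relying on them.
TARGET_AVG_PER_10_MIN = 1.0   # roughly 150 alarms per day
FLOOD_THRESHOLD = 10          # more than 10 alarms in one 10-minute window

WINDOW = timedelta(minutes=10)


def alarm_kpis(timestamps, start, end):
    """Return (average alarms per 10 min, share of windows in flood)
    for alarm annunciation times falling within [start, end)."""
    n_windows = int((end - start) / WINDOW)
    if n_windows <= 0:
        raise ValueError("period must span at least one 10-minute window")
    counts = [0] * n_windows
    for ts in timestamps:
        if start <= ts < end:
            idx = int((ts - start) / WINDOW)
            if idx < n_windows:          # ignore a trailing partial window
                counts[idx] += 1
    avg = sum(counts) / n_windows
    flood_share = sum(c > FLOOD_THRESHOLD for c in counts) / n_windows
    return avg, flood_share


# Hypothetical one-hour log: a burst of 12 alarms, then two isolated alarms.
start = datetime(2024, 1, 1, 0, 0)
end = start + timedelta(hours=1)
burst = [start + timedelta(seconds=30 * i) for i in range(12)]
quiet = [start + timedelta(minutes=25), start + timedelta(minutes=45)]

avg, flood_share = alarm_kpis(burst + quiet, start, end)
print(f"{avg:.2f} alarms per 10 min (target {TARGET_AVG_PER_10_MIN})")
print(f"{flood_share:.0%} of windows in flood")
```

In this made-up hour, one of the six windows exceeds the flood threshold, and the average rate is more than double the assumed target. A real assessment would run over weeks of alarm journal data per operator position.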
Assessing potential risks
The first step in addressing a lack of alarm management is to understand the relevant issues and acknowledge where problems exist. This requires a thorough assessment of alarm performance, which can help determine alarm requirements to minimize risk potential.
In alarm system assessments, a tool known as the "Swiss Cheese" model of accident causation is commonly used for risk analysis and management. Originally developed by Dante Orlandella and James T. Reason of the University of Manchester, it is sometimes called the cumulative act effect.
With the Swiss Cheese model, an organization’s defenses against failure are modeled as a series of barriers, depicted as slices of cheese. The holes in the slices represent weaknesses in individual parts of the system and continually vary in size and position across the slices. The system produces failures when a hole in each slice momentarily aligns, so that a hazard passes through holes in all of the slices, leading to a failure (see Figure 1).
This model includes both active and latent failures. Active failures encompass unsafe acts directly linked to an accident, such as (in the case of plant accidents) operator error. Latent failures include contributing factors that may lie dormant for days, weeks, or months until they contribute to the accident.
Here is an example of a true incident analyzed with the Swiss Cheese risk model:
1) Plant operation is relatively unstable toward the end of a 12-hour shift (operational factor)
2) Tank containing hot material reaches high-high level (process factor)
3) High-high level DCS pump interlock was disabled to replace an instrument, but had not been re-enabled (management of change factor)
4) Control room operators miss the alarm because they are overloaded and distracted by an alarm flood (alarm management factor)
5) Safety level switches in the safety integrity level (SIL) loop for tripping the incoming pump power supply have not been tested for over two years and fail to operate (maintenance factor), and
6) Tank overflows with workers in close vicinity (incident result).
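The logic of this example can be sketched in a few lines of Python. This is a minimal illustration of the Swiss Cheese idea, not a risk-analysis tool: the three barrier names are paraphrased from items 3 to 5 above, and the function name `hazard_reaches_outcome` is hypothetical.

```python
# Each barrier from the incident above. True means the barrier had failed
# (its "hole" was open) when the tank reached high-high level.
barriers = {
    "DCS pump interlock": True,          # left disabled after instrument work
    "operator response to alarm": True,  # alarm missed during a flood
    "SIL level-switch trip": True,       # untested for over two years
}


def hazard_reaches_outcome(barrier_failed):
    """A hazard passes through only if every barrier has failed."""
    return all(barrier_failed.values())


incident = hazard_reaches_outcome(barriers)

# Restore one barrier at a time and see whether the hazard is stopped.
stopped_by = [
    name
    for name in barriers
    if not hazard_reaches_outcome({**barriers, name: False})
]

print("Incident occurs:", incident)
print("Any one of these would have stopped it:", stopped_by)
```

Restoring any single barrier, including the operator's response to a clearly presented alarm, is enough to break the chain. That is why the alarm management factor counts as a barrier in its own right.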
Keys to better performance
Based on real-world experience across many process industries, it is obvious that the lack of an effective alarm management strategy has a direct negative impact on plant operations, performance, profitability, and safety.
Quite simply, some plants do not take alarm management seriously. It is not unusual for facilities to address the performance of process alarms and then forget about them. This is foolhardy, since plant processes are dynamic and alarm conditions are constantly changing.
All too often, the ownership of an alarm management program resides with the control system department, and not with the operations manager where it belongs. This is because alarms that are flooding or not annunciating correctly are typically viewed as a control or instrumentation problem.
Operations personnel need to realize the process control system belongs to them and how it functions is determined by their requirements. The DCS group can make required changes to the alarm system, but it must be driven by operations. The alarm is a tool used by the operator; thus, it is in the operator’s best interest for this tool to function correctly and meet the operator’s specifications.
Alarm management is a comprehensive process by which alarms are engineered, monitored, and managed to ensure safe, reliable operations. At the heart of this process is the concept of layers of protection: independent safeguards placed around a hazardous process to reduce the risk of undesired consequences such as fires and toxic releases. Alarms are considered to be a layer of protection (LOP) and are often used in SIL analysis (see Figure 2).
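To show how an alarm might be credited as one layer in such an analysis, here is a simplified sketch in the style of a layer of protection calculation. It assumes the layers are fully independent, so that the mitigated event frequency is the initiating frequency multiplied by each layer's probability of failure on demand (PFD). Every number and layer name below is hypothetical and chosen only to make the arithmetic visible.

```python
from math import isclose, prod

# Hypothetical initiating event: one demand every 10 years.
initiating_frequency = 0.1  # events per year

# Hypothetical PFD for each independent layer (illustrative values only).
layers = {
    "basic process control": 0.1,
    "operator response to alarm": 0.1,
    "safety instrumented function": 0.01,
}


def mitigated_frequency(frequency, pfds):
    """Initiating frequency times the PFD of every layer (independence assumed)."""
    return frequency * prod(pfds)


with_alarm = mitigated_frequency(initiating_frequency, layers.values())
without_alarm = mitigated_frequency(
    initiating_frequency,
    [pfd for name, pfd in layers.items() if name != "operator response to alarm"],
)

print(f"With alarm layer:    {with_alarm:.1e} events per year")
print(f"Without alarm layer: {without_alarm:.1e} events per year")
```

With these assumed values, losing the alarm layer raises the event frequency tenfold. An alarm that operators routinely miss cannot honestly be credited at its assumed PFD, which is one reason alarm performance matters to SIL analysis.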
Education is the best remedy for improving, and maintaining, alarm system performance. Personnel across all areas of plant operation, including control room operators, field operators, process engineers, and instrument technicians, should be instructed in proper alarm management and then buy into the program. This is a proactive approach to alarms.
Some automation system suppliers conduct workshops at customer sites to help make alarm management efforts more fruitful. This training can begin with a general orientation for all plant stakeholders, followed by specific instruction according to job function using approved alarm philosophy documents. The workshop is a valuable tool for helping workers understand how they are expected to engineer, manage, and maintain their alarm system (see Figure 3).
Alarm management is imperative to assessing, improving, and optimizing process alarms, thereby increasing the effectiveness of the plant. Without an effective alarm program in place, nuisance alarms, alarm floods, and improperly prioritized alarms can lead to operator confusion, and thus increase the risk of accidents.
However, it is important to remember that alarm management is not a one-time project; it is a redesign/reengineering effort and a lifecycle process. Every new alarm should be designed according to how it fits into the process and the benefit it gives the operator. In this way, the performance of the alarm system is continuously improved and optimized.
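One recurring check that can support this kind of continuous improvement is ranking the alarms that annunciate most often, since a handful of nuisance alarms frequently account for much of the total load. The sketch below is an assumption-laden example: the tag names are invented, and `top_contributors` is a hypothetical helper, not part of any vendor tool.

```python
from collections import Counter


def top_contributors(tags, n=3):
    """Return (tag, count, share of total) for the n most frequent alarm tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return [(tag, c, c / total) for tag, c in counts.most_common(n)]


# Hypothetical alarm journal reduced to tag names, one entry per annunciation.
log = ["LI-101.HI"] * 6 + ["PI-204.LO"] * 3 + ["TI-330.HI"]

ranking = top_contributors(log, n=2)
for tag, count, share in ranking:
    print(f"{tag}: {count} alarms ({share:.0%} of total)")
```

Reviewing a list like this at a fixed interval gives operations a concrete starting point for deciding which alarms to rationalize, retune, or remove.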
Tyron Vardy is an alarm management consultant for Honeywell Process Solutions.
- An effective alarm management program is critical to a plant that operates safely and effectively.
- The fact that a specific plant or process unit has not had an accident in some period of time does not indicate an effective program.
- An effective program requires careful planning and ongoing evaluation following appropriate standards.
Read more about alarm management below.
1. EEMUA Publication 191, "ALARM SYSTEMS – A Guide to Design, Management, and Procurement"
2. ANSI/ISA-18.2-2009, "Management of Alarm Systems for the Process Industries"
3. Honeywell Process Solutions (2011), "Alarm Management Standards – Are You Taking Them Seriously?"
4. Honeywell Process Solutions (2013), "A Guide to Effective Alarm Management."