Implementing alarm management per the ANSI/ISA-18.2 standard
In process industries, alarm systems are used to notify operators and other plant personnel of abnormal process conditions or equipment malfunctions. Alarm systems help operators operate the process safely under both normal and abnormal conditions, and the alarm system needs to be designed correctly to provide the best opportunity for safe and efficient operation.
Before the wide adoption of distributed control systems (DCSs) and other PC-based human machine interfaces (HMIs), visual and audible indications of process plant operations were normally provided by a panel board, with the number of alarms restricted because of space limitations. In addition, alarm points had to be selected with care, because these points were hardwired and expensive to change.
But with a modern automation system, the number of alarms is virtually unlimited, as additions and changes are made simply by reconfiguring software. This ease of use provides the opportunity to improve alarm systems, but it can also make alarm management more challenging.
In particular, there is a temptation to alarm every possible deviation, even when the deviation doesn’t present a problem requiring immediate attention. In the event of a serious incident, this practice can generate a huge number of alarms simultaneously, commonly referred to as alarm flooding. When this occurs, operators may not be able to ascertain and act on the important alarm(s), causing the incident to escalate in terms of severity.
In the worst case, alarm flooding can cause serious environmental damage, production loss, injury, or even death to plant personnel. Proper management of alarm systems is essential to deal with alarm flooding and other related issues.
Poor alarm management can lead to serious consequences in process plants, as noted in the book “Alarm Management for Process Control” by Douglas H. Rothenberg, and by others in various documents and publications.
For example, poor alarm management caused one incident that resulted in $80 million in damage and injured 26 people. Another process plant incident resulted in 15 deaths, 170 injuries, and significant economic losses. To avoid these types of incidents, proper alarm management is essential.
To improve alarm management, the International Society of Automation (ISA) issued standard ANSI/ISA-18.2-2009, “Management of Alarm Systems for the Process Industries.” When issuing this standard, ISA considered other existing documents, including the Engineering Equipment and Materials Users’ Association (EEMUA) Publication 191, “Alarm Systems: A Guide to Design, Management and Procurement.” The International Electrotechnical Commission (IEC) is using ISA-18.2 as the basis for international alarm management standard IEC 62682.
This article gives an overview of ISA-18.2 and shows how it can be used to improve new and existing alarm systems in process plants.
Role of alarms
ISA-18.2 defines an alarm as “An audible and/or visible means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a response.” This means an alarm is more than a message or an event, as it indicates a condition demanding quick operator action.
Ideally, each alarm will provide the operator with related information such as priority, possible root cause, and a recommended response procedure. The operator can then respond to the alarm quickly and effectively. Limiting alarms, prioritizing alarms, and providing alarms with necessary related information can reduce the chance that an operator will delay response to an alarm, or even ignore the alarm.
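The related information an alarm carries can be pictured as a small record. The sketch below is illustrative only: the field names, the three priority levels, and the example tag are assumptions made for this example, not definitions taken from ISA-18.2.

```python
from dataclasses import dataclass
from enum import IntEnum


class AlarmPriority(IntEnum):
    # Hypothetical levels; a plant defines its own in its alarm philosophy.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class AlarmDefinition:
    tag: str                   # point identifier (made-up example below)
    condition: str             # abnormal condition being annunciated
    priority: AlarmPriority
    possible_cause: str
    recommended_response: str

    def has_operator_guidance(self) -> bool:
        """True when both a possible cause and a response are filled in."""
        return bool(self.possible_cause.strip()) and bool(
            self.recommended_response.strip()
        )


example = AlarmDefinition(
    tag="TI-101",
    condition="Reactor temperature high",
    priority=AlarmPriority.HIGH,
    possible_cause="Cooling water flow reduced",
    recommended_response="Check cooling water valve; reduce feed rate",
)
```

A check such as has_operator_guidance() could be run over a whole alarm list to find entries that would leave the operator without a cause or a response.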
What is alarm management?
Alarm management is the proper implementation of documentation, design, usage, and maintenance procedures to construct an effective alarm system. ISA-18.2 defines the processes and procedures required to create an effective alarm management system. Figure 1 shows the ISA-18.2 lifecycle model of alarm management. This model can be applied to a new or an existing alarm system.
As shown in Figure 2, stage activities logically follow one another, and correct completion of all activities will result in a properly designed and effectively operating alarm management system. The lifecycle model also includes stages for ongoing maintenance of the system, essential for sustaining effective operation.
The 10 stages in the lifecycle model can be roughly categorized into four general tasks. To perform these tasks, it’s essential that a process plant create a cross-functional team that includes all relevant plant functional areas including, but not limited to, management, engineering, safety, operations, and maintenance.
Task 1: Optimizing system design
This task encompasses lifecycle model stages A through E: philosophy, identification, rationalization, detailed design, and implementation. When properly executed, this task supports the design of an alarm system that prevents alarm flooding and other undesirable alarm system occurrences. It also provides operators with the information they need to take proper action when alarms occur.
An important activity within this task is to identify the causes of current nuisance alarms and to eliminate these alarms, or at least greatly reduce their frequency. This is an essential step toward reducing alarm flooding. In many cases, alarms can be reclassified as events to be recorded by the automation system for later review, instead of as items requiring immediate operator attention.
Once the total number of alarms has been reduced as much as possible, the next step is to prioritize the remaining alarms. Prioritization can be quite complex, as it requires plant personnel to identify possible abnormal operating conditions, list the alarms that might occur for each condition, and then prioritize these alarms. After alarms are reduced and prioritized, then recommended operator actions for each alarm can be created.
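The reclassification and prioritization steps described above can be sketched as a simple lookup. This is a minimal illustration, not a method prescribed by the standard: the severity and response-time categories, the matrix values, and the function name are all assumptions chosen for the example.

```python
# Hypothetical priority matrix: keys are (consequence severity, time the
# operator has to respond). Every value here is illustrative; a real plant
# would derive its own matrix during rationalization.
PRIORITY_MATRIX = {
    ("minor", "long"): "low",
    ("minor", "short"): "low",
    ("major", "long"): "medium",
    ("major", "short"): "high",
    ("severe", "long"): "high",
    ("severe", "short"): "high",
}


def rationalize(requires_operator_response, severity, time_to_respond):
    """Return 'event' when no operator action is needed, else a priority."""
    if not requires_operator_response:
        # Recorded by the automation system for later review, not annunciated.
        return "event"
    return PRIORITY_MATRIX[(severity, time_to_respond)]
```

For instance, rationalize(False, "minor", "long") returns "event", while rationalize(True, "major", "short") returns "high".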
Performing these and other steps as listed in the lifecycle model stages A through E will result in the creation of an effective alarm system.
Task 2: Advanced support to operators
This task encompasses lifecycle model stages F and G: operation and maintenance. An effective tool in implementing this task is the creation of alarm summary windows in the automation system (see Figure 3). Modern automation systems include the functionality required to create these windows for the process and its alarms.
An alarm summary window typically displays the list of currently active alarms. With most automation systems, the alarm summary window will provide sort, filter, shelving, and other functions to help improve display of information to the operators. These functions can be used to prevent higher priority alarms from being overlooked by operators.
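The sort, filter, and shelving functions of a summary window can be sketched in a few lines. This is a simplified model under stated assumptions: the ActiveAlarm fields, the convention that a larger priority number is more urgent, and the summary_view function are hypothetical and do not represent any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class ActiveAlarm:
    tag: str
    priority: int          # assumption: larger number means more urgent
    raised_at: float       # seconds since an arbitrary reference time
    shelved: bool = False  # temporarily hidden by the operator


def summary_view(alarms, min_priority=0):
    """Hide shelved and low-priority alarms, then list the most urgent first.

    Ties in priority are broken by age, oldest first.
    """
    visible = [a for a in alarms if not a.shelved and a.priority >= min_priority]
    return sorted(visible, key=lambda a: (-a.priority, a.raised_at))


alarms = [
    ActiveAlarm("FI-1", priority=1, raised_at=10.0),
    ActiveAlarm("TI-2", priority=3, raised_at=20.0),
    ActiveAlarm("PI-3", priority=3, raised_at=5.0),
    ActiveAlarm("LI-4", priority=2, raised_at=1.0, shelved=True),
]
ordered_tags = [a.tag for a in summary_view(alarms)]
```

With this ordering the two priority-3 alarms appear at the top of the list, and the shelved alarm is left out entirely.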
Each alarm will require some type of response from the operator. Advanced alarm summary windows display what sequence of actions should be performed by the operator in response to particular alarms (see Figure 4). An effective method for displaying these actions is a flow chart, which can be used to guide the operator through the response sequence.
A flow chart can be a very effective tool because it can contain if/then instructions, guiding operators to take different actions depending on how the process responds to operator actions and other conditions.
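Such a flow chart can be represented as a small decision tree. The sketch below uses a made-up response procedure for a hypothetical high-temperature alarm; the questions, actions, and the walk function are illustrative assumptions only.

```python
# Each node is either a question with "yes"/"no" branches or a final action.
# The content is invented for illustration.
RESPONSE_FLOW = {
    "question": "Is cooling water flow normal?",
    "yes": {
        "question": "Is feed rate above setpoint?",
        "yes": {"action": "Reduce feed rate"},
        "no": {"action": "Call shift supervisor"},
    },
    "no": {"action": "Open cooling water valve"},
}


def walk(node, answers):
    """Follow yes/no answers through the chart.

    Returns the list of (question, answer) pairs visited and the final action.
    """
    path = []
    remaining = iter(answers)
    while "action" not in node:
        answer = next(remaining)
        path.append((node["question"], answer))
        node = node[answer]
    return path, node["action"]
```

Calling walk(RESPONSE_FLOW, ["yes", "no"]) follows both questions and ends at "Call shift supervisor", while walk(RESPONSE_FLOW, ["no"]) ends immediately at "Open cooling water valve".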
Task 3: Performance evaluation
This task corresponds to lifecycle model stage H, monitoring and assessment. The purpose of this task is to evaluate the performance of the existing alarm system. In ISA-18.2, key performance indicators (KPIs) are suggested as a useful tool to perform the activities in this stage.
An example KPI is the number of alarms presented to the operator within a fixed period of operation. As shown in Figure 5, ISA-18.2 lists, for various time periods, the number of alarms that is very likely to be acceptable and the maximum number that is manageable.
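An alarm-rate KPI of this kind can be computed directly from alarm timestamps. The sketch below counts alarms in fixed windows and compares each count with two thresholds; the function names are hypothetical, and the threshold values in the example are placeholders that should be replaced with the figures the standard lists.

```python
from collections import Counter


def alarms_per_window(timestamps, window_seconds=600):
    """Count alarms in consecutive fixed windows.

    timestamps are seconds measured from the start of the evaluation period.
    Windows with no alarms are simply absent (a Counter reports them as 0).
    """
    return Counter(int(t // window_seconds) for t in timestamps)


def classify_rate(count, acceptable, manageable):
    """Compare one window's count with the two thresholds."""
    if count <= acceptable:
        return "acceptable"
    if count <= manageable:
        return "manageable"
    return "exceeds manageable"


# Placeholder thresholds per 10-minute window, for illustration only.
ACCEPTABLE, MANAGEABLE = 1, 2

counts = alarms_per_window([30, 200, 650, 1300, 1310, 1320])
report = {w: classify_rate(n, ACCEPTABLE, MANAGEABLE) for w, n in sorted(counts.items())}
```

Here the first window holds two alarms, the second one, and the third three, so the report labels them "manageable", "acceptable", and "exceeds manageable" respectively.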
The alarm system will provide a host of data that can be used to evaluate performance including, but not limited to, alarm frequency, operator response time, and specific operation actions. This data can be used to improve the alarm system, and to provide more effective operator training.
It is often useful to evaluate alarm system data from several viewpoints. For instance, the number of alarms in each area and the number of hourly alarms are both important data points, and can be evaluated separately or together. By using these kinds of data, the conditions in which operator errors frequently occur can be identified, and these results can be used to improve operator response.
Modern automation systems can be configured to collect a host of data concerning alarm system performance, and they provide tools for creating reports that are particularly useful for evaluating that performance (see Figure 6). The data can be presented to plant personnel in a variety of formats, from simple KPIs to charts and graphs. Using this information, alarm system performance can be evaluated and improved.
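Evaluating alarms by area, by hour, or by both together amounts to grouping the same records in different ways. A minimal sketch follows, assuming each alarm is logged as an (area, hour of day) pair; the record layout, area names, and function name are hypothetical.

```python
from collections import Counter


def count_by_area_and_hour(records):
    """Group alarm records three ways.

    records: iterable of (area, hour_of_day) pairs, one pair per alarm.
    Returns counters keyed by area, by hour, and by (area, hour).
    """
    records = list(records)  # allow generators; the data is iterated three times
    by_area = Counter(area for area, _ in records)
    by_hour = Counter(hour for _, hour in records)
    by_both = Counter(records)
    return by_area, by_hour, by_both


log = [("boiler", 2), ("boiler", 2), ("reactor", 2), ("boiler", 14)]
by_area, by_hour, by_both = count_by_area_and_hour(log)
```

In this made-up log the boiler area and the 02:00 hour each account for three alarms, and the combined view shows that two of them coincide.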
Task 4: Continuous improvement
This task encompasses lifecycle model stages I and J, management of change and audit. Continuous alarm system improvement is supported by performing uniform management of the enormous amount of alarm-related data typically contained in an alarm master database.
For example, alarm system design parameters can be compared with actual alarm system performance figures stored in the database. When significant discrepancies exist between design parameters and actual data, then corrective action can be recommended. Recommended corrective actions can then be reviewed and implemented as required using a comprehensive change management procedure.
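The comparison of design parameters with actual performance can be sketched as a tolerance check. The example below assumes the database can supply an expected and an observed alarm rate per tag; the data layout, the 20% default tolerance, and the function name are illustrative assumptions.

```python
def find_discrepancies(design, actual, tolerance=0.2):
    """Flag tags whose observed alarm rate deviates from the design value.

    design, actual: dicts mapping tag -> alarms per day (hypothetical layout).
    tolerance: allowed fractional deviation from the design value.
    Returns {tag: (expected, observed)} for every flagged tag.
    """
    flagged = {}
    for tag, expected in design.items():
        observed = actual.get(tag, 0.0)
        if expected == 0:
            # An alarm designed never to occur is flagged if it occurs at all.
            deviates = observed > 0
        else:
            deviates = abs(observed - expected) / expected > tolerance
        if deviates:
            flagged[tag] = (expected, observed)
    return flagged


design = {"TI-101": 2.0, "PI-205": 1.0, "LI-310": 0.0}
actual = {"TI-101": 2.2, "PI-205": 5.0, "LI-310": 1.0}
to_review = find_discrepancies(design, actual)
```

The flagged entries would then feed the change management procedure rather than being corrected automatically.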
Alarm management entry points
For most alarm systems, there are three typical entry or starting points for creating an alarm management system. Referring to the ISA-18.2 lifecycle model, these points are stage A (philosophy), stage H (monitoring and assessment), and stage J (audit).
For new process plants, philosophy is the preferred point of entry. For existing plants, either monitoring and assessment or audit is preferred. In terms of the four tasks listed above, task 1 is the preferred point of entry for new process plants, and task 3 or task 4 is the preferred point of entry for existing plants.
For existing plants, the performance of the existing alarm system is evaluated using recent plant operating data. Actions are then taken based on the evaluation. This course of action allows effective existing practices and procedures to remain in place, while pinpointing areas that require improvement.
For new plants, it is recommended that the lifecycle model be followed in its entirety, starting with task 1. This will ensure that all necessary steps are taken to implement an effective alarm management system.
Proper alarm management is indispensable for achieving safe and secure process plant operation. This article introduced and explained the approach to alarm management standardized by ISA-18.2, and then summarized it into four general tasks. Following this approach will result in an optimal alarm management system that prevents minor alarms and upsets from escalating into serious incidents.
Marcus Tennant is principal systems architect for Yokogawa.
- A plant with poorly designed alarms can create confused operator responses resulting in many types of hazardous situations.
- Available standards offer a systematic approach to analysis and design of alarm management programs.
- With proper design, an alarm system can deliver useful plant performance data and guide improvements.