A rational approach to alarm rationalization

While it may not be your favorite activity, thoughtful alarm rationalization pays major operational dividends in the long run and will keep your operators happier.

By John E. Bogdan and Susan F. Booth, April 3, 2012

Alarm rationalization. Just the phrase is enough to cause managers to groan and potential rationalization team members to run for cover. Their reaction is not without merit. A typical alarm system is a morass of poorly thought-out alarms with little or no documentation, and the task of bringing such a system into alignment with a plant or pipeline’s operating philosophy is a daunting one.

Why rationalize?

Most would agree that current alarm systems are broken, badly broken. There are far too many alarms in a typical system. Alarms are often in place simply because they came configured when the control system was installed. Some even came with arbitrary setpoints already designed. Does 90, 80, 20, 10 sound familiar? Setpoints, in general, are not related to realistic conditions. Priority has not been determined through a systematic analysis of consequences and time to respond. In most operations, about 80% of all alarms have been prioritized as high priority; clearly the alarm system is not close to the recommended alarm priority distribution. To top it all off, documentation is scarce and spotty.
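One quick way to see how far a system sits from its recommended priority distribution is to count alarms by priority and compare the shares against the target. The following Python sketch does that. The target shares (80% low, 15% medium, 5% high) are an assumption for illustration; use the figures in your own alarm philosophy document. The function names and the sample data are hypothetical.

```python
from collections import Counter

# Assumed target shares, for illustration only. Take the real figures
# from your own alarm philosophy document.
TARGET_SHARE = {"low": 0.80, "medium": 0.15, "high": 0.05}


def priority_shares(priorities):
    """Return the fraction of alarms configured at each priority level."""
    if not priorities:
        raise ValueError("no alarms to analyze")
    counts = Counter(priorities)
    total = len(priorities)
    return {level: counts.get(level, 0) / total for level in TARGET_SHARE}


def deviations(priorities):
    """Return actual share minus target share for each priority level."""
    shares = priority_shares(priorities)
    return {level: shares[level] - TARGET_SHARE[level] for level in TARGET_SHARE}


# Hypothetical system in which about 80% of alarms are high priority.
existing = ["high"] * 80 + ["medium"] * 15 + ["low"] * 5
for level, delta in deviations(existing).items():
    print(f"{level:>6}: {delta:+.0%} versus target")
```

A large positive deviation for high priority, as in this sample, suggests that priority was never tied to consequences and time to respond.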

For many, pressure to meet regulatory requirements is the driving force behind rationalization and the rest of alarm system redesign. For those who are not (yet) under the gun of new regulations, optimizing the alarm system will improve operator effectiveness and yield significant improvements in safety and productivity. Weighed against the cost of a potential incident, the cost of this effort can be readily justified.

What is rationalization? 

Alarm rationalization, also called documentation and rationalization (D&R), is the procedure used to determine the optimum alarm set to be included in an alarm system. This is the set that will consistently deliver the right alarm to the right operator at the right time with the right importance and the right information to correct or mitigate the undesirable situation. During rationalization, a multidisciplinary team reviews and evaluates the operation and decides what possible undesirable circumstances could arise that would justify an alarm according to the criteria set forth in the alarm philosophy document (APD). The team also performs the preliminary design of each alarm, including the priority, setpoint, and other alarm attributes. They document all this information in a master alarm database (MADb).
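To make the idea of a MADb entry concrete, here is a minimal Python sketch of one record. The field names are our own assumptions rather than a standard schema; the article specifies only that priority, setpoint, and other alarm attributes are documented. The setpoint is deliberately left unset because it comes out of the team's discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmRecord:
    """One rationalized alarm. All field names are illustrative assumptions."""
    tag: str                       # instrument tag, e.g. "LT002"
    alarm_type: str                # e.g. "high", "low"
    priority: str                  # assigned per the APD's priority rules
    setpoint: Optional[float]      # None until the team has determined it
    cause: str                     # what leads to the undesirable situation
    consequence: str               # what happens if nobody responds
    operator_action: str           # how the operator corrects or mitigates it
    minutes_to_respond: Optional[float] = None

    def is_documented(self) -> bool:
        """True when the fields that justify the alarm are all filled in."""
        fields = (self.cause, self.consequence, self.operator_action)
        return all(text.strip() for text in fields)


def add_record(madb: dict, record: AlarmRecord) -> None:
    """Store a record keyed by (tag, alarm_type), rejecting duplicates."""
    key = (record.tag, record.alarm_type)
    if key in madb:
        raise ValueError(f"duplicate alarm {key}")
    madb[key] = record


madb = {}
add_record(madb, AlarmRecord(
    tag="LT002", alarm_type="high", priority="high", setpoint=None,
    cause="inlet valve fails to close",            # hypothetical entry
    consequence="tank overflows",                  # hypothetical entry
    operator_action="close the manual block valve",  # hypothetical entry
))
print(madb[("LT002", "high")].is_documented())
```

Rejecting duplicate keys and flagging records with no documented action are two simple checks that keep the database from drifting back toward the undocumented state described earlier.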

(For more details on the basics of rationalization, please refer to Managing Alarms Using Rationalization, Control Engineering, March 10, 2011.)

Pitfalls of the common approach to rationalization

Rationalization is time-intensive and requires significant personnel resources, so attacking it with an effective strategy is imperative. A common approach is to start from the existing MADb and review every possible alarm made available by the control system, configured or not. Candidate alarms that meet the APD criteria are included in the optimized alarm system. This approach has the seeming advantage of giving the team a framework from which to start, but it has several drawbacks.

1. It is time-consuming. ISA suggests in its Alarm Management Class IC39C that 100 to 200 alarms per day is a good pace for rationalization, and that 300 to 400 alarms per day is possible with good pre-work. Therefore, rationalization of a small to medium system containing about 10,000 alarms would require a minimum of 25 days, even at the top pace of 400 alarms per day. With ineffective techniques or staffing, rationalization has been known to drag on for months.

2. It is mind-numbing. The common practice has the team cloistered in a room staring at the existing MADb. Point-by-point (or group of points by group of points), the team runs through the same set of questions for each candidate alarm to determine if it is to be included in the new system. The sheer amount of garbage to cull and the focus on details instead of the larger picture result in boredom, inattentiveness, and an occasional nap.

3. Most importantly, the result is often not the optimum system. It is possible, even likely, to miss necessary alarms. An inherent problem in a review process is that it is easy to overlook something that should be part of the system but was not part of the original design. If it is not there, it is not reviewed. Another drawback is that it can be tempting to accept original choices rather than take the time to evaluate whether better options might be available. Yet another is that elimination of unnecessary alarms may not be as thorough as possible. It can be tempting to retain questionable alarms rather than research them more completely to be sure. The response, “It’s only one alarm,” is heard more than once, and those alarms add up!
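The time estimate in the first drawback follows from simple division. The short Python sketch below reproduces it for the 10,000-alarm system and the pace figures quoted from the ISA class. The function name is ours, and rounding up to whole working days is an assumption.

```python
import math


def rationalization_days(alarm_count: int, alarms_per_day: int) -> int:
    """Whole working days needed to review alarm_count alarms at a given pace."""
    if alarms_per_day <= 0:
        raise ValueError("alarms_per_day must be positive")
    return math.ceil(alarm_count / alarms_per_day)


for pace in (100, 200, 300, 400):
    print(f"{pace} alarms/day -> {rationalization_days(10_000, pace)} days")
# 100 alarms/day -> 100 days
# 200 alarms/day -> 50 days
# 300 alarms/day -> 34 days
# 400 alarms/day -> 25 days
```

At the "good pace" of 100 to 200 alarms per day, the same system takes 50 to 100 working days, which is how a review can stretch into months.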

The review approach can work if the team keeps its focus on identifying potential undesirable situations, rather than on checking off alarms. However, the massive MADb is the framework, and we know it represents a broken system. Why start building a critical system from such a faulty foundation?

A more rational approach

An alternative to the common approach is one similar to that required for a new plant or operation in which no alarm system exists to be reviewed. This “clean slate” approach focuses on identifying undesirable situations (which is what we are really interested in), determining the best ways to detect them, and designing alarms to do the job. A general description of the procedure is:

Step 1. Divide the process into small, manageable units.
Step 2. Identify common or similar elements.
Step 3. For each unit, or group of common elements:
a. Identify events with undesirable or negative consequences.
b. Determine the best way to detect these.
c. Design the alarms.
d. Examine the interconnections between units to see if these boundaries introduce any events with undesirable consequences. If so, determine the best way to detect them and design the alarms for them.

The result upon completing these three steps for the entire process is the preliminary set of alarms. To this set, add any required alarms, that is, alarms mandated by an external agency (e.g., legal requirement, environmental permit, warranty), following the procedures outlined in the APD.

Step 4. Check the preliminary set of alarms against the existing MADb. There are four cases to be considered, the first three of which are readily resolved.

Table 1 – Comparison of new set of alarms vs. existing set in MADb

In Case 4, it is likely that the candidate alarm was unnecessary and should have been eliminated. However, it is possible that an undesirable event or required alarm was missed in the new rationalization. If so, the team should design the appropriate alarm and include it in the system, or add the required alarm.
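The Step 4 check amounts to a set comparison, which can be sketched in Python as follows. The text identifies Case 4 as an alarm that exists in the old MADb but not in the new set. The other three buckets shown here (identical in both, present in both with a different design, present only in the new set) are our assumption about the remaining cases, and the tags and attribute values are hypothetical.

```python
def compare_alarm_sets(new_set: dict, existing_madb: dict) -> dict:
    """Sort alarms into four buckets.

    Both arguments map (tag, alarm_type) to a dict of design attributes.
    """
    result = {
        "match": [],           # in both, same design
        "design_differs": [],  # in both, design changed
        "new_only": [],        # found by the clean-slate review only
        "existing_only": [],   # Case 4: in the old MADb only
    }
    for key in sorted(set(new_set) | set(existing_madb)):
        if key in new_set and key in existing_madb:
            same = new_set[key] == existing_madb[key]
            bucket = "match" if same else "design_differs"
        elif key in new_set:
            bucket = "new_only"
        else:
            bucket = "existing_only"
        result[bucket].append(key)
    return result


# Hypothetical tags and attributes, for illustration only.
new_set = {
    ("TAG_A", "high"): {"priority": "medium"},
    ("TAG_B", "low"): {"priority": "high"},
    ("TAG_C", "low"): {"priority": "low"},
}
existing_madb = {
    ("TAG_A", "high"): {"priority": "medium"},
    ("TAG_B", "low"): {"priority": "low"},
    ("TAG_D", "high-high"): {"priority": "high"},
}
for bucket, keys in compare_alarm_sets(new_set, existing_madb).items():
    print(bucket, keys)
```

Only the `existing_only` bucket needs team discussion, to confirm that each alarm in it is truly unnecessary rather than evidence of a missed event.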

The result upon completion of this procedure is the optimized set of alarms for the operation or process.

Example of rationalizing from a clean slate

The following example is deliberately simplified to illustrate the concepts. Figure 1 shows a water treatment and distribution process that treats raw water, further conditions it as necessary, and then distributes it to a number of customers.

Figure 1 – Overview of water treatment and distribution

This process is too large to be considered in its entirety and should be broken into manageable units. Most processes can be divided logically into workable units.

Following Step 1 in the rationalization procedure described above, three units were broken out from the example process and are detailed in Figure 2.

Figure 2 – Detail of heating, water treatment, and vent header units

This example will focus on the heating unit and its connections to other units.

The heating unit takes treated water, heats it in a batch cycle, and then transfers it to a customer. The automated heating cycle is described in Table 2.

Table 2 – Description of batch cycle for heating sub-unit

In the existing design, the tank pressure and level are measured by PT001 and LT002 respectively. The water temperature is measured in the pump suction line by TT003.

Step 2 is to identify common or similar elements in the process. For example, just in Figure 2, there are multiple tanks and pumps, all in similar service. Therefore, you can expect them to be subject to similar events that might result in similar undesirable consequences. Using this commonality and applying the same line of reasoning to many elements at once can allow you to save significant time and effort.

Step 3 is to identify events in the heating unit with undesirable or negative consequences and design alarms to detect them. For this example, we will discuss two events:

• Loss of containment. The discussion reveals that there are two causes for this. They will be rationalized separately.
• Pump damage due to cavitation.

Tables 3 and 4 summarize the rationalization of the alarms associated with these two events.

Table 3 – Loss of containment rationalization

Table 4 – Pump damage rationalization
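For readers unfamiliar with NPSH, the Python sketch below shows one way the available net positive suction head could be estimated from the three measurements in this unit: tank pressure (PT001), level (LT002), and suction temperature (TT003). This is an illustration under stated assumptions, not the calculation used in the example. It assumes absolute pressure, a constant water density, a commonly tabulated Antoine correlation for water vapor pressure between roughly 1 and 100 degrees C, and hypothetical values for friction loss, required NPSH, and margin.

```python
G = 9.80665            # standard gravity, m/s^2
MMHG_TO_PA = 133.322   # pascals per mmHg


def water_vapor_pressure_pa(temp_c: float) -> float:
    """Vapor pressure of water from an Antoine correlation (about 1 to 100 C)."""
    p_mmhg = 10 ** (8.07131 - 1730.63 / (233.426 + temp_c))
    return p_mmhg * MMHG_TO_PA


def npsh_available_m(tank_pressure_pa_abs: float, temp_c: float,
                     static_head_m: float, friction_loss_m: float = 0.0,
                     density_kg_m3: float = 1000.0) -> float:
    """Available NPSH in meters of liquid at the pump suction.

    Density is held constant for simplicity; hot water is somewhat less dense.
    """
    vapor_pa = water_vapor_pressure_pa(temp_c)
    pressure_head = (tank_pressure_pa_abs - vapor_pa) / (density_kg_m3 * G)
    return pressure_head + static_head_m - friction_loss_m


def npsh_low_alarm(available_m: float, required_m: float, margin_m: float) -> bool:
    """True when available NPSH falls below required NPSH plus a safety margin."""
    return available_m < required_m + margin_m


# Hypothetical operating point: atmospheric tank, 80 C water, 2 m of level.
available = npsh_available_m(101_325.0, 80.0, static_head_m=2.0, friction_loss_m=0.5)
print(npsh_low_alarm(available, required_m=3.0, margin_m=1.0))
```

The sketch shows why a level measurement alone is a weak proxy for pump protection: as the water heats up, vapor pressure rises and the available NPSH shrinks even when the level is unchanged.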

Next, Step 3d requires examining the boundaries between units. This results in two more events that need to be discussed:

1. Product does not meet contractual temperature requirements. Discussion reveals that there are two causes for this. They will be rationalized separately.
2. Failure to deliver water on time.

Tables 5 and 6 summarize the rationalization of the alarms associated with these two events.

Table 5 – Out-of-spec product rationalization

Table 6 – Failure to deliver product rationalization

The results of the above rationalization discussions are summarized in Table 7.

Table 7 – Summary of rationalization results

Table 8 is a comparison of the existing alarm system and the results of the rationalization discussion.

Table 8 – Comparison of existing alarm configuration to rationalization results

There are significant differences between the existing alarm system and the rationalization results:

1. The high-high absolute alarms for PT001 and LT002 in the existing alarm system should be eliminated because they duplicate the respective high alarms.
2. The existing alarm system did not include an alarm on high pressure or level in standby mode, but should have.
3. The high absolute alarms for PT001 and LT002 were removed from the batch logic and placed in the continuous logic.
4. The low absolute alarm for LT002 in the existing alarm system, presumably for pump protection, should be eliminated and replaced by a new low absolute alarm on NPSH.
5. The high-high absolute alarm for TT003 in the existing alarm system should be eliminated because it is a duplicate of the high alarm.
6. The high absolute alarm for TT003 in the fill step in the existing alarm system should be eliminated because no operator action is required, which makes it unnecessary.
7. The high and low absolute alarms for TT003 in the heat and drain steps were modified to be suppressed if the pump P1 was not running.
8. The existing alarm system did not include the NPSH or time-in-step alarms, but should have.
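The state-based behavior in items 6 and 7 can be sketched in Python as shown below. The step names come from the batch cycle discussed above; the function name and the setpoint values are hypothetical. Presumably the suppression exists because TT003 sits in the pump suction line and is not representative without flow, but that reasoning is our assumption.

```python
from typing import Optional

# Steps in which the TT003 temperature alarms are active (items 6 and 7).
ALARMED_STEPS = {"heat", "drain"}


def tt003_alarm(temp: float, low_sp: float, high_sp: float,
                step: str, pump_p1_running: bool) -> Optional[str]:
    """Return "high", "low", or None for the TT003 temperature alarms.

    Alarms are suppressed outside the heat and drain steps and
    whenever pump P1 is not running.
    """
    if step not in ALARMED_STEPS or not pump_p1_running:
        return None
    if temp > high_sp:
        return "high"
    if temp < low_sp:
        return "low"
    return None


# Hypothetical setpoints, for illustration only.
LOW_SP, HIGH_SP = 40.0, 60.0
print(tt003_alarm(65.0, LOW_SP, HIGH_SP, step="heat", pump_p1_running=True))
print(tt003_alarm(65.0, LOW_SP, HIGH_SP, step="heat", pump_p1_running=False))
print(tt003_alarm(65.0, LOW_SP, HIGH_SP, step="fill", pump_p1_running=True))
```

Keeping the suppression condition in one place makes it easy to document in the MADb alongside the setpoints.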

Conclusion

It may appear that this clean-slate approach to rationalization will be even more time-consuming than the commonly used review process since it also includes a check using the old alarm system. However, our experience shows that this approach actually ends up greatly reducing the time required. We have been able to rationalize an average of more than 500 alarms per day.

As in the review approach, significant gain is achieved by identifying similar elements and copying alarm designs across them. We have found that it is easier to identify common elements when focusing on the big picture rather than the details, which magnifies this gain. The largest time savings, however, come from not having to deal with thousands of candidate alarms that never should have been alarms in the first place. Comparing the results of the clean-slate rationalization against the old MADb is much faster than tackling thousands of poorly designed alarms. The wheat has already been separated; all that is left is to blow the chaff away. Not having to wade through mountains of chaff also reduces the mind-numbing aspects of rationalization. The team is more likely to be actively engaged in the process, which results in a more thoughtful analysis. The clean-slate approach can also reduce manpower demands because the work can be divided more easily, which reduces meeting time.

Most importantly, this approach is more likely to result in the optimum design for your alarm system. Why not take a rational approach to rationalization?

John Bogdan is principal consultant for J Bogdan Consulting. Reach him at john.bogdan@jbogdanconsulting.com. Susan Booth is a consultant and technical writer for J Bogdan Consulting. Reach her at susan.booth@jbogdanconsulting.com.

Additional reading:

www.jbogdanconsulting.com

Managing Alarms Using Rationalization, Control Engineering, March 10, 2011, https://bit.ly/i1ytyF

ISA Alarm Management Class IC39C, www.isa.org

Pump School – Net Positive Suction Head, https://www.pumpschool.com/applications/NPSH.pdf