Evaluating a process safety system
Suppose your company recently purchased an existing process plant. How can you examine and evaluate the process safety systems already in place? Are the systems really capable of providing enough safety? Do they meet current standards? Is there a risk of noncompliance with OSHA or other regulations? These questions can be confusing and complex, but there is an answer, and it may bring not only improved safety but also cost savings.
Functional safety standards, beginning with ISA 84.01 in 1996, have described an engineering process for safety system analysis, design, and maintenance called the safety lifecycle (SLC). The most recent version is described in detail in IEC 61511:2003 (also known as ISA 84.00.01-2004). The guiding principles are:
- Risk analysis resulting in risk reduction requirements for a safety system
- Performance-based design evaluation, and
- A maintenance process to maintain the safety system.
This performance-based approach does not provide a set of detailed design rules, but instead a framework to determine what level of safety is really needed and what level of safety has been achieved by any given design. This is exactly the framework needed to evaluate an existing process safety system.
The SLC consists of three phases: analysis, realization, and maintenance. In the analysis phase, shown in Figure 1, the need for safety equipment is determined. An owner/operator will analyze one or more process units to identify potentially dangerous conditions called hazards. As each hazard is identified, the consequences (How bad could it be?) and the likelihood (How often might it happen?) are estimated or analyzed. The combination of likelihood and consequences results in an estimate of risk for each hazard. This part of the process is called a process hazard analysis (PHA) and has been a common practice for several decades. The most common technique used for a PHA is the hazard and operability study, or HAZOP.
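As a rough illustration of how likelihood and consequence combine into a risk estimate, the sketch below ranks a few hazards. It assumes risk is the simple product of event frequency and consequence, which is one common convention rather than anything the standards mandate. The hazard names, frequencies, and consequence values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    name: str
    frequency_per_year: float  # likelihood: how often might it happen?
    consequence: float         # severity: how bad could it be? (company-defined unit)

    @property
    def risk(self) -> float:
        # Assumption: risk is the product of likelihood and consequence.
        return self.frequency_per_year * self.consequence


# Hypothetical entries such as might come out of a PHA.
hazards = [
    Hazard("Tank overfill", 0.5, 50_000),
    Hazard("Reactor overpressure", 0.1, 1_000_000),
    Hazard("Pump seal leak", 0.2, 10_000),
]

# Highest estimated risk first.
ranked = sorted(hazards, key=lambda h: h.risk, reverse=True)

for h in ranked:
    print(f"{h.name}: estimated risk {h.risk:,.0f} per year")
```

Many companies use a qualitative risk matrix instead of a numeric product; the ranking idea is the same.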
In a similar manner, you would perform an analysis of the cyber-security threats that must be addressed in your safety system design. ISA 99.00.01-2007 gives guidance to help your team define the risks and identify ways to address those inherent in your process plant’s design. You will design an implementation model (framework) using zones (groupings of equipment, network devices, processes, and data) and conduits (the methods used to communicate between zones).
Any well-designed risk assessment methodology should include the following elements:
1. Determining the assets that need to be protected (people, processes, equipment, information, chemicals, etc.)
2. Determining the consequence of a compromise for each of the assets (loss of production, health/safety impact, environmental impact, etc.)
3. Determining the vulnerability of those assets, taking into account the anticipated safeguards
4. Determining the threats to those assets (theft, misuse, damage, system malfunction, etc.), and
5. Calculating the residual risk.
You will perform a high-level risk assessment using ISA 99.02.01-2009, looking at entry points via a conduit into a zone. More detailed analyses may be done to determine which zones may be attacked, assuming that the first zone has been compromised. In addition, you will consider possible compromise situations and envision mitigations or protections for those compromises.
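The question of which zones may be attacked once a first zone is compromised can be framed as a reachability search over the zone and conduit model. The sketch below is only an illustration: the zone names and the conduit map are hypothetical, and conduits are treated as one-way links, which is a simplifying assumption.

```python
from collections import deque

# Hypothetical conduit map: each zone lists the zones it can communicate with.
conduits = {
    "enterprise": ["dmz"],
    "dmz": ["control"],
    "control": ["safety"],
    "safety": [],
}


def reachable_zones(conduits, compromised):
    """Breadth-first search: zones reachable from `compromised`, with hop counts."""
    hops = {compromised: 0}
    queue = deque([compromised])
    while queue:
        zone = queue.popleft()
        for neighbor in conduits.get(zone, []):
            if neighbor not in hops:
                hops[neighbor] = hops[zone] + 1
                queue.append(neighbor)
    del hops[compromised]  # report only the zones beyond the starting point
    return hops


exposed = reachable_zones(conduits, "dmz")
print(exposed)
```

The hop count gives a crude sense of how many conduits an attacker must cross, and therefore where mitigations might be layered.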
With the information from the HAZOP, the estimated risk for each listed hazard is compared to the tolerable risk criteria established by the company. The risk reduction factor is the ratio of the estimated inherent risk to the tolerable risk. If no risk reduction is required, no safety system protection equipment is needed. The protection equipment for a particular hazard is called a safety instrumented function (SIF). The results of the risk comparison show where each SIF is needed and how many SIFs are needed according to the latest tolerable risk criteria. This is documented in the safety requirement specification (SRS).
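The risk comparison can be sketched in a few lines. The function below computes the risk reduction factor (RRF) as defined above and maps it to a safety integrity level using the low-demand RRF bands of IEC 61511 (SIL 1: 10 to 100, SIL 2: 100 to 1,000, and so on). The function names, the handling of band edges, and the example frequencies are assumptions made for illustration.

```python
from typing import Optional

# (SIL, upper bound of its RRF band). SIL 0 here means "below SIL 1".
SIL_BANDS = ((0, 10), (1, 100), (2, 1_000), (3, 10_000), (4, 100_000))


def risk_reduction_factor(inherent_risk: float, tolerable_risk: float) -> float:
    """Ratio of the estimated inherent risk to the tolerable risk."""
    if tolerable_risk <= 0:
        raise ValueError("tolerable risk must be positive")
    return inherent_risk / tolerable_risk


def required_sil(rrf: float) -> Optional[int]:
    """Return None if no risk reduction is needed, otherwise the SIL band."""
    if rrf <= 1:
        return None  # inherent risk is already tolerable; no SIF needed
    for sil, upper in SIL_BANDS:
        if rrf <= upper:
            return sil
    raise ValueError("required RRF is beyond SIL 4; reconsider the process design")


# Hypothetical hazard: 0.05 events/year against a tolerable 0.0001 events/year.
rrf = risk_reduction_factor(inherent_risk=0.05, tolerable_risk=0.0001)
target = required_sil(rrf)
print(f"RRF about {rrf:.0f}, target SIL {target}")
```

A company SIL selection procedure may credit other protection layers before assigning the remainder to a SIF; that step is omitted here.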
The next phase of the SLC, as shown in Figure 2, covers the conceptual and detailed design, which is done one SIF at a time. During conceptual design, the equipment for each SIF is chosen and justified. Any needed redundancy of equipment is planned, and the proof test methods and time intervals are established. Lastly, a performance analysis of the proposed design is done via calculations and checklist comparisons. If the conceptual design does not meet the requirements, modifications are made until the requirements are met. If there are several ways to meet requirements, the optimal way is chosen. Then the design team moves on to the next SIF. When all conceptual designs are finished, the detailed design work is completed and documented. Often a factory acceptance test is then done. The cyber-security posture of the system should also be reviewed in light of each proposed SIF.
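To give a feel for the performance calculations, the sketch below uses widely published simplified equations for the average probability of failure on demand (PFDavg): a single device (1oo1) and a redundant pair (1oo2) with a common-cause fraction beta. These approximations ignore diagnostics, repair time, and imperfect proof tests, so they are a teaching aid and not a substitute for a full analysis. The failure rate, test interval, and beta value are hypothetical.

```python
HOURS_PER_YEAR = 8760


def pfd_avg_1oo1(lambda_du: float, test_interval_h: float) -> float:
    """Simplified PFDavg for one device: lambda_DU * TI / 2."""
    return lambda_du * test_interval_h / 2


def pfd_avg_1oo2(lambda_du: float, test_interval_h: float, beta: float = 0.0) -> float:
    """Simplified PFDavg for a redundant pair with common-cause fraction beta."""
    independent = ((1 - beta) * lambda_du * test_interval_h) ** 2 / 3
    common_cause = beta * lambda_du * test_interval_h / 2
    return independent + common_cause


# Hypothetical device: 2e-6 dangerous undetected failures/hour, tested yearly.
single = pfd_avg_1oo1(2e-6, HOURS_PER_YEAR)
redundant = pfd_avg_1oo2(2e-6, HOURS_PER_YEAR, beta=0.05)

print(f"1oo1: PFDavg {single:.2e}, RRF about {1 / single:.0f}")
print(f"1oo2: PFDavg {redundant:.2e}, RRF about {1 / redundant:.0f}")
```

Comparing the two results against the required risk reduction shows whether redundancy is needed or whether a single device with a suitable proof test interval is enough.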
The factory acceptance test should include a very detailed and complete cyber-security penetration test. It is very difficult to test for cyber-security vulnerabilities on a live system. The factory acceptance test and, later, the site acceptance test are the ideal times to ensure that the complete system is cyber-secure, in that all risks are either mitigated or at the level of risk that the system is designed to accommodate.
Installation and commissioning are done at the beginning of the third phase of the safety lifecycle, as shown in Figure 3. When all the commissioning tests are complete, there is an audit to verify that all safety documentation has been completed and that all SIFs are designed, tested, and commissioned according to the original requirements. As part of this safety validation, all procedures and documentation needed for maintenance are completed. A cyber-security site acceptance test should also be part of that process, as cyber threats now represent safety system risk. Once the safety system is ready, the process may begin operation. During ongoing operation of the process, there is regular proof testing per the schedules established during the SIF performance evaluation. It is important to keep good records of the as-found conditions during each proof test and evaluate the results at regular intervals.
Using the SLC for existing processes
Given this brief introduction to the safety lifecycle, it may not appear useful for an existing process. One can certainly see that the IEC standards are written as if a new system is being designed and implemented. However, the principles and framework are very suitable for existing system evaluation.
The existing system evaluation begins with phase one: the analysis. Most existing plants already have a completed HAZOP, as it is required by regulation in many countries. If one does not exist or cannot be found, then the first step is to perform a process hazard analysis. If an existing HAZOP is found, the first step is to review it thoroughly to make certain it is up to date. The tolerable risk criteria must then be established if they do not already exist. Most companies have such criteria, perhaps embedded in a SIL selection procedure.
A SIL selection workshop can be held where a team from the plant will identify any necessary SIFs along with the risk reduction needed. This can be documented in a safety requirements specification. Then a comparison can be made of the SIFs needed per the latest risk criteria versus the actual SIFs installed. In some cases it has been discovered that unnecessary equipment is in place. That equipment could be left in place if the maintenance expense is small, or it could be removed, especially if it has caused false trips. It may be recognized that some new SIFs or upgraded equipment need to be installed as well.
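The comparison of needed versus installed SIFs is essentially a set comparison. A minimal sketch, with hypothetical SIF tags and target SILs:

```python
# Hypothetical output of the SIL selection workshop: SIF tag -> target SIL.
required = {"SIF-101": 2, "SIF-102": 1, "SIF-105": 1}

# Hypothetical list of SIFs actually installed in the plant.
installed = {"SIF-101", "SIF-102", "SIF-103"}

needed = set(required)

missing = sorted(needed - installed)      # new SIFs to design and install
unnecessary = sorted(installed - needed)  # candidates to keep or remove
to_evaluate = sorted(needed & installed)  # carried forward to phase two

print("Missing:", missing)
print("Not required by current criteria:", unnecessary)
print("Evaluate against target SIL:", to_evaluate)
```

Only the SIFs in the last group go on to the performance evaluation of the existing equipment.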
In phase two of the safety lifecycle for existing systems, the installed SIF equipment that is needed per the latest risk criteria is evaluated. The same performance evaluation done for new designs can be done on the existing equipment. There are three possible outcomes for each existing SIF:
1. The SIF does not meet safety criteria
2. The SIF meets safety criteria, or
3. The SIF exceeds safety criteria.
Outcome 1: SIFs that do not meet the requirements need to be improved. This can be as simple as more frequent proof testing. The evaluation can be done for different proof test intervals and different proof test techniques to find the optimal set of tests. If more frequent proof testing is not possible or practical, older instrumentation and control equipment can be replaced with up-to-date safety-certified equipment. All equipment should be justified by either documenting a prior use analysis or using safety-certified equipment.
Outcome 2: Existing SIFs that meet the requirements only need equipment justification. For all equipment that has been safety certified, make sure the certificates are properly documented. For all equipment not safety certified, document a prior use analysis.
Outcome 3: SIFs that exceed requirements may allow consideration of a reduction in the proof test frequency. Changing to longer proof test intervals normally reduces both testing and maintenance costs. In studies conducted some years ago, when the new framework was still in draft, a significant percentage of SIFs were found to be "over-designed." Of course, this depends on the design methods being used when the design was first done. But these studies indicated good potential for actually reducing ongoing maintenance costs while meeting tolerable risk criteria.
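Evaluating a SIF at several proof test intervals, as described for Outcomes 1 and 3, can be sketched as a simple search for the longest interval that still delivers the required risk reduction. The example assumes a single-device SIF and the simplified equation PFDavg = failure rate x test interval / 2. The failure rate, candidate intervals, and targets are hypothetical.

```python
HOURS_PER_MONTH = 730  # 8760 hours per year / 12


def pfd_avg_1oo1(lambda_du, test_interval_h):
    """Simplified PFDavg for a single device (ignores diagnostics and repair)."""
    return lambda_du * test_interval_h / 2


def longest_acceptable_interval(lambda_du, required_rrf, candidates_months):
    """Longest candidate interval whose achieved RRF meets the requirement.

    Returns None when no candidate is good enough, meaning the equipment
    itself would have to be improved (Outcome 1).
    """
    acceptable = [
        months
        for months in candidates_months
        if 1 / pfd_avg_1oo1(lambda_du, months * HOURS_PER_MONTH) >= required_rrf
    ]
    return max(acceptable) if acceptable else None


candidates = [6, 12, 24, 36, 60]

# Hypothetical SIF: 5e-7 dangerous undetected failures per hour.
relaxed = longest_acceptable_interval(5e-7, required_rrf=100, candidates_months=candidates)
strict = longest_acceptable_interval(5e-7, required_rrf=1000, candidates_months=candidates)

print("RRF 100 target:", relaxed, "months")
print("RRF 1000 target:", strict)
```

If the SIF is currently tested more often than the result suggests, the interval could be lengthened; a result of None points to replacing or adding equipment instead.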
In phase three of the safety lifecycle, maintenance and repair of SIF equipment goes on much the same as before with the possible exception of better proof test procedures and perhaps different proof test intervals. Many companies that did not have a good record-keeping system for failure and proof test data implement one at this time. This is done to identify other opportunities for improvement. These data collection systems have certainly identified ways to improve operations, primarily in the reduction of false trips and improved maintenance procedures.
The framework described in functional safety standards, working with the safety lifecycle, makes an excellent blueprint for evaluating existing systems designed long before the new standards were written. However, there is work involved. Plant management may be expecting a quick review meeting for a day or so, and while that might be useful, there is no substitute for good engineering analysis: let the numbers answer the question. The effort can be significantly reduced by using automated engineering tools. Several vendors have phase one, phase two, and phase three safety lifecycle tools. These tools typically allow input during the HAZOP or importation of HAZOP results. From there, SIFs are automatically identified and evaluation calculations are quickly completed without the in-depth knowledge required to do these with a spreadsheet. A good tool-based approach following the framework of IEC 61511 is an excellent way to evaluate existing safety systems. The clean documentation provides not only a roadmap for future improvement but also solid, defensible evidence of due diligence if ever audited by a regulatory agency.
William Goble, PhD, CFSE, is principal engineer for exida. Leigh Weber, CISSP, is a senior security engineer for exida.
1. ANSI/ISA-84.01-1996 (approved February 15, 1996)—Applications of Safety Instrumented Systems for the Process Industries. Research Triangle Park: ISA, 1996.
2. ANSI/ISA-84.00.01-2004, Parts 1-3 (IEC 61511-1-3 Mod)—Functional Safety: Safety Instrumented Systems for the Process Industry Sector. Research Triangle Park: ISA, 2004.
3. Safety Lifecycle Poster, exida.com, Sellersville, PA, 2010.
4. Hartmann, H., Scharpf, E., Thomas, H., Practical SIL Target Selection—Risk Analysis per the IEC 61511 Safety Lifecycle, exida.com, Sellersville, PA, 2012.
5. Goble, W., SIS Equipment: Selection and Justification, White Paper, www.exida.com, Sellersville, PA, 2012.
6. Goble, W., Understanding safety standards ‘Proven in Prior Use’ requirements, Control Engineering Europe, June/July 2008.