Building a maintenance management program for valves
Diagnostics from smart valve actuators using HART communication can save maintenance costs and improve plant reliability when used in a comprehensive maintenance management program. Valves are often where such a program scores its first big victory.
While most smart devices are field instruments, smart valve actuators have the same capabilities, and often users launch a maintenance program for valves ahead of instruments. HART Plant of the Year recipients bear this out: Dow, Monsanto, and MOL all used HART diagnostics to improve plant reliability and save money on valve maintenance. Valves are far more maintenance intensive than instrumentation and have more wear-prone moving parts than a typical flowmeter or pressure sensor.
The ability of a given valve to function can have a huge impact on a process unit’s operation, depending on where it’s located. Its ability to move as expected and control as needed makes all the difference, so diagnostic information is a huge advantage for reliability. HART Communication can also supply real-time valve position feedback to operators, confirming that a valve is in the position that the control system says it is.
When a valve maintenance program in a large facility is thought out well, implemented with care, and led by the right individuals, a full-time valve engineer can easily be worth $1 million or more when the savings in repair and replacement costs are combined with improved operations and plant availability.
Unless your plant has an unlimited budget, you will likely find yourself constrained as to how many valves can have smart actuators or positioners installed. So which valves are the most important? Criticality has two main elements: the importance that a given valve has in a unit’s operation, and the likelihood that it will malfunction.
If you look at the P&ID for a process unit, you can identify certain valves that have to operate 24/7 for the unit to produce, and it may be a large number. If one of those valves fails, the unit goes down, or if it doesn’t perform correctly, product goes out of spec. Plant designers often install backup units in parallel on equipment like pumps so either can do the job, but this is not a common practice with valves. Those critical valves working without backup deserve special attention because you can’t reroute the flow and you can’t do without them.
The same thought applies when you have to perform valve maintenance. Can that valve be repaired while the unit is still running? If everything has to be shut down to take it out or fix it in place, that valve is also critical.
Probability of failure
There are valves that can be installed in a process unit and operate for years with no trouble. Where pressure drop isn’t excessive and the product is benign, there is little wear and tear in daily operation. But this isn’t always the case. Products can be corrosive, erosive, or otherwise aggressive, and there is nothing you can do beyond finding the best possible valve design and most durable materials.
Difficult service valves are also prime candidates to be retrofitted to become smart because they are subject to the kind of abuse that shortens lives. A worn stem or one where sticky product has accumulated can be difficult to move, and this is a simple thing for the smart actuator to measure and monitor over time. The system can warn operators of changes while there is still time to do something about it.
If your plant is one that runs continuously and can do so for years at a time, sooner or later it will have to be shut down for some period of time to fix accumulating problems. Generally this turnaround period has to be as short as possible, so there is much planning in the months and even years leading up to it. Everything has to be staged and ready to follow the repair and refurbishment schedule. That list will include valves since many will have to be repaired or replaced during the outage.
Turnaround planners must know exactly which valves need attention, and what kind of attention, so appropriate units and parts can be ordered in advance. Valve diagnostics can help identify those valves and determine what needs to be done. When the turnaround clock is ticking, that is not the time to discover that some valves need more attention than realized, or that some scheduled for replacement are actually performing just fine.
In the repair shop
Valves that have been repaired, either in-place or in the valve shop, should be tested before returning to service. A smart actuator / positioner can measure and record the valve movement and make sure it is operating within acceptable parameters. This requires an appropriate testing procedure for each type of valve with a database where historical information can be stored.
That historical information is what helps identify developing problems. When the amount of torque required to turn a stem increases slowly over time or takes a sudden jump, trouble is not far away. Keeping that information where it can be accessed easily is a critical element of a larger maintenance program.
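As a rough illustration of how stored history can expose both patterns, the sketch below checks a series of stem-torque readings for a slow drift away from an early-life baseline and for a sudden jump between consecutive readings. The function name, the thresholds, and the sample values are all assumptions made for the example; real limits would come from your own testing procedure for each type of valve.

```python
from statistics import mean


def check_torque_history(readings, baseline_count=3, drift_limit=0.20, jump_limit=0.15):
    """Flag slow drift and sudden jumps in a series of stem-torque readings.

    readings: torque values in chronological order (any consistent unit).
    The limits are illustrative fractions, not vendor recommendations.
    """
    if len(readings) <= baseline_count:
        return []  # not enough history to compare against a baseline
    baseline = mean(readings[:baseline_count])
    findings = []
    # Sudden jump: one reading much higher than the reading before it.
    for prev, curr in zip(readings, readings[1:]):
        if prev > 0 and (curr - prev) / prev > jump_limit:
            findings.append("jump")
            break
    # Slow drift: latest reading well above the early-life baseline.
    if baseline > 0 and (readings[-1] - baseline) / baseline > drift_limit:
        findings.append("drift")
    return findings


# Hypothetical histories for three valves.
steady = [10.0, 10.1, 9.9, 10.0, 10.2]
creeping = [10.0, 10.1, 9.9, 10.8, 11.6, 12.5]
stepped = [10.0, 10.1, 9.9, 10.0, 11.8]

for name, history in [("steady", steady), ("creeping", creeping), ("stepped", stepped)]:
    print(name, check_torque_history(history))
```

A check this simple only works if the readings are taken the same way each time, which is one more reason to pair the database with a consistent testing procedure.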
Establishing a program
Automated diagnostic capabilities for valves and other field devices available through HART communication can provide huge benefits at many levels. Effective diagnostic tools can make valve testing and monitoring quick and easy to perform from a central control facility or even a remote location. Taking the data to an expert instead of taking the expert to the problem is a powerful and underutilized concept, but it depends on proper implementation of a good set of diagnostic tools and effective work processes.
Overview of barriers
Given the large incentives and low capital cost for asset management, it would seem that this technology would be widely used. The reality is far different. Let’s take a look at those barriers and suggest some possible resolutions.
The first impediment has been the technology barrier. There is no question that device diagnostic technologies like HART work, but the tools available to put that information to work for asset management have traditionally been incomplete, hard to use, and poorly integrated. The good news is that the tools are improving and many users have demonstrated effective ways to use them. The bad news is that effective use of the tools and the engineering skills necessary for effective deployment are rare.
The second problem for asset management is management. Technical problems have solutions given management support, but managers have to be rewarded for providing it. That kind of program requires good metrics, good reporting, and effective audits to produce a scorecard. In the absence of a good management system, poor performance is often treated as a technology problem, or just the normal state of things.
Who’s supposed to be acting on the diagnostic information? Maintenance? Can you tell a maintenance technician to look at an incomprehensible mass of unprioritized diagnostic data and decide what to do based on that? Properly implemented systems can provide order and priority to make the data useful to engineering and maintenance organizations.
Implementing an asset management program is a project in itself. It can be a stand-alone project for an existing facility, or an add-on to new construction or control system modernization. For an existing facility, a lot of stranded (unusable) diagnostics often exist in HART-enabled smart field devices that are not connected to a smart system. The cost of system tool implementation in an existing facility can be an issue, but it is manageable. For new construction, the added cost of asset management tools is almost negligible. But this is just the hardware side. Either way, the work processes, training, metrics, and management processes have to be implemented just the same.
One of the first activities in an asset management program is creating a criticality ranking for each piece of major equipment and each device. This is often a painful process, because people want to rank everything as critical for fear that otherwise it won’t get fixed. You need to know the impact severity and the likelihood of a problem to do a good criticality ranking.
You will use this criticality information throughout the design process, while implementing other maintenance activities, and during planning. But many people wait until the system is built and operating, when the ranking is much more difficult to do. It has to be done during the design phase if you want an effective project.
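One simple way to turn those two inputs into a ranking is to score each on a small scale and multiply them, as in a basic risk matrix. The sketch below assumes 1-to-5 scales and made-up valve tags; neither the scales nor the multiplication rule comes from any standard cited here, so treat it as a starting point for discussion rather than a method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Valve:
    tag: str         # hypothetical tag name
    severity: int    # assumed scale: 1 (minor) to 5 (unit goes down)
    likelihood: int  # assumed scale: 1 (benign service) to 5 (severe service)


def criticality(valve: Valve) -> int:
    """Score a valve as severity times likelihood (assumed convention)."""
    return valve.severity * valve.likelihood


def rank(valves: list[Valve]) -> list[Valve]:
    """Sort from most to least critical; ties break on severity, then tag."""
    return sorted(valves, key=lambda v: (-criticality(v), -v.severity, v.tag))


valves = [
    Valve("FV-101", severity=5, likelihood=2),
    Valve("LV-205", severity=2, likelihood=2),
    Valve("PV-310", severity=4, likelihood=4),
]
ranked = rank(valves)
for v in ranked:
    print(v.tag, criticality(v))
```

Forcing every valve onto the same two scales makes it harder to rank everything as critical, because each score has to be defended against its neighbors.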
If you begin early enough and do the work systematically, the plant construction and start-up process will go faster and much smoother. During the design and factory acceptance testing, you need to do the building, create your tools, and train your people. You want to use all those diagnostics through those phases, during installation, commissioning, and loop checking. Typically, the system will pay for itself right there. You’ve covered the investment by the time you get the plant started up. History says that in most plants where we’ve done that, we get the system de-bugged, start up the plant, back-check and verify everything, correct all the mistakes, and then we turn the asset management system off and never look at it again. When that happens, the facility owner misses out on the big payback during plant operation.
Traditional maintenance technology for managing maintenance priority is a reliability matrix. We’ve done the risk assessment, and we’ve determined that this device is medium-to-low priority. We look at the list of what we have to do, and how many critical things are on the plate today, and all of the low priority stuff gets deferred, sometimes forever. It may not even get addressed during turnarounds because of budget. So the low priority stuff accumulates failures. That’s fine for a while but enough low priority failures can cause a larger-scale system failure, because the operators can’t tell what’s going on. There aren’t enough measurements. There aren’t enough controls. You can’t run the plant. System failures have greater impact than low priority device failures, but treating devices individually can lead to system effects that are not modeled or managed by the simple decision matrix.
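One way to keep deferred items from disappearing is to count them per unit instead of judging each device alone. The sketch below assumes a simple list of open faults and an arbitrary limit of three per unit; the record layout, the tags, and the limit are hypothetical and would need to reflect how many measurements and controls a given unit can afford to lose.

```python
from collections import Counter


def units_at_risk(open_faults, limit=3):
    """Return units whose count of open low-priority faults has reached `limit`.

    open_faults: iterable of (unit, device_tag, priority) tuples (assumed layout).
    limit: assumed threshold at which deferred faults warrant a system-level review.
    """
    counts = Counter(unit for unit, _tag, priority in open_faults if priority == "low")
    return sorted(unit for unit, n in counts.items() if n >= limit)


# Hypothetical backlog of open faults.
open_faults = [
    ("Unit 1", "PT-101", "low"),
    ("Unit 1", "TT-102", "low"),
    ("Unit 1", "FT-103", "low"),
    ("Unit 1", "FV-104", "high"),
    ("Unit 2", "PT-201", "low"),
]
print(units_at_risk(open_faults))
```

A count like this does not replace the reliability matrix. It adds a second view that treats the unit, not the device, as the thing at risk.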
Often after a major failure or an operational disaster, an investigation discovers that there were many signs of the growing problem, but nobody was able to see them or correctly interpret what they were seeing. Field device diagnostics were trying to warn of a growing problem, but nobody was able to connect the dots. Often it’s an accumulation of small things adding up until they reach a critical mass. An accumulation of small (low priority) problems is common today among operating companies. You’ll see this accumulation of problems if you read analysis reports after a catastrophe. After a while, enough little things line up in series and become a big thing. If you line up all the holes in a Swiss cheese, there’s a hole all the way through it. It’s another management failure.
Getting data to the right place
Once an appropriate data collection system is in place and you are working on setting up your work processes, you need to determine where the data stream goes. Should it be maintenance or operators? This isn’t a difficult process if you follow some simple principles: Send alerts to operators as well as maintenance if immediate operator action is required. The alert philosophy for operators is that they deal with individual events as they come up. You want a limited number of alerts that the operator can take some unique action on in real time.
On the maintenance side, you don’t want to deal with individual events. You want to log every little thing that happens, and then you want to use reports to sort through all of those logged events and make some sense out of them. You’re looking at history and analyzing what’s happened once, what’s happened a bunch of times, how high the priority was, and whether it happened to multiple devices. You can see trends from reports that get lost if you’re looking at individual events. A single problem with your air system can cause hundreds of events per day. You need a reporting format that can bring all that together and identify a common source. Clearly, an effective asset management program using diagnostics from smart devices can pay major dividends.
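A report of that kind can start out very simply. The sketch below groups logged events by alert type and counts both the events and the distinct devices behind them, so an alert that shows up on many devices at once rises to the top. The log layout, tags, and alert names are invented for the example and are not taken from any particular asset management package.

```python
from collections import defaultdict


def summarize(events):
    """Group logged events by alert type; report event count and distinct devices.

    events: iterable of (device_tag, alert) tuples (assumed log layout).
    Returns rows of (alert, event_count, device_count), most widespread first.
    """
    devices = defaultdict(set)
    counts = defaultdict(int)
    for tag, alert in events:
        devices[alert].add(tag)
        counts[alert] += 1
    rows = [(alert, counts[alert], len(devices[alert])) for alert in counts]
    # Many devices sharing one alert hints at a shared cause, such as the air system.
    return sorted(rows, key=lambda r: (-r[2], -r[1], r[0]))


# Hypothetical event log.
events = [
    ("FV-101", "low supply pressure"),
    ("FV-102", "low supply pressure"),
    ("PV-310", "low supply pressure"),
    ("FV-101", "low supply pressure"),
    ("LV-205", "travel deviation"),
]
for row in summarize(events):
    print(row)
```

Sorting by the number of affected devices rather than the raw event count keeps one chattering device from hiding a problem that is spread across many.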
When is it baked?
It is really quite easy to tell. If you routinely use diagnostics to find out how something failed after the plant has had an unexpected shutdown, you haven’t finished the job. If you have a large database of saves where diagnostics were used to prevent an unplanned plant shutdown, you are on the right track.
ISA108: Intelligent device management - getting the most from HART-enabled and other smart field devices
Reports from companies that have created effective asset management programs suggest that changing individuals’ thinking is more difficult than the technology of collecting diagnostic data. There is no question that HART works as advertised, but users find it difficult to bridge the gap between diagnostic information and effective asset management. Putting intelligent devices to work effectively is what ISA108 is about. As the organization characterizes it, "The purpose of ISA108 is to define standard templates of best practices and work processes for implementation and use of diagnostic and other information provided by intelligent field devices in the process industries."
ISA108 is not a technical standard in that it does not discuss how the diagnostic information is transmitted. It applies to HART as well as Foundation fieldbus, Profibus PA, and other protocols. The best practices are being created now, and there are opportunities for you to participate. Herman Storey is a co-chair, and he welcomes involvement from end-user companies. The ISA website explains how you can join the effort.
Using HART in three time domains:
Periodic testing: HART helps automate and document scheduled tests and calibrations, particularly collection and analysis of valve signature data, and PSTs (partial-stroke tests) for safety-related valves.
Incipient failures: HART diagnostics can alert users at any time to valve problems that are developing but have not yet become failures. This allows maintenance to react to the problem before it causes an outage. This capability depends on having a process for monitoring diagnostic information, detecting changes, and reporting those to the right people.
Real-time performance: Operators place a high value on real-time valve position feedback from positioners via HART. Getting real-time smart device information quickly enough to be useful is the goal and requires the system to have good integration and access to intelligent device information.
Herman Storey is chief technology officer of Herman Storey Consulting, and a frequent presenter on asset management programs and other topics. Reach him at email@example.com.