Assisted living for your aging DCS
A process control system is supposed to help a plant or process unit operate with a minimum of human intervention. After all, that’s the whole idea behind automation. The problem these days is there are many old systems still running after 20 or even 30+ years, and sooner or later these can become troublesome to the point that the level of attention they demand is disproportionate to the service they perform. Let’s compare it to owning a car. Maybe you really like your 1992 Chevrolet Impala, but after 22 years and 250,000 miles, it requires a lot of attention. Brakes, front end, shocks, transmission, and maybe even a whole new engine. At some point it’s time to trade it in.
That’s fine if all you’re talking about is a standard passenger car. But let’s say your vehicle is a van that has been outfitted to carry lots of specialized equipment for your business. Buying a new van means transferring all that shelving and racks, and making many other modifications. In that case, the decision to trade in isn’t so easy because of other costs involved. It’s a different situation with different economic considerations.
In either scenario, the last thing you want is the vehicle to break down 400 miles from your home. Given your vehicle’s age and performance history, you know that anytime you go on the road there could be a failure, so you calculate the risk before you drive it on the trip. If it does break down, you could be stuck having to make decisions under pressure and may have to consider a solution that is not optimal simply for the sake of expedience.
The risks of keeping an old system
Every day that an obsolete DCS platform controls a plant, the risk increases that it will suffer a failure and interrupt production. Of course that could also happen with a brand-new platform, but risk grows with age.
For purposes of this discussion, let’s restrict the scope to digital control systems. No panel boards, pneumatics, electronic analog, or other subsets of those. Most of what we’re talking about would be considered second-generation DCS platforms installed from the late 1970s through the advent of MS Windows-based systems and their broad deployment in the late 1990s. Few if any of these systems are truly "original" since hardware fails in a variety of ways depending on what it is. The devices with the shortest lifespan are generally in the operator interface. No control system that has been used every day for 20+ years still has all its original keyboards, displays, and hard-disk drives.
The parts that seem to go on forever are basic controllers, I/O, and field wiring, but even these reach an end. Operators may see an individual field device go dark because an I/O card has lost a channel, or they may lose every device connected through a specific card. Hopefully, there is a replacement, but one of these days all the spare-part inventory will be gone, and the boards sent in for repair will be returned as they were with the regrets of the OEM. Maintenance then will be forced to work with operators to perform system triage to make sure the most critical elements are still functioning while others fall into disrepair. Eventually, users have to start combing the Internet for hoarders and parts recyclers, and maybe even try eBay. (A recent eBay search on "Bailey Infi90" yielded 495 items. A search on "Moore APACS" yielded 186 items. Both turned up various boards, modules, and accessories.)
Is system age a relevant issue?
The age of a control system is not necessarily a problem in itself. An old DCS that is matched well with the process and has been carefully maintained over its life can do an excellent job. As with the specialized van, the user may have spent a great deal of time and effort getting it configured just right and optimizing the control methodology with specialized programming. Old platforms are capable of very sophisticated functionality, although they offer fewer programming tools than more modern versions. Operator interfaces can be updated, at least to some extent, so new capabilities can be added.
Intellectual property tied to the process and developed over many years may be difficult to move to a new platform, so a company may decide to keep the old system and maintain it scrupulously. It will stockpile critical parts to keep it operational, but it will also have an exit strategy for when the time comes that a change is necessary. This sense of intentionality is what differentiates it from a company that simply gives in to inertia and doesn’t want to make an effort to upgrade.
Other users make improvements incrementally to reduce their risk profile and minimize a technology gap with more current systems. "A significant number of our customers are making continuous investments in their DCS architectures," says Sean Sims, vice president of lifecycle services for process systems and solutions at Emerson Process Management. "They see, and have realized, the operational value and overall lifecycle cost benefits of maintaining current levels of technology within their facilities. Some choose to invest in small, incremental steps whilst the facility is fully operational, whilst some choose to align their investment timing with major maintenance events such as a plant turnaround. Whatever the timing strategy, deliberate incremental investment results in reduced exposure to component and technology obsolescence."
Some older systems may have reached their capability limits. If a plant has reached the maximum number of I/O points a platform can handle, there aren’t many options for adding a new piece of equipment that brings another dozen or so field devices to the process.
The Windows invasion
Control system architecture took a much different direction about 15 years ago as control system vendors began to shift from proprietary hardware and software to MS Windows-based platforms using more off-the-shelf equipment. Some vendors considered that an invasion where they lost control of their own destiny and suddenly found themselves having to follow Microsoft. Over subsequent years, owners of these proprietary systems often found the easiest way to upgrade was to "bolt on" a new system that took advantage of Windows’ flexibility. Control systems that have been untouched in this way still exist, but there aren’t many left.
In more extreme situations where a company is trying to preserve a legacy system that has moved into the no-longer-supported category, users resort to layering what is, in effect, a third-party SCADA system on top of the DCS to support a user interface, extend the system’s capability to communicate with other equipment using Windows or OPC, and move data throughout the system. This approach requires much integration effort, but in a situation where a company insists on running a platform well beyond its useful life, it may be the only alternative.
Migration vs. emergency retrofit
Ultimately, any control system will have to be phased out. It will get to the point where failures overpower the ability to repair or replace critical parts. Migrations can be easy or challenging to deal with depending on the level of planning and preparation, but when systems are failing and much of the plant is being run manually, moving to a new platform is truly nightmarish.
As Rich Clark, principal applications consultant for Honeywell Process Solutions, points out, it is possible to reach a point where it’s effectively too late to migrate because the human resources needed to carry out that program are totally engaged simply keeping the plant running. His advice is to plan while you still have options. "There are periods where you look at your risk curve and your opportunity curve, and both are pretty flat. You have plenty of options at that point. But eventually you reach a tipping point, and it’s hard to know where that is precisely. That’s when your options become limited and you have to do things in a hurry. If you’re the person managing that system, you have to set some red lines and some yellow lines, and look at your system at least once a year and see where you are. If you don’t, you can pass the tipping point and suddenly you’re dealing with a cascade of events that forces you to do a migration project in a way that has a higher cost and greater risk. You’re trying to do it on a compressed time schedule using people that have other jobs, and that pressure creates more risk.
"In some refining and petrochemical environments, you may have an outage or an opportunity to migrate once every five years. So when you’re doing your planning, you have to ask yourself, is my system maintainable right now? You might say yes, but there is a non-zero chance that five years from now it might not be. So there is an opportunity to migrate two years from now or wait seven years. If you’re planning, you have to take those kinds of things into account. We try to help our customers understand their real risk tolerance and help them identify opportunities."
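Clark’s arithmetic on outage windows can be made concrete. The Python sketch below is a minimal illustration, not a planning tool. It assumes outages recur at a fixed interval, and the function names, parameters, and the estimate of how long the system stays maintainable are all hypothetical inputs that a plant would have to supply for itself.

```python
def migration_windows(first_window_years, interval_years, horizon_years):
    """List outage windows, in years from now, up to a planning horizon.

    Assumes (hypothetically) that outages recur at a fixed interval.
    """
    windows = []
    t = first_window_years
    while t <= horizon_years:
        windows.append(t)
        t += interval_years
    return windows


def last_safe_window(windows, maintainable_years):
    """Return the latest window that arrives before the system is expected
    to become unmaintainable, or None if no window arrives in time.

    maintainable_years is a plant-supplied estimate, not a known quantity.
    """
    safe = [w for w in windows if w < maintainable_years]
    return max(safe) if safe else None


# Clark's example: a window two years out, then one every five years.
windows = migration_windows(2, 5, 12)
print(windows)
print(last_safe_window(windows, maintainable_years=5))
```

If the system is judged maintainable for only about five more years, the window two years out is the last one that arrives in time. Waiting for the seven-year window means running past that point.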
In addition to human constraints, companies trying to make a migration into a crash program may find other unpleasant surprises:
- Documentation might be largely nonexistent or might not have been kept up to date.
- Critical information and programming may be on old storage media (e.g., floppy disks, tapes) that aren’t readable anymore.
- Key individuals who were involved in system design are long gone.
- Companies that allow a control system to continue to run past its end-of-life often let many parts of the plant deteriorate similarly, so launching one upgrade program may open a complex can of worms extending into other functional areas.
Other risk factors
Risks related to unexpected plant shutdowns or loss of operator control are obvious, but there are other risks connected to overextending control system life:
Vanishing control strategy—In some situations, it is difficult or even impossible to capture the intellectual property from an old control system. The control strategy will have to be reconstructed if it cannot be extracted from the old programs or documentation, which adds another layer of complexity to the migration process.
Losing young operators—When millennial workers have to deal with a control system that’s older than they are, there is little likelihood that they will want to stay around and spend their shifts nursing a geriatric platform. Hiring is becoming difficult enough without adding this handicap.
Leave yourself options
If a plant or production unit is slated to be shut down at some specific date, there is a reasonable argument for trying to keep an old system working. But when looking long term, there is a point at which some kind of change will have to be made. Migrations are major undertakings, and they take many months or even years if carried out thoughtfully with careful planning.
When approached properly, migrations can be nearly painless, but if left to the last possible moment and done under the gun, the experience can be terrible:
- The constant pressure to keep the plant producing makes the process far more complex.
- Human resources will be stretched to the breaking point.
- Intellectual property can be lost, forcing reconstruction efforts.
- Forced migrations are invariably more expensive and reduce the potential to exploit new functionalities of the new platform.
When the migration finally begins
Migration programs that happen at a reasonable pace with adequate planning allow the customer to determine what new functionalities can be used to the greatest advantage, and the best approach for training operators. On the other hand, forced migrations have little time for such analysis, and those customers push the new platform supplier to minimize changes that will require retraining anyone. The OEM or system integrator may have to go to great lengths, with corresponding costs, to make a current system look and work like an older one.
"During a migration process, some customers request a like-for-like upgrade of their operator interfaces to minimize operator re-training," Sims says. "As a generalization, we would suggest that this approach is not often necessary or advisable. The performance and efficiency advantages inherent in the new operating environment far outweigh the temporary effort required to retrain operators on new approaches, especially when one considers the cost reductions in, and performance advances of, low-fidelity operator training simulators over the last 10 years."
Clark says that this kind of request is less common than it used to be. "Fewer companies are asking for the new system to look like the old one," he notes. "Like-for-like is difficult to justify financially. You want to take advantage of the features that bring you financial returns."
It’s all about planning
Companies that are intentional about what they want to do and are willing to make appropriate investments along the way seldom find themselves having to make emergency upgrades. On the other hand, companies that allow automation systems to degrade through neglect probably see the same problems in all parts of a plant. If the DCS is not maintained, the rest of the equipment is probably also suffering. At some point, an operation will reach that tipping point where the costs and risks of failure outweigh the plant’s profitability.
"For customers who think they might be close to the tipping point, the decision to do nothing really doesn’t mean doing nothing," Clark adds. "You have to have a plan. What is necessary for the company at this time? If my system is getting older, do I need to be stockpiling parts? Can I contract some knowledge? What extraordinary measures might I have to take to maintain system robustness until I reach the point where I can take on a migration? Customers who do that kind of planning rarely, if ever, reach the tipping point with one of their control systems."
Watching the decline of a failing control system
Companies that insist on keeping a control system operating long past its obsolescence date typically go through five phases of deterioration with serious implications for the risk and cost profile. Sean Sims, vice president of lifecycle services for process systems and solutions at Emerson Process Management, explains the process this way:
Some of our customers choose a run-to-obsolescence strategy, which rapidly becomes more cost and risk prohibitive as the asset ages and the current technology gap widens. For those customers who choose to run past the official product obsolescence timeline, the following stages are typically experienced during this phase of the DCS asset:
1. Prior to the vendor declaring obsolescence of the platform or technology, the customer becomes aware of the pending lifecycle transition, but makes the intentional decision to extend the lifecycle of the asset through this lifecycle phase. The running cost/risk multiplier is still one.
2. The vendor declares a product obsolete, but provides a limited set of sustainability services to the market at an extended price point, and on a limited time frame. The running cost/risk multiplier now becomes two. Any system expansions in the plant to accommodate operational requirements are generally limited to installed spares already existing within the system at this point.
3. Once that OEM sustainability model expires, the customer is left to source hardware spares and technical support from within its own organization. This starts locally, and then globally if available, but the running cost/risk multiplier now advances to three. System expansions to support greater plant capacity are now generally no longer possible.
4. Once that internal supply is exhausted, customers begin to rely on purchasing spares from the open market or eBay, or refurbished parts from third-party suppliers who provide such services on a limited scope basis. The running cost/risk multiplier is now five. Technical support for the product from any source in the market becomes extremely limited as qualified people retire, and the price of upgrading the system begins to exceed by a considerable amount what an upgrade program would have cost if it had been executed before this stage in the lifecycle.
5. Once the quality of spare parts from the open market falls below 20% usability and the technical support for the product declines to critically low levels, the customer eventually experiences a catastrophic failure that cannot be rectified in the context of the existing architecture. Finally, upgrading the system in this environment of extreme time pressure due to consistent production impacts caused by the degraded system availability, reliability, and performance, is such that the running cost/risk multiplier reaches 10.
Our goal is to keep our customers from transitioning to the last two stages that can have real consequences to their bottom line. Our number-one message is risk mitigation during the migration, combined with future production improvement opportunities, and positioning the automation infrastructure for the next 20-year cycle.
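The five stages Sims describes can be summarized as a simple lookup. The Python sketch below is illustrative only. The multipliers are the ones quoted in the list above, the stage labels are paraphrased, and the function name and the `baseline_cost` figure are hypothetical.

```python
# Running cost/risk multipliers quoted for each stage of a
# run-to-obsolescence strategy. Stage labels are paraphrased;
# only the multipliers come from the list above.
STAGE_MULTIPLIERS = {
    1: ("Obsolescence pending, customer chooses to extend the asset", 1),
    2: ("Declared obsolete, limited vendor sustainability services", 2),
    3: ("Spares and support sourced inside the organization", 3),
    4: ("Spares bought on the open market or refurbished", 5),
    5: ("Catastrophic failure, upgrade under extreme time pressure", 10),
}


def relative_running_cost(stage, baseline_cost):
    """Scale a hypothetical baseline cost by the multiplier for a stage."""
    if stage not in STAGE_MULTIPLIERS:
        raise ValueError(f"stage must be 1-5, got {stage}")
    _, multiplier = STAGE_MULTIPLIERS[stage]
    return baseline_cost * multiplier


for stage, (label, multiplier) in STAGE_MULTIPLIERS.items():
    print(f"Stage {stage}: x{multiplier:<2} {label}")
```

The jump from three to five to 10 in the last stages is what makes the final two stages the ones to avoid.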
Peter Welander is a contributing content specialist for Control Engineering.
- As control system platforms age, the likelihood of breakdowns increases.
- Many plants are still operating with obsolete platforms, risking production interruptions.
- As problems compound, undertaking a migration program becomes more complex and expensive.