Can your communication networks support an automation upgrade?
Before you deploy a new HMI, install new software to support mobile devices, or add a loop tuning utility—make sure your networks can handle the increased traffic without bogging down.
Companies considering process plant upgrades know they have to work within the physical constraints of existing equipment. Pipes, pumps, and valves have limits to the volume of fluids they can handle. Electrical switchgear and wiring have a maximum current-carrying capacity. If it’s necessary to move beyond those limits, the items that create the bottlenecks will have to be upgraded or replaced.
The same applies in automation systems. A PLC or other controller has a maximum number of I/O connections, which is easy to see, but other limits aren’t so obvious. A pressure drop in piping may signal excess volume in a process, but what are the signs of an overloaded control network? The early signs are more subtle.
When a new plant is designed, engineers typically build a level of excess capacity into all the systems to cover themselves and make sure the owner isn’t disappointed by some unanticipated choke point. This applies throughout a facility, typically including the digital networks that facilitate communication among field devices, controllers, HMIs, and other process-related computing systems.
Those networks, like the process piping and electrical wiring, were designed to handle anticipated volume plus some extra. The problem is that designers 20, 10, or even 5 years ago didn’t anticipate all the things users have added to their plant networks. Back in 2009, who would have thought they might be using an iPad to remotely access a process unit?
Little by little, that initial extra network capacity gets used up, although you might not realize it. If your network can handle the existing traffic, you may not know it’s close to its limits until one more new thing begins to bog the system down. It may be a major upgrade like a new HMI, but even small additions such as software utilities can make a difference. The problem is that most companies don’t know how close to the edge they are in terms of network capacity.
How plants get in trouble
While there are charts that tell you exactly how much liquid you can expect to put through a pipe or the ampacity of a given size of wire, network loading isn’t as clear-cut. Measuring the performance of your network isn’t easy, and there are few guidelines to suggest how much traffic is practical.
We often talk about automation system controllers being "overloaded," but this usually has more to do with moving data among controllers than with inadequate brute computing power. It is rare to outstrip a processor’s capability to make loop calculations, but common for controllers to slow down when they are asked to continuously communicate large amounts of data.
Consequently, some customers without existing network problems simply launch into an automation system upgrade assuming they won’t have problems. More sensible individuals may realize there needs to be some sort of network traffic analysis prior to the project, but this is often something of an afterthought.
As an automation solutions provider, we often see customers discover that there are few tools available to measure existing and projected network traffic. Control system vendors include diagnostic software for existing systems, but that software cannot predict what network loading will look like after modifications are carried out. It seems inordinately difficult to get a simple answer to the question, "Can our networks handle this new load?"
As legacy systems reach their end of life, it often becomes a viable alternative to replace the HMI and front-end components, but leave the controllers in place. As vendors jockey for additional installed base, Vendor A convinces the user that its HMI can work seamlessly on an automation system from Vendor B. Maybe it can, or at least it did on Vendor A’s test bench. There may even be a number of successful projects where that combination has been used, and the salesperson will cite these as reasons why there is no need to worry.
The salesperson may be absolutely correct as some cross-platform combinations work very well, but the most meaningful question is whether the combination will work in this specific situation, and there are lots of possible variables in play that are site-specific.
If the salesperson responsible for the project truly understands what is involved and is giving the customer proper guidance, there is a strong possibility for success. However, as independent automation solution providers, we have seen more situations where the choice is made and approved without adequate research. When the purchase orders are issued and problems begin to emerge, our services are called upon to solve the compatibility issues.
Fortunately, there are no infamous case histories where such a cross-platform DCS upgrade caused an entire system to crash when it was first turned on, but there are many where a project in its late stage suddenly had to be delayed for a few weeks to solve some network loading issue. As with any project problem, addressing these types of issues early is much less expensive and time-consuming than late fixes.
Tools and guidelines
Many tools exist for evaluating network health in office and commercial IT systems. Far fewer are aimed at process automation system networks, though there are some choices. In some cases, normal IT tools can be used, but judgment is required to interpret the evaluation results.
A study often shows the number of packets moving around and how easily they move, with a specific examination as to the number of collisions. Most automation systems have some sort of internal diagnostic utility that monitors how quickly inquiries get answered, which is related to network loading. When the HMI needs information to refresh the operator screen, does it come through on the first try, or does it get delayed? Does the HMI have to retry often?
Most systems can count situations where information is held up. This is good to know, but it might not help much when trying to figure out how much more bandwidth is available. Most systems can also give you an idea of controller loading, and as mentioned earlier this typically relates more to movement of information than numbers of calculations. This is useful when you are trying to determine how tasks are spread out around your system. You may find certain controllers carrying a disproportionate portion of the load, and communication can be improved by reassigning some tasks to less loaded parts of the automation system.
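As a rough illustration of the load-balancing check described above, the following Python sketch flags controllers carrying a disproportionate share of the load. The controller names, the loading percentages, and the 1.5x threshold are all hypothetical; real figures would come from whatever diagnostic utility your automation system provides, and the right threshold is a judgment call.

```python
from statistics import mean


def flag_heavy_controllers(load_pct, ratio=1.5):
    """Return the names of controllers whose load exceeds `ratio` times the average.

    load_pct maps a controller name to its reported loading in percent.
    The default ratio of 1.5 is an arbitrary illustration, not a guideline.
    """
    average = mean(load_pct.values())
    return sorted(name for name, load in load_pct.items() if load > ratio * average)


# Hypothetical loading figures (percent) read from a diagnostic utility.
loads = {"C1": 35.0, "C2": 40.0, "C3": 85.0, "C4": 30.0, "C5": 38.0}

heavy = flag_heavy_controllers(loads)
print(heavy)  # ['C3']
```

Controllers that show up in a list like this are candidates for having some tasks reassigned to less loaded parts of the system.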
These tools are certainly useful as far as they go, but they aren’t very predictive of success as you consider adding new things to your networks. Of course, if you find that your networks have bandwidth to burn, you don’t need to worry, but this isn’t often the case.
Bandwidth costs money, and it probably cost more back when most plants were built. Few technology developments over the years have caused bandwidth requirements to go down, and many have had the opposite effect, so more companies are closer to the edge than they realize.
Will this upgrade work?
Let’s go back to the hypothetical cross-platform migration project discussed earlier. For the sake of argument, the account manager for Vendor A is knowledgeable and he or she has worked on similar projects in applications similar to yours. You have every reason to believe that this project should be successful, but there is still a nagging question: "Will this work in our situation?" This is a valid question, and you should indeed ask it and determine the answer before the project starts. Never assume that everything will come out well because there are many variables in play, probably more than you realize.
The high number of variables tends to keep vendors from guaranteeing performance on your automation system. Each company and situation is different, so make sure you understand what you’re buying. The most useful information you can obtain is often gained by talking to someone at a similar company that has undertaken a comparable project, but it can be difficult to compare situations. Even if your plants are mirror images of each other, you might decide to use more elaborate HMI graphics that require more information, or you might make the refresh rate higher. Both of these increase your network demands and make plants difficult to compare directly.
Ultimately, the decision is about the level of risk you’re comfortable with. If you determine that your networks are running at 80% capacity, is it safe to add a new HMI that will increase traffic by 17%? How accurate are your numbers, and how close to the limit are you willing to go? Because there are no hard and fast rules, this becomes a risk assessment exercise that needs to be completed.
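The arithmetic behind that question can be sketched in a few lines of Python. The 80% and 17% figures are the ones used above; the function names and the two readings of "increase traffic by 17%" (relative to today's traffic, or as percentage points of total capacity) are assumptions made for illustration.

```python
def projected_utilization(current_pct, increase_pct, relative=True):
    """Estimate network utilization (percent of capacity) after an upgrade.

    relative=True  : the increase is a percentage of today's traffic.
    relative=False : the increase is in percentage points of total capacity.
    """
    if relative:
        projected = current_pct * (1 + increase_pct / 100)
    else:
        projected = current_pct + increase_pct
    return round(projected, 1)


def headroom(current_pct, increase_pct, relative=True):
    """Capacity left over after the upgrade, in percentage points."""
    return round(100 - projected_utilization(current_pct, increase_pct, relative), 1)


# The same "17%" leaves very different margins depending on how it is read.
print(projected_utilization(80, 17, relative=True), headroom(80, 17, relative=True))
print(projected_utilization(80, 17, relative=False), headroom(80, 17, relative=False))
```

Either reading leaves only a few points of margin, which is why the accuracy of the underlying measurements matters as much as the calculation itself.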
Solving network problems
A while back, our company was involved in a project with a customer trying to answer the network loading questions we’ve been discussing. Our customer commissioned the automation system vendor to carry out a major network study in anticipation of a migration project. This is a pretty rare occurrence, so we were interested to see the results.
The study made a series of recommendations, including an observation that the alarm management platform was consuming 20% of overall network bandwidth. It also found specific HMI graphic functions creating an inordinate amount of traffic by polling a very large number of points with unnecessarily high frequency.
Such studies are not cheap and consequently aren’t done all that often, but in this particular case the study easily paid for itself because it identified problems that could be quickly addressed before escalating to more expensive solutions. Of course, not every situation offers those kinds of opportunities, and vendors are sometimes reluctant to undertake such network studies because the results may not be as conclusive as customers expect, and there may not be clear and easy solutions.
No plant has every device configured to deliver only the performance actually needed while consuming the fewest possible network resources. If everything works, there’s no incentive to spend resources digging around in all the device settings.
Some tweaks are always possible, but you might not be able to find enough of them to have a large enough effect to solve an emerging problem. Tweaks might help you if you’re on the borderline, but if controllers or HMIs might become seriously overloaded due to an upgrade, major steps to improve network usage may be unavoidable.
Moving demand and adding capacity
Solving network problems by adding capacity to handle new demand generally involves redirecting traffic, increasing the number of controllers, and possibly beefing up the network by adding switches, routers, and/or cabling. Depending on the age of the existing automation system, adding controllers may be a fairly easy step, with modern systems more amenable to these types of additions.
Projects change direction during the actual implementation, so it is important to retain some flexibility. A common scenario is, "The original plan was to buy three new controllers and keep our existing 10 legacy controllers, but now we realize that we’re going to have to buy five new controllers and move some of what was on those legacy controllers to the new ones. So in addition to adding the new equipment with its new I/O, we need to spread out some of what was already there." This type of situation occurs frequently with upgrade projects, and can be the cause of many delays if there’s poor planning prior to implementation.
Some equipment or software consumes lots of bandwidth. In some cases it is justified, and in others it may not be. One common characteristic of high bandwidth consumers is the need to pull lots of information from lots of sources and do it frequently.
The customer study cited earlier found an alarm management program was causing excessive network loading. Another frequent culprit is loop diagnostic utility programs, which need enormous amounts of data to work properly and can tie up network resources. In many cases, though, these diagnostic programs don’t need to run continuously, so their use can be limited.
If you are trying to identify where you might have some bandwidth hogs lurking in your systems, here are some places to start:
Controllers are too heavily loaded—This is a common issue and requires a plant to offload the controllers as a part of the upgrade. This often means adding additional controllers to alleviate the problem, which can be a significant task on top of a normal migration.
Cross-controller or cross-network communication—If there’s a lot of cross-controller communication bogging down the network, it might be feasible to shift control strategies among controllers to alleviate the problem. A good bit of engineering is required to assess how control strategies are adding load to the control network, and how they can be relocated to improve the situation.
HMIs—How often do your operator screens need to be updated? The nature of your processes and how quickly something can change will tell you that, but most companies have their refresh rate set too high. If the HMI was initially configured with a very fast refresh rate, then it might be possible to slow it down, especially for static information like descriptors and units.
Mobile devices—Management likes to remotely access plant performance data, but does it have to be updated every 10 seconds? Wouldn’t 1- or 2-minute updates be enough? Reducing mobile device update rates is a simple fix which can substantially reduce network traffic.
Third-party programs—Asset management, loop tuning, historian, alarm management, APC, and other software programs often need large amounts of automation-system data to perform their functions. Investigate each one and consider the possibility of changing configuration settings to reduce communication requirements. For example, the APC program may work just as well by checking operating parameters each minute, instead of every second.
Hardware/software issues—Incorrect firmware, software revisions, or other configuration issues can cause communication errors and greatly degrade network performance. These types of issues can usually be easily identified and fixed.
Fast loops—Moving control functions closer to the actual equipment and away from central controllers reduces traffic overall, which is particularly important for loops requiring fast control. Distributed control can make responses faster for flow and pressure loops that need it, maybe by installing smart instrumentation and valves capable of running a loop locally.
Table: Automation system network bandwidth hogs
- HMIs with complex graphics
- HMIs with excessive update rates
- Third-party programs continuously requesting large amounts of data
- Hardware and software bugs and issues
- Loops requiring fast response and control
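To get a feel for how much the update-rate changes suggested above can matter, here is a minimal Python sketch of polled traffic as a function of point count and update interval. The point count and bytes per update are hypothetical, and the model assumes traffic scales linearly with polling rate, which ignores report-by-exception schemes and any fixed protocol overhead.

```python
def polling_load_bps(points, bytes_per_point, interval_s):
    """Approximate polled traffic in bits per second.

    Assumes every point is transferred once per interval; both the point
    count and the bytes per point are illustrative, not measured values.
    """
    return points * bytes_per_point * 8 / interval_s


def reduction_factor(old_interval_s, new_interval_s):
    """How many times smaller the polled traffic becomes at the slower rate."""
    return new_interval_s / old_interval_s


# Hypothetical APC application reading 2,000 points at 64 bytes per update.
apc_every_second = polling_load_bps(2000, 64, 1)
apc_every_minute = polling_load_bps(2000, 64, 60)
print(apc_every_second, apc_every_minute)

# Mobile dashboard moved from 10-second updates to 1- or 2-minute updates.
print(reduction_factor(10, 60), reduction_factor(10, 120))
```

Under these assumptions, moving the APC reads from every second to every minute cuts that application's polled traffic by a factor of 60, and the mobile change cuts its traffic by a factor of 6 to 12.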
Most long-term solutions involve a mix of reducing network traffic by addressing bandwidth hog issues, directing traffic more efficiently, and adding capacity. Building and maintaining problem-free automation system networks begins with understanding which items need to talk with each other, how much data needs to move among them, and how frequently data has to be refreshed. When that information is well in hand, it will be much easier to determine how to proceed when it’s time to upgrade an automation system.
Chad Harper, CAP, PMP, is senior director of technology for MAVERICK Technologies.
- Network bandwidth, like any plant asset, has capacity limitations.
- Understanding your current network usage will help determine if new capabilities can be added.
- Devices can be reconfigured to help free bandwidth when network traffic begins to slow down.
Read the Real World Engineering blog at www.controleng.com/blogs