Reducing risk and responding to threats in industrial environments

Know the overall objectives of operational technology (OT) cybersecurity and how to ensure progress on risk reduction and threat response. Three steps for reducing risk and three elements of effective threat response are highlighted.

By John Livingston October 20, 2021
Image courtesy: Brett Sayles

Industrial cybersecurity leaders – including the C-suite, chief information security officers (CISOs), security teams and operational leaders – are increasingly realizing the potential financial, operational and safety impact of cyber events. To get their arms around securing this challenging part of their networks, many leaders have kicked off efforts by separating their information technology (IT) and operational technology (OT) networks, gaining visibility into the arcane world of OT assets, or feeding data from the OT networks into incident detection processes to identify potential threats.

Some are required to take specific security actions under regulatory frameworks. Activity is bustling. Meetings, planning, network architecture discussions, proofs of concept (POCs) and more are keeping teams incredibly busy as they also try to keep their plants operational in a world of declining resources and COVID-19 limitations.

Leaders must stop and ask fundamental questions: Are they making progress? If they aren’t a victim this week, month or year, are they successful? Are they wasting money or spending too little? They must start treating OT cybersecurity with the same set of objectives, metrics, targets and performance management that they apply to operating a plant, railroad or power grid.

Let us offer a point of view, and we welcome others to add to it or provide alternative perspectives. We believe there are two primary objectives of OT security that together deliver the ultimate objective of reducing potential impact to OT operations:

  1. Reduce risk
  2. Respond to threats

This may seem like restating the obvious. But this foundation begins to answer the questions coming from the top of the organization: How do people know if they are making progress or being successful? Are they actually improving their risk posture? Are they equipped to respond to a real threat or just to detect anomalous behavior?

Many industrial organizations want “visibility” or “detection” but aren’t clear on the ultimate objective or how to measure it. If there are a lot of detections, is that good…or bad? If there is visibility, has security actually increased? These two core objectives and the key components of each help determine the best path.

Three steps to reduce risk in industrial environments

1. Create a real-time view of the risk status of the OT environment. The first step to reducing risk is risk awareness. Most organizations start this journey with a vulnerability assessment of their OT environment, then estimate the likelihood and impact of each potential risk. This is a necessary but insufficient step. A one-time or infrequent assessment is outdated almost immediately and makes it very difficult to track progress in risk reduction over time. Success in risk reduction requires a constantly updated view of the risk.

2. Take remediating actions to reduce risk. Risk reduction requires executing specific actions against specific risks. If the assessment identifies risks from unpatched systems, insecure configurations, dormant or insecure accounts and users, poor access controls, etc., the next step must be to reduce those risks. Actionability requires the organization to manage its OT endpoints. It must take back control from vendors and ensure configurations are hardened, network devices are updated and properly configured, users and accounts are cleaned up, etc. These endpoint actions are an example of why risk detection is not enough, and why organizations must close the loop by remediating the risks.

3. Track and report on operational excellence. The great thing about securing operational environments is that the leadership and staff are comfortable with rigorous operations management. Security requires the same kind of operational excellence as manufacturing or supply chain. Foundational to operations management is tracking and reporting performance on critical metrics. Whether it be “red to green” dashboards or percent-complete figures, a strong risk reduction program establishes clear metrics and monitors them over time. This reporting should also include who is responsible for each metric. In security, this requires having uncomfortable conversations with operational leaders about their personal responsibility for maintaining and improving the overall risk profile of the OT systems.
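To make the three steps concrete, here is a minimal Python sketch of how they could fit together: each finding gets a likelihood-times-impact score, snapshots are taken over time, and progress is reported per owner. Everything in it is an assumption made for illustration, not a prescribed method: the 1-to-5 scales, the color thresholds, the asset names, the owner names and the helper functions are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    """One open risk on an OT asset (all fields are illustrative)."""
    asset: str
    issue: str
    likelihood: int  # 1 (rare) to 5 (almost certain), assumed scale
    impact: int      # 1 (negligible) to 5 (severe), assumed scale
    owner: str       # person or team accountable for remediation

    @property
    def score(self) -> int:
        # Simple likelihood x impact product; real programs may weight differently.
        return self.likelihood * self.impact


def status(score: int) -> str:
    """Map a score to a "red to green" dashboard color (thresholds are assumptions)."""
    if score >= 15:
        return "red"
    if score >= 8:
        return "yellow"
    return "green"


def snapshot(day: date, findings: List[Finding]) -> dict:
    """Summarize open risk at one point in time, grouped by owner."""
    by_owner: Dict[str, int] = {}
    for f in findings:
        by_owner[f.owner] = by_owner.get(f.owner, 0) + f.score
    return {
        "date": day.isoformat(),
        "total_score": sum(f.score for f in findings),
        "red_count": sum(1 for f in findings if status(f.score) == "red"),
        "by_owner": by_owner,
    }


def percent_reduced(baseline: dict, current: dict) -> float:
    """Percent reduction in total risk score since the baseline snapshot."""
    if baseline["total_score"] == 0:
        return 0.0
    delta = baseline["total_score"] - current["total_score"]
    return round(100.0 * delta / baseline["total_score"], 1)


# Hypothetical findings from an initial assessment.
before = [
    Finding("hmi-01", "unpatched OS", 4, 5, "plant-a-ops"),
    Finding("plc-07", "default credentials", 3, 5, "plant-a-ops"),
    Finding("switch-02", "insecure configuration", 2, 3, "network-team"),
]
# After remediation: the HMI is patched and the default credentials are removed.
after = [
    Finding("hmi-01", "unpatched OS", 1, 5, "plant-a-ops"),
    Finding("switch-02", "insecure configuration", 2, 3, "network-team"),
]

baseline = snapshot(date(2021, 10, 1), before)
current = snapshot(date(2021, 10, 20), after)
print(baseline["total_score"], current["total_score"], percent_reduced(baseline, current))
```

The point of the sketch is the loop rather than the arithmetic: the score only moves when a remediating action changes the state of an asset, and every number on the report traces back to a named owner.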

Three elements for effective threat response

1. Defined response process and plan. Incident response plans appear in almost every cybersecurity standard because they are critical to stopping a potential attack in real time. But many incident response processes stop at a set of high-level procedures or policies, such as whom to call when someone sees an issue, how to communicate with authorities and whom to use as an incident response vendor.

In the past several months, the industry has seen first-hand that incident response plans need to be much more detailed and specific to the individual IT/OT environment. The Colonial Pipeline event highlighted the risks of limited response planning in the OT environment. The company's response to the ransomware involved shutting down operations. This may have been a necessary step, but the key to a strong incident response plan is to identify the least disruptive response (LDR) for each threat. The LDR is built by understanding the specifics of the OT risk posture (part of step one of reducing risk). To define the least disruptive response, the organization needs visibility into the risk posture of each asset and the knowledge to reduce the impact of different types of threats. This goes beyond the paper-based “who to call”.

2. “X”-dimension detection. “XDR” is a growing security industry buzzword for the broad telemetry required to contain modern threats. In OT, “XDR” is often thrown out because of the risks from automated response actions. But we should not throw out the concept of “X-dimensional” detection. This refers to gathering a wide set of data from the OT systems – endpoint logs, user behavior, network flows, firewall logs, even physical process alarms – and using integrated analysis to identify potential threats. In the IT world, no security leader would accept a single form of telemetry such as packet inspection as the answer to detection. Those in OT shouldn’t either.

Integrating these various forms of telemetry also reduces the false positives that cripple the SOC teams and keep them from responding to the most critical alerts.

3. OT-safe, rapid, least disruptive response. As mentioned, organizations need plans for the least disruptive response (LDR). But they also need to implement response actions in a rapid but operationally safe fashion. A response plan is only as good as an organization’s ability to execute it in the heat of the moment. The plan should be backed by the people, processes and technology that allow the security team (including both security experts and industrial process experts) to take the security actions necessary to stop the threat. These would include removing a specific user, changing passwords, closing certain ports and services, patching a system, etc. Too often in the world of OT, these steps are manual or require vendor involvement in the middle of an event. For rapid response, the industry needs the ability to take targeted response actions when necessary.

These response actions should be managed by a team of security and operational personnel. Unlike in IT, where automated response is becoming the norm, we believe that response in OT requires a human to review the potential threat as well as the potential negative operational impacts before executing the action. We call this the “Think Global: Act Local” approach.
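The three elements above can be sketched in a few lines of Python: corroborate an alert across more than one telemetry source, pick the least disruptive playbook action that contains the threat, and hold that action until a human approves it. This is only a sketch under stated assumptions. The playbook entries, disruption scores, threat labels, asset names and function names are all hypothetical, and a real playbook would be specific to each site and asset.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Signal:
    """One observation about an asset from one telemetry source."""
    asset: str
    source: str  # e.g. "endpoint_log", "network_flow", "process_alarm"
    detail: str


@dataclass(frozen=True)
class ResponseAction:
    name: str
    disruption: int      # 1 (minimal) to 5 (shut down operations), assumed scale
    contains: frozenset  # threat types this action is assumed to contain


# Hypothetical playbook, ordered here only for readability.
PLAYBOOK = [
    ResponseAction("disable_user_account", 1, frozenset({"credential_misuse"})),
    ResponseAction("close_port_or_service", 2,
                   frozenset({"lateral_movement", "ransomware"})),
    ResponseAction("isolate_network_segment", 4,
                   frozenset({"lateral_movement", "ransomware"})),
    ResponseAction("shut_down_operations", 5,
                   frozenset({"credential_misuse", "lateral_movement", "ransomware"})),
]


def corroborated_assets(signals: List[Signal], min_sources: int = 2) -> List[str]:
    """Return assets flagged by at least `min_sources` distinct telemetry sources."""
    sources: Dict[str, set] = {}
    for s in signals:
        sources.setdefault(s.asset, set()).add(s.source)
    return sorted(a for a, srcs in sources.items() if len(srcs) >= min_sources)


def least_disruptive_response(threat: str) -> Optional[ResponseAction]:
    """Pick the lowest-disruption playbook action that contains the threat."""
    candidates = [a for a in PLAYBOOK if threat in a.contains]
    return min(candidates, key=lambda a: a.disruption) if candidates else None


def execute(action: ResponseAction, asset: str, approved_by: Optional[str]) -> str:
    """Refuse to act without a named human approver."""
    if not approved_by:
        return f"PENDING: {action.name} on {asset} awaits operator review"
    return f"EXECUTED: {action.name} on {asset} (approved by {approved_by})"


signals = [
    Signal("hmi-01", "endpoint_log", "new local admin account"),
    Signal("hmi-01", "network_flow", "SMB traffic to unfamiliar host"),
    Signal("plc-07", "network_flow", "single failed connection"),
]
flagged = corroborated_assets(signals)
action = least_disruptive_response("lateral_movement")
print(execute(action, flagged[0], approved_by=None))
print(execute(action, flagged[0], approved_by="shift-supervisor"))
```

In this sketch, shutting down operations remains in the playbook but is selected only when no less disruptive action contains the threat, and nothing runs until a named person signs off.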

Organizations are reacting to the emerging threats to OT security and beginning to take action. This is great news. However, everyone needs to step back to determine what the overall objectives are and how to ensure everyone is actually making progress against the two key elements – risk reduction and threat response – before taking actions that may not lead to true security improvement.

– This originally appeared on Verve Industrial’s website. Verve Industrial is a CFE Media content partner.

Author Bio: John Livingston leads Verve Industrial's mission to protect the world’s infrastructure. He brings 20+ years of experience from McKinsey & Co., advising large companies in strategy and operations. Recognizing the challenges of greater industrial connectivity, John joined Verve Industrial to help companies find the lowest cost and simplest solutions to their controls, data and ICS security challenges.