How to protect embedded systems in OT cybersecurity
Understanding cybersecurity issues in embedded devices is a complicated, ongoing process, but it's well worth doing ... and doing right. Here are a few steps to get you started on the journey.
Many wonder where to start when securing embedded systems. Here are two pieces of advice that can help guide you through cybersecurity issues for embedded devices and the never-ending process of acquiring knowledge:
- Reflect on what you learned yesterday, a week ago and then a month ago. First, you need to give yourself credit. Second, know that tomorrow you might restart that entire journey, but today, you clearly know more than you did yesterday.
- Be comfortable being uncomfortable. It’s not easy, especially with embedded systems we could knock offline or brick, but do not be afraid to learn, experiment with easier layers of the “art” and build up confidence. Everyone has to start somewhere.
Consider these six questions you should be asking about protecting your embedded systems.
1. How do we know about vulnerabilities in firmware we use? Is there a distribution list we can subscribe to?
Unfortunately, the answer is complicated when it comes to defining, reporting and matching products to a vulnerability disclosure (e.g., a common vulnerabilities and exposures, or CVE, entry). Academics try to tackle the challenge of mapping a CVE to a common platform enumeration (CPE) product identifier, but for the end user, the simple, straightforward answer is to follow feeds:
- Follow the CISA ICS-CERT advisories page.
- Follow your country’s advisory page.
- Follow various product vendor security portal pages.
- Sign up for email distribution lists where possible for your various product vendors.
- And subscribe to experts on social media or through a service.
Pro tip: Make the Industrial Control Systems Cyber Emergency Response Team (ICS-CERT) advisories page and others like it your browser’s home pages, so they come up whenever you restart your browser of choice (e.g., Chrome or Firefox).
Assuming you have an asset inventory to cross-reference, or a platform that receives frequent common vulnerabilities and exposures (CVE) updates, you should see new vulnerabilities or notices matched against the assets you own, which you can then track however you wish.
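As a minimal sketch of that cross-referencing step, the Python below matches advisory entries against an asset inventory. The `Asset` and `Advisory` records, the vendor and model names, and the advisory ID are all hypothetical, and matching on exact firmware strings is a simplifying assumption. Real feeds describe affected versions in many different ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    vendor: str
    model: str
    firmware: str


@dataclass(frozen=True)
class Advisory:
    advisory_id: str
    vendor: str
    model: str
    affected_firmware: frozenset  # exact version strings (simplifying assumption)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so 'PLC  100' matches 'plc 100'."""
    return " ".join(text.lower().split())


def match_advisories(assets, advisories):
    """Return (asset name, advisory id) pairs where vendor, model and firmware match."""
    hits = []
    for asset in assets:
        for adv in advisories:
            if (normalize(asset.vendor) == normalize(adv.vendor)
                    and normalize(asset.model) == normalize(adv.model)
                    and asset.firmware in adv.affected_firmware):
                hits.append((asset.name, adv.advisory_id))
    return hits


# Hypothetical inventory and feed entries, for illustration only.
inventory = [
    Asset("pump-plc-01", "ExampleCorp", "PLC 100", "1.2.0"),
    Asset("pump-plc-02", "ExampleCorp", "PLC 100", "1.4.1"),
]
feed = [
    Advisory("EXAMPLE-ADV-0001", "examplecorp", "plc 100",
             frozenset({"1.2.0", "1.3.0"})),
]

for asset_name, advisory_id in match_advisories(inventory, feed):
    print(f"{asset_name} is affected by {advisory_id}")
```

The match is only as good as the inventory: without an accurate model and firmware version for each asset, nothing here can fire.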
2. How does an organization identify ICS devices, and automate a “live” inventory of the ICS devices to drive vulnerability management?
As mentioned above, there is a need for a comprehensive list of the assets you have deployed and in scope. This can be done in a variety of ways, including through passive detection of devices via their network presence (aka transmitted packets), examining proprietary original equipment manufacturer (OEM) files that contain information about the asset’s deployment, and polling the devices in a safe and validated fashion.
While there has been a big, industrywide focus on using passive solutions over the past several years, decision makers need to understand the limitations of passive detection before thinking these solutions will result in meaningful risk reduction.
In practice, an organization identifies industrial control system (ICS), industrial automation and control system (IACS) and operational technology (OT) devices through a process that works best when the environment is modern and has only IP-based assets. A generic, time-consuming and naïve method would be:
- Collect and aggregate an understanding of the network and the assets on it (even if this is done by collecting spreadsheets).
- Revise with asset owners and limited tests (e.g., offline data, or “a single ping, please” method).
- Use OEM tools, screenshot, scrape, and validate/cross reference obtained information.
- Periodically repeat and update, and/or do it whenever assets are introduced/retired.
Seems painful, right? It is. At the end of the day, you need more than just a media access control (MAC) address, an IP address and a vendor organizationally unique identifier (OUI) match to really make sense of the asset’s attributes.
The detailed information you need for understanding an embedded ICS asset is generally not present in most traffic unless specifically queried, and potentially transmitted under specific conditions (e.g., startup). Merely looking at packets as they fly by in a steady-state facility will provide little insight and be full of false positives (or missed assets hiding behind a transmission control protocol/serial gateway for example).
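To illustrate how little a passive observation gives you, here is a small sketch that derives the OUI from a MAC address and then lists what is still unknown about the asset. The `OUI_VENDORS` table, the required attribute names and the sample record are hypothetical. A real deployment would load the public IEEE registry and use its own inventory schema.

```python
import re

# Hypothetical OUI-to-vendor table for illustration only.
OUI_VENDORS = {"02:AA:BB": "ExampleCorp"}

# Attributes assumed (for this sketch) to be needed for vulnerability management.
REQUIRED_ATTRIBUTES = ("vendor", "model", "firmware", "serial_number")


def oui_of(mac: str) -> str:
    """Return the first three octets of a MAC address as 'XX:XX:XX'."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def missing_attributes(record: dict) -> list:
    """List the required attributes that are absent or empty in a record."""
    return [key for key in REQUIRED_ATTRIBUTES if not record.get(key)]


# What a passive sensor might know about one device: MAC, IP, vendor guess.
observed = {"mac": "02-aa-bb-10-20-30", "ip": "10.10.0.15"}
observed["vendor"] = OUI_VENDORS.get(oui_of(observed["mac"]))

print(missing_attributes(observed))  # model, firmware and serial are still unknown
```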
Could it be done on the cheap? Yes, but it’s difficult because you need to know:
- Precisely how each of those ICS devices communicates.
- The exact commands/requests to retrieve the relevant data.
- How to parse/transform the results into something usable.
- How to programmatically get the result to the consumer.
It is not trivial for non-commodity systems, even with open-source tools. Do not fall into the trap of “Oh, but Modbus is openly available, and so is EtherNet/IP (which carries CIP).” Sure, the specs are widely published, but just because you know the specs does not mean you know how to account for local “dialects” and device idiosyncrasies.
The edge-cases will break you, and when you wind up with a nonstandard protocol, you may wish you saved yourself the complexity by working with a company that spends the time to understand the devices or has the knowledge/experience to do so.
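To make the four requirements above concrete (how the device communicates, the exact request, parsing the result and handing it to a consumer), here is a sketch for one well-documented case: the Modbus "Read Device Identification" request (function code 0x2B, MEI type 0x0E) carried over Modbus TCP. It only builds and parses bytes and sends nothing on a network. The sample response, including the vendor and product strings, is made up. Many real devices do not implement this function or answer it in nonstandard ways, which is exactly the "dialect" problem described above.

```python
import struct

# Basic identification objects defined by the Modbus specification.
OBJECT_NAMES = {0x00: "vendor_name", 0x01: "product_code", 0x02: "revision"}


def build_read_device_id_request(transaction_id: int, unit_id: int) -> bytes:
    """Build a Modbus TCP frame asking for the basic device identification."""
    # PDU: function 0x2B, MEI type 0x0E, read code 0x01 (basic), first object 0x00
    pdu = bytes([0x2B, 0x0E, 0x01, 0x00])
    # MBAP header: transaction id, protocol id (0), length (unit id + PDU), unit id
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_read_device_id_response(frame: bytes) -> dict:
    """Parse a Modbus TCP response frame into a {name: text} dictionary."""
    if len(frame) < 9:
        raise ValueError("frame too short")
    function = frame[7]
    if function & 0x80:  # high bit set means the device returned an exception
        raise ValueError(f"device returned Modbus exception code {frame[8]}")
    if len(frame) < 14 or function != 0x2B or frame[8] != 0x0E:
        raise ValueError("not a Read Device Identification response")
    count, offset, result = frame[13], 14, {}
    for _ in range(count):
        if offset + 2 > len(frame):
            raise ValueError("truncated object header")
        object_id, length = frame[offset], frame[offset + 1]
        value = frame[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated object value")
        name = OBJECT_NAMES.get(object_id, f"object_{object_id:#04x}")
        result[name] = value.decode("ascii", errors="replace")
        offset += 2 + length
    return result


def _encode_object(object_id: int, text: str) -> bytes:
    data = text.encode("ascii")
    return bytes([object_id, len(data)]) + data


# Made-up response from a hypothetical device, for illustration only.
sample_pdu = (bytes([0x2B, 0x0E, 0x01, 0x01, 0x00, 0x00, 0x03])
              + _encode_object(0x00, "ExampleCorp")
              + _encode_object(0x01, "PLC-100")
              + _encode_object(0x02, "1.2.0"))
sample_frame = struct.pack(">HHHB", 1, 0, len(sample_pdu) + 1, 1) + sample_pdu

print(build_read_device_id_request(1, 1).hex())
print(parse_read_device_id_response(sample_frame))
```

Even in this friendly case, the parser needs explicit handling for exception responses and truncated objects. Any polling of a live device should first be validated on an offline or noncritical unit.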
3. What are methods for safe discovery and common ways to mitigate vulnerabilities in legacy devices?
Obviously, that is a loaded, two-part question, but there are a few paths to the discovery of vulnerabilities:
- Leverage your security feeds and build up your own understanding of a specific system under consideration (SUC in ISA/IEC 62443 language). There is a good chance you can obtain relevant information from a sibling product from the same vendor or another vendor, which would allow you to begin forming hypotheses.
- Make intelligent observations using an offline system or a system that is well understood in a NONCRITICAL process/function. This means talking to it over Telnet, looking at the files on the SD card, etc. It is a lot of work, and it is a knowledge-based journey. However, that is only a mile wide and an inch deep. Real knowledge is needed to understand a variety of issues specific to that product, the embedded hardware, the OS (and how it works), and more.
- Read product documentation for hints that are not specifically written in security language. For example, this product will break if someone does this, or we have to use credentials to stop X or disable Y functionality.
- Use passive methods to find vulnerabilities. For example, what can you see with Wireshark when using OEM tools to talk to a device?
- Test if you have an appropriate setup and organizational commitment to dealing with devices you may break in the process. Common methods to test would be to use protocol fuzzers and automated but highly monitored, step-by-step test stacks, or apply tribal knowledge and other technical skillsets.
If you find an issue, it is repeatable, and looks like it is not addressed by a disclosed public CVE or through a vendor application note, congrats. Your next step would be to report it to the right channel (see CISA CVD process).
Mitigating and treating the risks around several legacy devices can be a complicated process. From an asset owner’s perspective, you may wish to:
- Have appropriate policy and guidance around legacy devices.
- Ensure you could do a full restore start to finish for them (including taking them out of physical inventory and programming them).
- Reduce the attack surface on the device itself by wisely disabling functionality.
- Prevent network access to vulnerable devices or segments (e.g., zones and conduits).
- Have in place a detailed asset management strategy that monitors for changes on the device versus looks at out-of-band project files that are likely out of date.
- Have a plan and practice restoration of last known good configurations and compiled logic for embedded systems.
- Have a vulnerability management program in place that tracks vulnerabilities and potential firmware upgrades to embedded systems.
- Prepare and test your incident response procedures.
- Adequately protect the privileged systems that interact with vulnerable devices or their network segments. Chances are an attacker will use one of those to execute OEM functionality and affect your process versus targeting the device directly.
Regardless, not all of the above apply specifically to the embedded devices we consider vulnerable or insecure, but they are compensating controls designed to add additional layers of protection to deter or alleviate risk to a tolerable level.
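Two of the items above, monitoring the device itself for change and keeping a last known good configuration, can be sketched as a comparison between a stored baseline and a fresh snapshot. How a snapshot is collected from a given device is out of scope here. The field names and values below are hypothetical.

```python
import hashlib
import json


def fingerprint(snapshot: dict) -> str:
    """Hash a snapshot in a canonical form so key order does not matter."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_fields(baseline: dict, current: dict) -> dict:
    """Return {field: (baseline value, current value)} for every difference."""
    keys = sorted(set(baseline) | set(current))
    return {
        key: (baseline.get(key), current.get(key))
        for key in keys
        if baseline.get(key) != current.get(key)
    }


# Hypothetical last known good record and a later reading from the same device.
last_known_good = {"firmware": "1.2.0", "logic_checksum": "a1b2", "mode": "RUN"}
latest_reading = {"mode": "RUN", "logic_checksum": "ffff", "firmware": "1.2.0"}

if fingerprint(last_known_good) != fingerprint(latest_reading):
    for field, (old, new) in changed_fields(last_known_good, latest_reading).items():
        print(f"{field} changed: {old!r} -> {new!r}")
```

Comparing against what the device reports, rather than against a project file on an engineering laptop, is the point of the exercise. The baseline also doubles as the reference for a restore.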
4. In embedded systems, are there firmware configuration features that can be used for least functionality, least privilege, management of change, etc.?
Embedded devices are not typically open platforms (even though some may run Linux), and they vary from vendor to vendor, or even from product to product. However, if you wear your MacGyver hat, you might occasionally apply functionality within these devices to provide controls. For example, using programmable logic controller (PLC) programming best practices, not allowing Modbus to write to them, or changing the default passwords and monitoring logs.
The reality is that in many cases, these devices and those that are even older require a higher level of attention as well as some creativity. For management of change, the question is how do you gather information from the device and actively monitor it for change? Thankfully, there are solutions for that, but they likely are not on the device itself.
On the other hand, many devices have some level of role-based access control (RBAC), and you can set the usernames and passwords. But you should also protect them from physical access, set the “run key” to read-only and secure the privileged workstations that have the OEM software, which can usually communicate with these systems at will. The latter point highlights another aspect: by not using default credentials, you might buy yourself precious seconds to detect anomalous activities and make it a bit harder for an automated script or program to accidentally stumble into an embedded system. That assumes, of course, that you are monitoring and alerting/acting upon the logs on surrounding systems.
In many cases, you may have a huge fleet of already-deployed devices, and merely changing the credentials requires a shutdown or scheduled maintenance window. Utilizing RBAC functionality in existing deployments may require some thought.
Another option is to use industrial firewalls that apply access controls at the protocol level, but that is not specifically an answer to this question. Instead, appropriate firewalls are an additional compensating control (to be used to secure the conduit/zone) and require onboarding into your asset management system.
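As a rough illustration of what "access controls at the protocol level" means, the sketch below decides whether a Modbus TCP frame would be permitted under a read-only policy by inspecting its function code. The policy (allow only the four standard read functions, deny everything else) is an assumption made for this example and not a description of any particular firewall product.

```python
# Standard Modbus read functions: coils, discrete inputs,
# holding registers and input registers.
READ_ONLY_FUNCTIONS = {0x01, 0x02, 0x03, 0x04}


def is_permitted(frame: bytes) -> bool:
    """Return True only for well-formed Modbus TCP frames that read data."""
    if len(frame) < 8:  # 7-byte MBAP header plus at least a function code
        return False
    protocol_id = int.from_bytes(frame[2:4], "big")
    if protocol_id != 0:  # Modbus TCP always uses protocol id 0
        return False
    return frame[7] in READ_ONLY_FUNCTIONS  # default deny for everything else


# Read Holding Registers (0x03): 10 registers starting at address 0.
read_request = bytes.fromhex("00010000000601030000000a")
# Write Single Register (0x06): value 3 to address 1.
write_request = bytes.fromhex("000200000006010600010003")

print(is_permitted(read_request), is_permitted(write_request))
```

A default-deny list is used so that write, diagnostic and vendor-specific function codes are blocked without having to enumerate them.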
5. Can I “patch” firmware for OT/ICS devices such as RTU, IED, and PLCs?
Another question that starts with the typical engineer response – it depends. Unfortunately, embedded systems, when compared to their IT/OT brethren, have challenges in the upgrade/hotfix department (see the Embedded Devices & Firmware in OT whitepaper). But even if a fix is present, there are some operational issues you need to understand:
- What is in the update? What does it address? Does the fix improve stability? Solve a critical flaw? Add functionality?
- Is this update necessary? Does it need to be deferred? Immediately deployed? Never applied?
- Will this firmware update have a possibility of failure? If so, what is my rollback plan?
- Will this firmware update require a scheduled outage and stable power? If so, you need to make sure it’s scheduled, and all change management processes are followed.
- Will this firmware update require a specific process that must be followed as per the OEM guidance? If so, make sure the standard operating procedure (SOP) is followed in accordance with the OEM’s and your organization’s requirements.
If you still deem the update to the device as relevant and necessary in accordance with your organization’s criteria, then yes, patching OT devices is feasible. But it won’t be a frequent activity like the patching programs for perimeter firewalls or commodity Windows assets. Again, that is why it is critical to consistently and frequently monitor assets directly to determine changes in your organization’s deployed asset inventory.
Patching or performing firmware updates is possible and feasible on embedded systems should an update be available that fits your organization’s criteria. It should not be solely relied upon in the same way as in IT, but rather used to ensure stability and security of the processes and communication it is supporting.
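One way to keep the operational questions above from being skipped is to record them with each proposed firmware change and list what is still outstanding. The sketch below assumes a hypothetical `FirmwareChange` record. The field names are invented for illustration and would need to be mapped onto your own change management process.

```python
from dataclasses import dataclass


@dataclass
class FirmwareChange:
    device: str
    release_notes_reviewed: bool = False   # what is in the update and why
    necessity_decided: bool = False        # deploy now, defer or never apply
    rollback_plan_tested: bool = False     # what happens if the update fails
    outage_scheduled: bool = False         # maintenance window and stable power
    oem_procedure_attached: bool = False   # OEM guidance and internal SOP


def outstanding_items(change: FirmwareChange) -> list:
    """Return the checklist items that still block this firmware change."""
    checks = [
        ("review release notes", change.release_notes_reviewed),
        ("decide whether the update is necessary", change.necessity_decided),
        ("test the rollback plan", change.rollback_plan_tested),
        ("schedule the outage", change.outage_scheduled),
        ("attach the OEM procedure", change.oem_procedure_attached),
    ]
    return [label for label, done in checks if not done]


# Hypothetical change request for illustration only.
change = FirmwareChange("pump-plc-01", release_notes_reviewed=True,
                        necessity_decided=True)
print(outstanding_items(change))
```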
Patching commodity OT systems such as embedded routers closer to the perimeter (e.g., Cisco’s) is a far less concerning task, and they should likely be prioritized given your organization’s dependence on them for security. However, you must still abide by appropriate risk and change controls. Do not forget to include networking infrastructure in your asset inventory.
6. How can I change the culture to create a better cadence for patching and updates?
There are several ways to effect change positively, whether outside in the community or within your organization. Usually, it’s done through human-to-human means initially.
To improve culture, start by demonstrating that it’s possible by making a thoughtful but rigorous engineering-based organization that considers all aspects versus outright denying them. One potential idea is to implement that culture beginning at the interview stage in the hiring process, so you can instill what is to be expected early instead of allowing employees to be “poisoned” by current values or attitudes. This way, you grow a team of positive people, but it’s also a step toward taking an active part in changing security culture.
Another piece of advice is to champion patches and updates for “low-hanging fruit” or low-risk systems. Build a reputation for following process, and be rigorous, detail oriented, understanding of your environment and accepting of insight from local site owners who have a metric ton of “tribal knowledge.” Build relationships around trust. If you stick with it, have adequate support and don’t rush into things while taking one victory at a time, you can change the patching culture and cadence for a lot of systems.
Also: Do not stop patching/maintaining/securing systems or their surrounding adjacent systems. Security degrades over time, and it consistently needs attention or it will rot. You still change the oil in your car, even if newer, better versions of it exist. Otherwise, it will degrade and then catastrophically fail.
The other aspect to this question is how to combat cybersecurity issues today with an eye on the future, while also influencing vendors to create more secure products. Here are some ideas:
- Make investments in security that enable and multiply the effectiveness of other technologies, processes and your resources’ time.
- Invest in getting your organization to a sufficient level of cybersecurity maturity, and maintain it through consistently applying the basics. They have been proven to provide a measurable effect on residual risk and often provide a reasonable level of protection.
- Add security language to requests for proposal (RFP), and validate any claims by the vendor before moving to an installation.
- Use the more secure options in products, and be a part of the group that uses them. Don’t allow devices to be set up as is and left. The best chance to add security is early on when deploying systems/replacements. Transition systems away from using insecure options or defaults as you get to them (or as they retire).
- Ensure you have security processes that require (and validate) cybersecurity as part of factory acceptance testing (FAT) and site acceptance testing (SAT), sometimes called cybersecurity factory acceptance testing (CFAT) and cybersecurity site acceptance testing (CSAT).
- Report issues to vendors or CERTs. The common vulnerability scoring system (CVSS)/CVE system has flaws, but awareness forces companies to fix issues and forces your organization to manage them (especially if you are in a compliance-oriented environment such as one that needs to abide by NERC CIP).
- Create end-to-end tested processes for overall cybersecurity governance and adequate training plus live-fire exercises to ensure your resources are adequately prepared to reduce disruption and restore assets in the eventuality of an event.
- Be engaged in community efforts, and ask hard questions to vendors and industry experts. Try to be a part of the change versus a stonewall. Also consider experiences and how other industries are doing things. Chances are, there is prior art we can use in the ICS/OT world.
Creating change requires shifting culture, changing the relationship demeanor with vendors and holding people accountable (but not punishing them). This is obviously a huge lift and it will be continuous, but it is a marathon, not a sprint. So don’t wear yourself out, and try to make wise choices that have impacts.
– This article originally appeared on Verve Industrial’s website. Verve Industrial is a CFE Media content partner. Edited by Gary Cohen, Senior Editor/Project Manager, CFE Media and Technology, email@example.com.