SCADA HMI software security: Understanding and preventing SCADA viruses, intentional and unintentional

Cyber security: Exploring some of the technical concepts will help end users understand and prevent security flaws when creating SCADA and HMI software applications and the underlying network architectures.

By Marcos Taccolini July 5, 2015

In July 2010, fiction reached right out of the movies and became reality when Stuxnet, the first known supervisory control and data acquisition (SCADA) virus, came to public attention after being used to disrupt and damage part of the Iranian nuclear infrastructure. Knowledge of that event stayed largely within the automation community, but after the data breaches at well-known brands such as Sony and Target, cyber security awareness went mainstream in corporations.

Along with interruptions caused by intended threats, unintentional misconfiguration or inadequate operation also creates security flaws. Current systems are highly integrated and composed of many layers of software, and in most cases the people who create and operate these systems do not have a deep understanding of how those layers interact.

Creating scenarios for a cyber-attack

The high level of integration in distributed systems, and the complexity and many layers of software in current automation and information technology systems, create favorable conditions for malicious threats, as seen in recent years.

But when it comes to the security of SCADA installations, it is not only intended threats that must be prevented. The complex infrastructure and the many software layers in place are not completely known by those who create and operate the automation in plants. Therefore, unless good standards are in place, it is possible and even likely that operation errors and inadequate maintenance and update procedures will affect system reliability and stability, effectively creating a non-intentional virus on the system or opening doors to those with malicious intent.

To systematically identify potential threats and avoid them, there needs to be a verification matrix. In one dimension, a user needs to protect the software and IT infrastructure on which the execution of the SCADA human-machine interface (HMI) software relies; in the other dimension, the automation project itself needs to be analyzed, specifically the SCADA/HMI project configuration and operation.

The first dimension of the security analysis is the basic hardware, software and network infrastructure. It deals with operating system security patch update procedures, identification and password protection, networking and so on. This analysis is not specific to SCADA or HMI systems and should follow established IT standards, for which many references are already available. Therefore the focus here is on the other dimension: the development of the software automation project itself, including the SCADA HMI configuration, execution, and deployment.

Intentional, unintentional threats

The need to prevent intentional threats is critical, not only because of the political scenario of external enemies tampering with production infrastructure, but also for a much simpler reason: if it is possible to create a threat, then someone will, even if only for the technical challenge. Many computer viruses were created not for profit or for specific attacks, but to satisfy someone’s ego.

Programming skills, software and communication technologies have advanced over the past 15 years far beyond the updates that most automation software tools and products have provided. Even users who upgraded to the latest version of their tools were not protected, as most of those solutions are based on legacy technologies that any tech-savvy teenager could break.

Non-intentional threats

With the increased complexity of systems, it may be impossible to simulate every imaginable scenario during site commissioning tests, so quality and operational stability must be ensured through unit testing and through intrinsically safe technologies and architectures. The fact is that many systems are not actually employing newer and more secure technologies; instead, they create “wrappers” and layers of code and modules around their legacy code and technologies to protect their investment in them.

This scenario presents great risk and is potentially unsafe. It leaves many core and kernel components extremely vulnerable, and it adds potential random problems when legacy components run on new computers, with new operating systems, new networks and new operational procedures. In this environment errors are likely to occur, perhaps not during normal operation when they would be easily detected, but exactly when it’s most important that the supervisory control system not fail: during high-activity stress or abnormal process situations, network or computer failures, multiple alarms, execution of previously unexecuted error-path or system-recovery code, or incorrectly executed commands. End users can expect those potential errors to manifest at precisely the worst possible moment: when an abnormal process situation has to be handled in a mission-critical application.

An example of a non-intentional system break occurred in an application where, during the night shifts, communications between the SCADA-HMI and critical PLCs would shut down randomly, causing the process to stop. The system had been running perfectly for many months, and the problems started with no modification to the project configuration, so a virus or intentional sabotage was suspected.

After long nights monitoring the operators and the process, those involved learned that someone had enabled the screen saver on the operator’s computer. Because of a hardware problem on that specific computer, turning off the display also turned off the 8250 chip used for RS-232 communication, in a way that only a power-down and power-up could restore the RS-232 port. The problem was neither random nor malicious: it happened on the night shifts when the operator left his position long enough for the screen saver to activate. The IT technician who enabled the screen saver to protect the screen and save some energy was unaware of the side effects on that hardware.

This example shows the typical steps in non-intentional threats: (a) the operators and IT technicians did not have full knowledge of the many layers of complexity in the interaction between automation software and hardware, and that is as it should be, since they cannot be expected to know it all; (b) minor system or environment modifications or changes in operating procedures, which were supposed to be completely harmless, caused unwanted side effects; (c) side effects can propagate into bigger problems because of undetected latent errors and unpredictable situations in the connected layers; (d) the resulting error or problem can stay undetected for a while, or appear to behave randomly.

Project cycle security

To systematically identify potential threats, analyze the whole project cycle, which in this simplified model is composed of four steps:

  1. Selection of technologies, architecture and tools;
  2. Project configuration and programming;
  3. Deployment and commissioning;
  4. Operation and maintenance.

Learning more about each stage helps incorporate security in the project. Step 3, deployment and commissioning, is one of the most vulnerable areas for most systems.

1. Technology, architecture security

In the same way that someone can’t just add air bags, sensors, and other technologies as accessories to a car built in the 1980s to make it safer for the driver and passengers, it is necessary to renew the software foundations used in current automation systems to make them secure for today’s operating environments. This renewal will also deliver the potential benefits of current technologies that are not being leveraged by solutions based on legacy software kernels.

Some legacy technologies are intrinsically unsafe and should be avoided: interpreted scripts created in VBScript and VBA; proprietary scripting languages; C or C++, because of unchecked pointers and the lack of memory protection; ActiveX components; COM and DCOM; and open TCP/IP sockets. Preferred technologies include compiled and memory-protected languages for scripting, such as C# and VB.NET; Web clients with pure Internet technologies (with “security sand-box” or “partial-trust”); WCF (Windows Communication Foundation); Web services; and SQL databases.

2. Project configuration, programming security

Good practices in project configuration and operations include centralization of project configurations in a SQL database or server, allowing distributed secure access, built-in change-management and version control when updating the project, and the ability to remotely run diagnostics, fully test new applications, and update the system without interrupting operations.

The old paradigm for testing a project was simply to run the application. Current standards require a higher level of validation during the configuration phase, and the use of specific simulation, profiling and performance analysis tools.

The potential problems caused by a virus or by a random application error are similar. The math of code coverage analysis teaches that it is impossible to ensure reliability by testing scenarios alone. For example, exhaustively testing an application with only 10 independent “IF-Then-Else” decisions would require running 2^10 = 1,024 scenarios. Therefore security and reliability should be embedded in the architecture, technologies and programming procedures, not added later through brute-force testing or external wrappers. The system should also have built-in tracking and version management, so the tool itself automatically logs any configuration change.
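The 1,024 figure can be checked with a short sketch. The Python below is illustrative only: `count_scenarios` is a hypothetical helper, and it assumes every decision is a two-way branch that is independent of the others.

```python
from itertools import product


def count_scenarios(decisions: int) -> int:
    """Count every true/false combination of independent two-way decisions."""
    # product() enumerates each possible assignment of outcomes to the decisions,
    # which is exactly the set of scenarios an exhaustive test would have to run.
    return sum(1 for _ in product((True, False), repeat=decisions))


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(f"{n} decisions -> {count_scenarios(n)} scenarios")
```

Each added decision doubles the count, so 20 decisions already require more than one million scenarios, which is why exhaustive scenario testing does not scale to a real application.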

3. Deployment, commissioning security

Deployment is one of the most vulnerable areas for security issues, and it has been exploited in many of the recent viruses.

Most systems still in production in the industrial environment are based on software technologies from the 1980s or 1990s, in which the SCADA/HMI software packages rely on hundreds of independent configuration files and communication driver dynamic-link libraries (DLLs) for their execution. An unintended error can easily occur, and virus programming requires only basic programming skills because, at the lowest level, it is just a matter of “dropping some files” in a folder.

Even when the individual files are encrypted or in binary form, there are no ties between those files and their originating project. Anyone can create extra files on any computer and just drop them into a folder to modify the project behavior. The intelligence needed to know which alarms to disable or which set-points to change to create a threat may be complex, but the virus programming itself requires only basic skills, and the risk of accidentally leaving or copying the wrong files is very high. Today’s new generation of tools is based on encrypted, read-only structured query language (SQL) files to ensure secure deployment of their configurations.
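One way to tie a file to its originating project is a keyed signature. The sketch below is an assumption made for illustration and does not describe any particular product: it uses HMAC-SHA256 from the Python standard library, and the names `sign_file`, `belongs_to_project` and `project_key` are hypothetical. It also assumes the project key is kept away from the production folder.

```python
import hashlib
import hmac


def sign_file(project_key: bytes, file_name: str, content: bytes) -> str:
    """Return an HMAC-SHA256 tag that binds a file name and its content to one project key."""
    # The zero byte separates the name from the content so the two cannot be confused.
    message = file_name.encode("utf-8") + b"\0" + content
    return hmac.new(project_key, message, hashlib.sha256).hexdigest()


def belongs_to_project(project_key: bytes, file_name: str, content: bytes, tag: str) -> bool:
    """Check a stored tag against the file as it is found in the deployment folder."""
    expected = sign_file(project_key, file_name, content)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"example-project-key"  # hypothetical key, for demonstration only
    tag = sign_file(key, "alarms.cfg", b"HighLevel=90")
    print(belongs_to_project(key, "alarms.cfg", b"HighLevel=90", tag))
    print(belongs_to_project(key, "alarms.cfg", b"HighLevel=99", tag))
```

A file dropped into the folder without access to the project key cannot carry a valid tag, and a file copied from another project fails the same check.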

The following is a simple checklist to reduce the threat on legacy systems and help with specification of new systems.

  • Legacy products and installations: The configuration files should be copied to new folders that are guaranteed to be empty, and the Microsoft Windows login used when running the project should have only read access to those folders. Very critical systems should have an external utility that checks the total size and checksum of the files installed for production against reference values created remotely by the system integration company.
  • Selecting and implementing new solutions: Ideally the whole project configuration should be kept in one deployment file, such as an encrypted read-only SQL database. Version management and change management should rely on built-in features, not external tools or manual procedures.
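The external size and checksum utility mentioned in the checklist could look like the following minimal sketch. It assumes SHA-256 as the checksum and a reference manifest produced at the system integrator's site; `build_manifest` and `verify_folder` are hypothetical names, not part of any SCADA package.

```python
import hashlib
from pathlib import Path


def build_manifest(folder: Path) -> dict:
    """Map each relative file path to (size in bytes, SHA-256 hex digest)."""
    manifest = {}
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            name = path.relative_to(folder).as_posix()
            manifest[name] = (len(data), hashlib.sha256(data).hexdigest())
    return manifest


def verify_folder(folder: Path, expected: dict) -> list:
    """Return a list of findings; an empty list means the folder matches the manifest."""
    actual = build_manifest(folder)
    findings = []
    for name in sorted(set(expected) | set(actual)):
        if name not in actual:
            findings.append(f"missing: {name}")
        elif name not in expected:
            findings.append(f"unexpected: {name}")
        elif tuple(actual[name]) != tuple(expected[name]):
            findings.append(f"modified: {name}")
    return findings
```

In this arrangement the integrator would run `build_manifest` on the released files, and the plant would run `verify_folder` on the production folder before each start. The "unexpected" finding is the one that catches files dropped into the folder after commissioning.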

Another area of vulnerability in legacy systems is the communication protocol driver model. In many systems, the protocol relies on external DLL files created with open toolkits that are easily replaced by other DLLs, which may break the system. In the new-generation tools, the drivers run as isolated processes without direct access to the remainder of the application and have value verifications and read-only deployments to prevent those DLLs from being modified after the commissioning is concluded.
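The isolation and value verification described above can be sketched as follows. This is a simplified assumption, not a real driver model: the "driver" is a one-line stand-in script started as a separate operating system process, and `DRIVER_SOURCE`, `validate_reading`, `read_from_driver` and the `TankLevel` tag are hypothetical.

```python
import json
import subprocess
import sys

# Stand-in for a protocol driver: a tiny script that prints one reading as JSON.
DRIVER_SOURCE = 'import json; print(json.dumps({"tag": "TankLevel", "value": 42.5}))'


def validate_reading(reading: dict, limits: dict) -> None:
    """Reject readings for unknown tags, non-numeric values or values outside limits."""
    tag = reading.get("tag")
    value = reading.get("value")
    if tag not in limits:
        raise ValueError(f"unknown tag: {tag!r}")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"non-numeric value for {tag}: {value!r}")
    low, high = limits[tag]
    if not low <= value <= high:
        raise ValueError(f"{tag} value {value} outside [{low}, {high}]")


def read_from_driver(limits: dict, timeout_s: float = 5.0) -> dict:
    """Run the driver in its own process and verify what it returns before use."""
    completed = subprocess.run(
        [sys.executable, "-c", DRIVER_SOURCE],
        capture_output=True, text=True, timeout=timeout_s, check=True,
    )
    reading = json.loads(completed.stdout)
    validate_reading(reading, limits)
    return reading


if __name__ == "__main__":
    print(read_from_driver({"TankLevel": (0.0, 100.0)}))
```

Because the driver shares no memory with the application, a faulty or replaced driver can only hand back data, and that data is checked against the configured limits before it reaches the tag database.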

4. Operations, maintenance

Good user security systems, including role and group permissions, are already available in most previous-generation systems. Most of those systems were also able to comply with regulatory requirements, such as U.S. FDA 21 CFR Part 11. The new enhanced features for operation and maintenance eliminate unsafe runtime components, such as ActiveX, and add built-in protection, change management and version control when updating the project, along with the ability to remotely run diagnostics or update the system without interrupting operations. It is also crucial to have version control (audit trails) on the projects, as well as the ability to manage multiple versions on the same computer. Data access security should be definable not only in the user interface and graphical displays but also for each tag, in the core tag database definition table.

Updating the software tools is a common cause of system interruptions and of unanticipated behavior in a running application. Modern systems should allow a new project and software tool version to be installed without removing the previous one. The execution engines of the current system and of the planned update can then run side by side on the same server, so a validation test can be run before the changes are applied to the production application.
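Tag-level data access security with an audit trail can be sketched as follows. The class and field names (`TagSecurity`, `TagDatabase`, `read_roles`, `write_roles`) are hypothetical and chosen for illustration; they assume a simple role-based model and do not represent the schema of any real tag database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagSecurity:
    """Security attributes stored with the tag definition itself."""
    name: str
    read_roles: frozenset
    write_roles: frozenset


class TagDatabase:
    def __init__(self, definitions):
        self._security = {d.name: d for d in definitions}
        self._values = {}
        self.audit_trail = []  # every write attempt is recorded, accepted or not

    def write(self, user, roles, tag_name, value):
        security = self._security[tag_name]
        allowed = bool(security.write_roles & set(roles))
        outcome = "accepted" if allowed else "denied"
        self.audit_trail.append((user, tag_name, value, outcome))
        if not allowed:
            raise PermissionError(f"{user} may not write {tag_name}")
        self._values[tag_name] = value

    def read(self, roles, tag_name):
        security = self._security[tag_name]
        if not security.read_roles & set(roles):
            raise PermissionError(f"read of {tag_name} not permitted")
        return self._values.get(tag_name)
```

Because the check sits in the tag database rather than in a display, a script, a report or a remote client is subject to the same rule as the operator screen.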
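A side-by-side validation test can be pictured as feeding the same recorded inputs to both versions and listing every disagreement. The sketch below is a simplified assumption: each "engine" is reduced to a plain Python function, and `compare_versions`, `alarm_v1` and `alarm_v2` are hypothetical names.

```python
from typing import Any, Callable, Iterable, List, Tuple


def compare_versions(
    current: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    recorded_inputs: Iterable[Any],
) -> List[Tuple[Any, Any, Any]]:
    """Return (input, current result, candidate result) for every disagreement."""
    differences = []
    for item in recorded_inputs:
        old_result = current(item)
        new_result = candidate(item)
        if old_result != new_result:
            differences.append((item, old_result, new_result))
    return differences


def alarm_v1(level: float) -> bool:
    """Version in production: alarm strictly above 90."""
    return level > 90


def alarm_v2(level: float) -> bool:
    """Planned update: alarm at 90 and above."""
    return level >= 90


if __name__ == "__main__":
    print(compare_versions(alarm_v1, alarm_v2, [50, 90, 95]))
```

Here the only disagreement is at a level of exactly 90, which is the kind of boundary change that is easy to miss when an update is applied directly to the production application.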

Renewing the foundations

As previously stated, it would be nearly impossible to take a car built during the 1980s and add accessories to make it as safe to drive as the latest models. The same applies to the software infrastructure used in industrial automation applications. Many systems rely on software foundations created during the 1980s and 1990s. Even though they were great foundations at the time, they can’t measure up to today’s security requirements and can’t leverage many of the software technologies created in the last decade, which are still evolving almost weekly.

The recent acceleration has been in software, communications and user interface technologies, so in most plants it is not necessary to renew the entire automation system. Renewing the SCADA and HMI software can bring many immediate advantages in system security and in operational stability, reliability and flexibility. It also improves the use of information, which makes it possible to justify the investment through a return on investment that does not rely exclusively on the need for enhanced security. The expected benefits of renewing the software with truly modern technology are measured not in percentage gains but in multiplicative factors: preventing potential security issues that could greatly impact production, and allowing higher efficiency in the assets managed by those systems.

– Marcos Taccolini is the CEO of Tatsoft LLC; edited by Anisa Samarxhiu, digital project manager, Control Engineering, asamarxhiu@cfemedia.com

Tatsoft LLC is a CSIA member as of 7/23/2015

Key concepts

  • An unintended error can easily occur, and virus programming requires only basic programming skills because, at the lowest level, it is just a matter of “dropping some files” in a folder.
  • Many systems are not actually employing newer and more secure technologies; instead, they create “wrappers” and layers of code and modules around their legacy code and technologies to protect their investment in them.

Consider this

What steps would you take to protect yourself from a cybersecurity attack?
