Throwback Attack: Lessons from the Aurora vulnerability
A government-sponsored test on whether a cyberattack could inflict real-world physical damage has had major repercussions to this day. See eight steps on how to mitigate a potential cyberattack against your industrial control systems (ICSs).
Lessons can be learned from prior cybersecurity efforts, even older ones, as this 2007 demonstration showed. Are you aware of the eight ways to mitigate the Aurora vulnerability?
In 2007, the Department of Homeland Security, working with the Idaho National Laboratory, undertook to demonstrate that a cyberattack could, in fact, cause real-world physical damage. It had already been known that cyberattacks could destroy computer equipment by creating anomalous behavior in hard drives and by overclocking microprocessors; the goal of this test was to determine if the manipulation of various control components could damage or destroy large infrastructure, in this case a 2.25MW, 27-ton diesel generator. The test was a success as it proved it could be done. However, it also determined the vulnerability it revealed could be difficult to mitigate. The test was code named “Aurora” and is known as the “Aurora vulnerability”.
Cybersecurity implications of power generation 101
Power generation is a complex, yet routine operation. A typical generator rotates powerful magnets through heavy copper coils to induce current; the rotating force that turns the armature or the field, as the case may be, can be an internal combustion engine, a steam turbine or a water wheel. The power is then distributed through the power grid to homes and businesses. At either end of the distribution system are circuit breakers. The purpose of these circuit breakers is to protect the wiring and equipment connected to the grid and to protect the grid.
Generators for large installations use circuit breakers controlled by a system of protective relays. Protective relays are a science unto themselves; they monitor what is happening on the wiring connected to the generator and protect it from damage from anomalous conditions on the grid or in the generator.
Among those conditions are overcurrent, undercurrent, undervoltage, phase imbalance, loss of synchronism and ground fault. Each relay has an American National Standards Institute (ANSI) designated device number and must be periodically checked and calibrated to avoid damage to the generator or to the distribution system. When a protective relay senses a condition it is designed to monitor, the relay trips the circuit breaker and disconnects the generator or connected equipment from the line.
The point of these protective systems is to keep the power grid stable and operating by detaching faulted devices from the system, leaving as much of the distribution system operating as possible. In the United States and the rest of North America, the power grid operates at 60 Hz. For any power grid to function properly, all the generating plants connected to it must be synchronized by voltage, phase and frequency.
If any of the generators are out of synchronization, there will be an imbalance in the system, which could damage a generator, the distribution system or other connected equipment. If a generator were to fall out of synchronization, the phase imbalance protective (synchronism) relays would trip and disconnect the generator from the line. If this did not happen, then the tremendous force of the system would attempt to resynchronize the machine.
The generator’s electrical and physical characteristics would naturally resist this, and large torques would be exerted on the driving shaft and high currents produced in the generator windings. These current spikes can and will damage other equipment, such as transformers and motors.
Cybersecurity throwback: The Aurora generator test description
Connecting an unsynchronized generator to a working power grid is dangerous and will lead to a damaged, if not destroyed, generator. What if a threat actor with knowledge of the workings of generating systems were to gain access to protective systems, either physically or virtually? This was the basis for a proof-of-concept test done at the sprawling Idaho National Laboratory, which is run by the Department of Energy.
The lab has its own large power grid and generating capacity it uses to test small nuclear power systems used in submarines, ships and spacecraft. A large diesel generator was acquired from surplus, and a facility was built to house it. A new substation was built, replicating those seen in common practice and including the protective relay systems often used in this type of installation.
To facilitate the main goal of the test, vibration monitoring, overspeed and synchronism trips were disabled. The goal was to produce what is called out-of-phase synchronization (OOPS) by opening and closing the generator’s circuit breaker while it was running and connected to the grid. Protective systems are designed to isolate a malfunctioning generator. For the test, the synchronism and phase imbalance protective relays were reprogrammed to randomly open and close the generator breaker.
Prior to initiating the test, the generator was synchronized to the grid and operating as expected. Upon commencement of the test, the protective systems first checked for synchronism; then they disconnected the generator from the grid. Being unloaded, the generator naturally accelerated. Then, the circuit breaker was closed again, tying the generator back into the grid. The generator was violently thrown back into synchronization by the overwhelming force of the grid; the force of the other connected generators and devices pulled the small mass of the test generator back to synchronism with the 60 Hz grid frequency.
The malicious code that had corrupted the protective system’s functions was less than 130 kb, which is about 30 lines of code. The opening and closing of the generator breaker had lasted for only a fraction of a second: about 15 cycles, or roughly 250 milliseconds at 60 Hz. The code was executed three more times. Each time, the generator was observed to violently jolt and shake; after the second hit, pieces of the machine began to fly off. The large rubber connector that joins the generator to the engine was rapidly deteriorating from the extreme torque exerted on the machine. The generator began to smoke. Its windings had begun to fuse and melt from the high current spikes that accompanied the OOPS.
On the third and fourth execution of the code, the engine and generator essentially tore themselves apart. A postmortem revealed the generator windings were melted and burned. The engine shaft had twisted and struck the inside of the crankcase – the generator was scrap metal. The test lasted three minutes, and the generator was destroyed in under a minute.
The test video is available here: https://www.youtube.com/watch?v=bAWU5aMyAAo
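The arithmetic behind the out-of-phase reclosure can be sketched in a few lines. This is a simplified illustration, not a model of the actual test: it assumes the unloaded generator runs at a constant slip frequency above the grid while the breaker is open, and the slip values below are hypothetical. Only the 60 Hz grid frequency and the roughly 15-cycle open interval come from the description above.

```python
GRID_HZ = 60.0  # North American grid frequency, as stated in the article


def open_interval_seconds(cycles: float, grid_hz: float = GRID_HZ) -> float:
    """Convert a breaker-open interval given in grid cycles to seconds."""
    return cycles / grid_hz


def phase_drift_degrees(slip_hz: float, open_seconds: float) -> float:
    """Angle the unloaded generator gains on the grid while the breaker is open.

    Assumes a constant slip frequency (generator frequency minus grid
    frequency) for the whole interval, which is a simplification: a real
    machine keeps accelerating while it is unloaded.
    """
    return (360.0 * slip_hz * open_seconds) % 360.0


if __name__ == "__main__":
    t_open = open_interval_seconds(15)  # 15 cycles at 60 Hz is 0.25 s
    for slip_hz in (0.5, 1.0, 2.0):     # hypothetical slip values
        angle = phase_drift_degrees(slip_hz, t_open)
        print(f"slip {slip_hz:.1f} Hz for {t_open:.2f} s -> {angle:.0f} degrees out of phase")
```

Even under these rough assumptions, a quarter of a second is long enough for the machine to drift tens of degrees away from the grid, which is why the reclosure is so violent.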
Methods of cyberattack
Connecting a generating source to the electric grid requires frequency, voltage and phase rotation to be matched for a proper and safe connection to the grid. Protective relays monitor each of these parameters to ensure a successful connection; if any of these parameters exceed tolerance, the machine is disconnected to prevent damage to the machine or the grid.
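The permissive logic described above can be sketched as a simple check that allows breaker closure only when voltage, frequency and phase angle are all within tolerance. This is a minimal sketch, not any vendor's relay logic: the names `SyncLimits` and `close_permitted` are hypothetical, the tolerance values are placeholders rather than recommended settings, and phase rotation (sequence) is assumed to have been verified separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncLimits:
    # Placeholder tolerances for illustration only; real settings come
    # from the protection engineer and the equipment manufacturer.
    max_voltage_diff_pct: float = 5.0
    max_slip_hz: float = 0.1
    max_angle_deg: float = 10.0


def angle_difference_deg(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def close_permitted(gen_volts: float, grid_volts: float,
                    gen_hz: float, grid_hz: float,
                    gen_angle_deg: float, grid_angle_deg: float,
                    limits: SyncLimits = SyncLimits()) -> bool:
    """Return True only if all three quantities are within tolerance."""
    voltage_diff_pct = abs(gen_volts - grid_volts) / grid_volts * 100.0
    slip_hz = abs(gen_hz - grid_hz)
    angle_deg = angle_difference_deg(gen_angle_deg, grid_angle_deg)
    return (voltage_diff_pct <= limits.max_voltage_diff_pct
            and slip_hz <= limits.max_slip_hz
            and angle_deg <= limits.max_angle_deg)


if __name__ == "__main__":
    # Closely matched machine: closure allowed.
    print(close_permitted(13800.0, 13800.0, 60.05, 60.0, 5.0, 0.0))
    # Same machine 90 degrees out of phase: closure blocked.
    print(close_permitted(13800.0, 13800.0, 60.05, 60.0, 90.0, 0.0))
```

An Aurora-type attack works by defeating or bypassing exactly this kind of check, so that the breaker closes when the angle test would have refused.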
To compromise these systems, an attacker would have to breach several layers of security and have a good working knowledge of the system to target the correct breaker. The attacker would also need physical access to the substation or to have compromised the communications systems connecting the protective relays to the supervisory control and data acquisition (SCADA) system. The various alarms would also need to be disabled so as not to alert operators to the problem. The several layers of security that need to be breached would ideally be password-protected at each level. Password protection is a common oversight and vulnerability.
The root cause of the Aurora vulnerability is poor physical security and poor cybersecurity. Designers and operators must consider cybersecurity at the outset and plan for all possible modes of attack, including physical attacks on the facility. The Aurora vulnerability, if not mitigated, can extensively damage much of the equipment connected to the grid, and could cause extended power outages. The attack does not have to happen at the generator or the substation; it can be initiated from anywhere.
As higher-value targets harden their defenses, critical infrastructure and industrial control systems (ICSs) are becoming prime targets. Most attacks are conducted remotely. However, poorly secured facilities also can provide opportunity for on-the-ground physical attacks.
Direct hacking is a mode of attack that physically accesses protective relay systems and reprograms the devices to effect the anomaly – directly hacking the protective relay. This requires physical access and both power system and hacking knowledge. This implies the attack would be performed by an insider or someone who has breached the physical security of the facility.
Anyone with physical access to the substation could open and close the breaker manually and achieve the same result – manual switching bypasses any automatic control or protective systems. This attack falls under the “disgruntled employee” or malicious vandalism category.
Compromised communication channels are a common access method for remotely attacking a wide variety of control systems. In fact, this is the most common attack vector, given that most threat actors are physically located offshore. As with any other attack, the chief culprits for allowing a successful breach are poor cyber hygiene, poor password policy and poor network architecture and protection. The human element also plays a large part in the mitigation of this risk and should not be ignored.
A third, and increasingly common, attack vector is infiltrating the supply chain. If an attacker can access the protective systems during manufacturing or at any point prior to installation, embedded code can be injected into the device that will trigger on a specific date or event. This attack has been seen in recent events where software integrity has been compromised while in the supply chain between vendor and end users (SolarWinds, for example).
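One common way to detect tampering between the factory and the customer is to compare a cryptographic digest of the received firmware or software image against a value the vendor supplies through a separate, trusted channel. The following is a minimal sketch under that assumption; the article does not describe a specific method, and the function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of a file as a lowercase hex string."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large images do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def firmware_matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the vendor-published value.

    The expected value is assumed to arrive out of band, not alongside
    the image itself.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A check like this only catches changes made after the vendor computed the digest. It does not help if the vendor's own build process was compromised, as happened in the SolarWinds case.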
Eight ways to mitigate the Aurora vulnerability
This will sound like a common refrain, but mitigating this vulnerability is similar, if not identical, to protecting any other ICS. These measures require an investment in time and money. If executed properly, they can make a facility practically impregnable. Levels of defensive measures, referred to as defense in depth, frustrate determined attackers by physically blocking, obfuscating and misdirecting their efforts. Eventually, these bad actors will give up and move on to easier targets.
By following proper security measures, the Aurora vulnerability can be mitigated. These eight measures are a good baseline.
- Audit communication systems. It is important to know how the control network is set up and where any possible breaches can occur. Think like a hacker – they operate like burglars who walk down the street and check doorknobs – and shut down any unused ports or extraneous communication channels. The point of the audit is to determine which systems and which staff have access to critical systems communication networks, including SCADA. Know what communication channels are actually in use and which can be eliminated.
- Institute algorithms that monitor and supervise protective relay and breaker operation. Unusual opening and closing of the relays or breaker may follow a recognizable pattern and be detected and mitigated before an attack is executed.
- Encrypt and protect communication channels. Use a firewall with virtual private network (VPN) capability for any outside access requirements. Institute a secure and encrypted (and unadvertised) backup communications channel for use if the primary channel is compromised.
- Eliminate any cross connections with office or corporate networks. There should be no connection between SCADA or energy management system networks and the facility’s office network, which is likely connected to the internet. This is a grave vulnerability because of attacks that start with a “phishing” email; 85% of all attacks start with a phishing email. Also, attacks can be “inside jobs” by malicious or disgruntled employees.
- Password policies should be established and enforced. Change the default passwords on the protective relays. Use long and strong passwords and hierarchical access controls. Require periodic password changes. Use multifactor authentication (MFA) for critical system access. Treat each system as a unique security domain, and do not use the same password for all systems.
- Institute a policy of least privilege for all staff to limit access to critical systems. Consider schematics, product manuals, diagrams, flowcharts and any other detailed system information as confidential and limit access to those staff on a need-to-know basis. Compartmentalize system knowledge and the security methods used to secure each domain.
- Check incoming equipment against the vendor’s specifications. This helps users determine if a supply chain attack has occurred. Work with the vendor to institute methods that can determine if the device or software was tampered with between the factory and the customer.
- Audit and strengthen physical security. Threat actors who can infiltrate a facility and physically access equipment can perpetrate an enormous amount of damage.
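The second measure above, supervising breaker operation for unusual patterns, can be sketched as a sliding-window counter that flags too many open or close operations in a short period. This is a minimal illustration, not a production protection scheme: the class name `BreakerOperationMonitor` and the thresholds are hypothetical, and timestamps are assumed to arrive in non-decreasing order from a trusted time source.

```python
from collections import deque


class BreakerOperationMonitor:
    """Flag rapid breaker cycling within a sliding time window."""

    def __init__(self, max_operations: int = 2, window_seconds: float = 10.0):
        # Placeholder thresholds; real values depend on the installation.
        self.max_operations = max_operations
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one open or close operation.

        Returns True when the number of operations inside the window
        exceeds the allowed maximum, meaning the pattern is suspicious.
        """
        self._timestamps.append(timestamp)
        # Drop operations that have aged out of the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_operations


if __name__ == "__main__":
    monitor = BreakerOperationMonitor()
    for t in (0.0, 0.25, 0.5):  # open, close, open within half a second
        if monitor.record(t):
            print(f"t={t:.2f}s: unusual breaker cycling, block reclose and alarm")
```

In practice a flag like this would feed an alarm or a reclose-blocking timer rather than a print statement, and it would need to run somewhere the attacker cannot simply disable it along with the relay.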
The weakest link in any cybersecurity scheme is the human element, so automate as much of the process as possible. The starting, synchronizing and connecting of generating equipment can easily be automated, and the initiation of these processes can be substantially automated. Modern protective systems perform their functions well with reliability levels exceeding a human’s – they are not distracted or annoyed or aggrieved, and they work 24/7 without complaint.
Cautionary tales of cybersecurity breaches
In 2009, the first purpose-built digital weapon was used to destroy a third of the uranium enrichment centrifuges at the air-gapped Natanz laboratory in Iran. The Stuxnet worm, developed by the NSA and Israeli cyber warriors, was smuggled into the facility on a contractor’s laptop. The worm infected the centrifuge control systems by specifically targeting the programmable logic controllers (PLCs) that controlled them. This was the first known use of a digital weapon to destroy physical equipment in the real world.
In 2016, Russia’s GRU military intelligence agency perpetrated an attack on the Ukrainian power grid. That attack started with a phishing email, which unleashed a script that quickly compromised the grid, mostly through unsecured or poorly secured communications channels. The attack caused widespread outages and collateral damage. One often-overlooked item was the destruction caused by the attack: The worm targeted key pieces of equipment such as PLCs and PCs used for process control and power generation. Several generators were damaged or destroyed using Aurora-type attacks; transformers and substations were damaged using similar techniques.
Learn from cybersecurity mistakes, demonstrations
The Aurora vulnerability sent shockwaves throughout the cybersecurity and power industries when it was first inadvertently revealed in 2009 in a Freedom of Information Act (FOIA) request regarding a different program that happened to be called Project Aurora.
The vulnerability can be mitigated, and much progress has been made since 2007 in the protection of critical systems. However, many legacy systems still exist, and operators succumb to the belief that this is a solution in search of a problem. The problem exists, and the means and methods for preventing it also exist. With the chaos that a large and prolonged blackout would produce, the issue requires a sober examination of the facts and for responsible parties to act.
Daniel E. Capano is senior project manager, Gannett Fleming Engineers and Architects, a CFE Media content partner and is on the Control Engineering Editorial Advisory Board. Edited by Chris Vavra, web content manager, Control Engineering, CFE Media, email@example.com.