Why manufacturing software should be tested before updates
Engineering and IT Insight: With the way Microsoft now updates its Windows 10 operating system, the ball is out of the user's court. Instead, Microsoft has installed an automated ball machine that fires when it wants, even if the user isn't ready. This new update scheme may wreak havoc on many mission-critical systems. Read on to learn why you should test and why you should disable automatic updates.
It is vitally important to test any updates or changes before committing them to production use, and there can be consequences for not completely testing.
There have been so many reports of an untested update crashing a critical system that it is common sense to test before committing an update. The ISA Industrial Automation Cyber System Security standards include a 70-page technical report (ISA-TR62443-2-3, Patch Management in the IACS Environment) on patch management and the need to test patches in control systems.
The question is usually not should you test, but how much should you test. The typical manufacturing application relies on tens to hundreds of underlying libraries, processes, services, and operating system elements.
Often, the entire list of required elements is not known, and software vendors are rarely required to state which operating system version and which critical SQL, Microsoft .NET, and Java library versions are needed. The typical manufacturing facility uses about 50 to 100 applications, ranging from simple spreadsheets to a multimillion-dollar distributed control system (DCS) and manufacturing execution system (MES).
Each of these applications relies on tens to hundreds of underlying elements. Clearly, there is a lot to test in an update. In the case described here, the problem came not from failing to test the big applications, but from failing to test the small applications, especially those that are not mission-critical per se, but are important and were always assumed to work.
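One way to bound the question of how much to test is to keep an inventory that maps each application to the underlying elements it relies on, and then ask which applications an update could touch. The following is a minimal Python sketch under that assumption. The application names, element names, and the apps_to_retest helper are hypothetical illustrations, not taken from any vendor tool.

```python
from typing import Dict, List, Set


def apps_to_retest(inventory: Dict[str, Set[str]], touched: Set[str]) -> List[str]:
    """Return applications that rely on at least one element an update touches.

    Applications with an empty (unknown) dependency set are always included,
    because an undocumented dependency cannot be ruled out.
    """
    selected = []
    for app, elements in inventory.items():
        if not elements or elements & touched:
            selected.append(app)
    return sorted(selected)


# Hypothetical inventory: application -> underlying elements it relies on.
inventory = {
    "MES client": {".NET", "SQL client"},
    "Label printing": {"print spooler"},
    "Shift report spreadsheet": {"file associations"},
    "Legacy data logger": set(),  # dependencies not documented by the vendor
}

# Elements a hypothetical update is believed to touch.
touched = {"print spooler", "file associations"}

print(apps_to_retest(inventory, touched))
```

Treating an unknown dependency list as "always retest" is a deliberate choice in this sketch: it keeps the small, poorly documented applications on the test list instead of assuming they will keep working.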
This isn't your father's OS
The problem occurred on the Microsoft Windows 10 Version 1511 update. With Microsoft Windows 7, 8, and 8.1, it was easy to set up a system so it wouldn't automatically update. Windows 10 changed the pattern, and not for the better. Microsoft considers Windows 10 a service, so Microsoft decides when to update, not the user. The concept of "patch Tuesday," when all patches were released on the second Tuesday of the month, is also gone. Patches can be pushed out at any time. The Windows 10 Version 1511 update was massive at 3 GB and seemed to touch almost every part of the system.
Some Windows 10 editions allow updates to be deferred for several months, but security updates are excluded from the deferral and will still install. There are various tricks that may be used to defer or stop updates on systems that should not be changed, but there is no reason to believe that they are permanent fixes. For Windows 10 Pro and Enterprise editions, a group policy can be set that sends notifications about new updates without automatically installing them, although security updates will still install automatically. No one in the user community has found a "stop updating" solution that provides a long-term answer, which is bad news for mission-critical manufacturing systems. Moreover, even security updates, which can't be easily stopped, can cause problems.
On Dec. 30, 2015, Microsoft pushed out a security patch that disabled Skype, HP scanner software, and various other systems. A security release in June 2015 disabled some Nvidia graphics card drivers and multi-monitor support.
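Before relying on a group policy, it helps to confirm what a given machine actually has configured. The following Python sketch reads the two registry values that the "Configure Automatic Updates" policy is commonly documented to write (NoAutoUpdate and AUOptions under the Policies\Microsoft\Windows\WindowsUpdate\AU key). The value meanings shown are assumptions to verify against Microsoft's documentation for the Windows build in use, and the function names are hypothetical. The sketch only reports the configured policy; it does not stop any update.

```python
import sys
from typing import Optional, Tuple

# Registry location commonly written by the "Configure Automatic Updates" policy.
AU_KEY = r"SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU"

# Assumed meanings of AUOptions; verify for the Windows build in use.
AU_OPTIONS = {
    2: "notify before download",
    3: "download automatically, notify before install",
    4: "download automatically, install on schedule",
    5: "local administrator chooses the setting",
}


def describe_au_policy(no_auto_update: Optional[int], au_options: Optional[int]) -> str:
    """Translate the two policy values into a human-readable description."""
    if no_auto_update is None and au_options is None:
        return "no policy set (Windows default behavior applies)"
    if no_auto_update == 1:
        return "automatic updates disabled by policy"
    return AU_OPTIONS.get(au_options, f"unrecognized AUOptions value: {au_options}")


def read_au_policy() -> Tuple[Optional[int], Optional[int]]:
    """Read NoAutoUpdate and AUOptions from the local registry (Windows only)."""
    import winreg  # imported here so the module still loads on non-Windows hosts

    def value(key, name):
        try:
            return winreg.QueryValueEx(key, name)[0]
        except FileNotFoundError:
            return None  # value not present under the key

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, AU_KEY) as key:
            return value(key, "NoAutoUpdate"), value(key, "AUOptions")
    except FileNotFoundError:
        return None, None  # policy key does not exist


if __name__ == "__main__" and sys.platform == "win32":
    print(describe_au_policy(*read_au_policy()))
```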
Importance of testing
Microsoft Windows 10 is used for office applications, document management, project management, program development, test machines, and other applications. The systems are set up to delay installing the updates, or to delay rebooting to install the downloaded updates. We carefully tested the 1511 update on a few systems and ran quick "confidence" tests to make sure that everything worked. These were quicker and less comprehensive than the full set of application tests used when Microsoft Windows 10 was first installed. After all, it was just supposed to be a patch, and the update message read "All your files are exactly where you left them."
After the test systems correctly ran the confidence tests, we allowed the other machines to be updated. That is when we discovered that the confidence tests didn't cover the "small" applications and all of the supporting services. The update changed file associations, removed shortcuts, removed applications from the start menu, changed the printer options, even crashed Microsoft Windows Explorer on one machine, and caused a set of problems that collectively took days to resolve.
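A confidence test that would have caught these changes can be as simple as recording the "small" settings before the update and comparing them afterward. The following Python sketch shows only the comparison step. How the snapshots are collected (file associations, shortcuts, printer settings) is left out, and the setting names and the diff_snapshots helper are hypothetical.

```python
from typing import Dict, List


def diff_snapshots(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two {setting name: value} snapshots taken before and after an update."""
    return {
        "removed": sorted(k for k in before if k not in after),
        "changed": sorted(k for k in before if k in after and before[k] != after[k]),
        "added": sorted(k for k in after if k not in before),
    }


# Hypothetical snapshots of settings that are usually assumed to "just work".
before = {
    "assoc:.pdf": "PDF viewer A",
    "shortcut:Desktop/MES Client.lnk": "present",
    "printer:default": "Label printer 1",
}
after = {
    "assoc:.pdf": "Browser",
    "printer:default": "Label printer 1",
}

report = diff_snapshots(before, after)
for kind, names in report.items():
    for name in names:
        print(f"{kind}: {name}")
```

Any entry under "removed" or "changed" is a reason to hold the update back from the remaining machines until the cause is understood.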
How much was tested, critical lesson
All of the problems seen (except for the Windows Explorer crash) were elements tested in the initial install of the applications and on major system updates. Unfortunately, this 3 GB patch was considered a minor update. Fortunately, the systems it affected were not mission-critical, just annoyingly difficult to fix.
The lesson learned is that if you are using Microsoft Windows 10 in a manufacturing environment for mission-critical, or even just important, systems, automatic updates must be disabled. This requires the Microsoft Windows 10 Enterprise edition, group policies, and Windows Server Update Services (WSUS). WSUS allows full control over the internal distribution of updates using existing management solutions such as System Center Configuration Manager. Check with your vendors to ensure that they have tested their Windows 10 systems on the Enterprise edition. The most important point, however, is to ensure that the operations group controls the updates for the mission-critical, and mission-important, systems. These systems should never be updated using the same rules as the business systems.
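The rule that operations, not the business IT schedule, decides when mission-critical systems are updated can be written down as an explicit gate. The following Python sketch is one possible way to express it. The Update record, the system class labels, and the may_deploy function are hypothetical and do not correspond to any WSUS or System Center Configuration Manager API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    name: str
    tested_on_reference_system: bool  # full application tests passed
    approved_by_operations: bool      # operations group signed off


def may_deploy(update: Update, system_class: str) -> bool:
    """Decide whether an update may be released to a class of systems."""
    if system_class == "business":
        # Business systems follow the normal IT rules, which are out of scope here.
        return True
    if system_class in ("mission-critical", "mission-important"):
        # Operations systems need both a passed test and an explicit approval.
        return update.tested_on_reference_system and update.approved_by_operations
    raise ValueError(f"unknown system class: {system_class}")


patch = Update("hypothetical cumulative update",
               tested_on_reference_system=True,
               approved_by_operations=False)

print(may_deploy(patch, "business"))
print(may_deploy(patch, "mission-critical"))
```

Raising an error for an unknown system class is intentional in this sketch: a machine that has not been classified should not silently fall under the business rules.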
Patch management advice
Additional information on ISA-TR62443-2-3, Patch Management in the IACS Environment: The technical report covers the installation of patches, software updates, software upgrades, firmware upgrades, service packs, hotfixes, basic input/output system (BIOS) updates, and other digital electronic program updates. While focused on security-related patches, the recommended practices cover all patches to industrial automation and control systems (IACS). The report defines recommended practices in patch management for asset owners, system integrators, and IACS product suppliers. It also describes the effect of poor patch management on the reliability and operability of an IACS.
Dennis Brandl is president of BR&L Consulting in Cary, N.C., www.brlconsulting.com. His firm focuses on manufacturing IT. Contact him at firstname.lastname@example.org. Edited by Mark T. Hoske, content manager, Control Engineering, CFE Media, email@example.com.
Microsoft Windows 10 updates can be deferred, but security patches cannot.
Untested patches can have unintended consequences for mission-critical systems.
Disable automatic updates to allow testing.
OS updates are becoming more and more cumbersome and in some cases are being forced upon the user. How will you ensure mission-critical systems are not compromised?
This posted version contains "Patch management advice" and other information that appears in the print/digital edition issue of Control Engineering.