Programming: ‘Self-healing’ software diagnoses own problems

By Control Engineering Staff October 12, 2006

In what is no surprise to programmers, IT analyst firm Enterprise Management Associates estimates that determining the cause of a software glitch can take 50 to 80% of an IT staff’s time, while only 15 to 20% of that time is spent repairing it. To address that discrepancy, IBM has announced new software that helps developers and solution providers build self-healing capabilities into their applications, features that could save programmers up to 80% of the time previously spent resolving issues manually. Created through the collaboration of IBM research and development laboratories in India, Japan, Toronto, and the U.S., the software helps recognize warning signs to head off system crashes and performance bottlenecks.

The software, based on open industry standards, helps developers capture and pinpoint the root cause of problems and create a customized catalog of problem symptoms, so that recurring problems can be fixed based on historical knowledge. This symptoms catalog is essentially an automated “cheat-sheet” that operations staff can consult if these problems come up when deploying and running the application, saving time and money. Additional symptoms and solutions can be added as new knowledge about the causes of problems is gained, making the catalog steadily more far-reaching and useful.
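The idea of a symptom catalog can be illustrated with a minimal Python sketch. It is not IBM’s implementation or the toolkit’s API: the `Symptom` and `SymptomCatalog` names, the use of regular expressions for matching, and the sample log messages are all assumptions made for illustration only.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Symptom:
    """One hypothetical catalog entry: a pattern to recognize and a suggested fix."""
    name: str
    pattern: str          # regular expression matched against a log message
    recommendation: str   # what operations staff should try


@dataclass
class SymptomCatalog:
    """Illustrative catalog that grows as new problems are understood."""
    symptoms: list = field(default_factory=list)

    def add(self, symptom):
        # New knowledge is recorded by appending another entry.
        self.symptoms.append(symptom)

    def diagnose(self, message):
        # Return every known symptom whose pattern occurs in the message.
        return [s for s in self.symptoms if re.search(s.pattern, message)]


catalog = SymptomCatalog()
catalog.add(Symptom(
    name="connection_pool_exhausted",
    pattern=r"no free connections",
    recommendation="Increase the pool size or look for connection leaks.",
))
catalog.add(Symptom(
    name="disk_full",
    pattern=r"No space left on device",
    recommendation="Free disk space or rotate logs.",
))

# Look up a (made-up) log line against the catalog.
for match in catalog.diagnose("ERROR db: no free connections after 30s"):
    print(match.name, "->", match.recommendation)
```

A message that matches no entry returns an empty list, which in this sketch would be the cue to investigate manually and then add a new entry to the catalog.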

The software is part of the IBM Build to Manage Toolkit for Problem Determination, which also contains tools, tutorials, and support to help developers quickly build problem determination management capabilities into their applications, without being management experts. Problem determination components found in the toolkit are drawn from IBM’s Tivoli, WebSphere, and Rational software portfolios.

The toolkit is based on the OASIS Web Services Distributed Management Event Format (WSDM WEF) industry standard. In addition to making technology easier to manage, WSDM helps companies build out service-oriented architecture, or SOA, which is a way of reusing a company’s existing technology to more closely align with business goals, resulting in greater efficiencies, cost savings, and productivity. IBM has contributed several components of the toolkit, including the new symptom catalog authoring tools and WSDM WEF software libraries, to the Eclipse Test and Performance Tools Platform and the Apache Muse open source project.

The software is part of IBM’s cross-industry autonomic computing initiative, which has worked over the past five years to radically simplify IT management and the underlying infrastructure by automating processes and building intelligence into the systems themselves. More than 475 self-managing autonomic features are now found in 75 IBM products. IBM’s Autonomic Computing Technology Center in Yamato, Japan, helped pioneer the development of the Build to Manage Toolkit for Problem Determination. Opened in July 2005, the Yamato center employs 50 engineers and is part of IBM’s network of 55 Software Development and Research Labs worldwide.

According to Akira Bannai, Chief Fellow of Toshiba Solutions Corp., “We believe IBM’s new software enables a variety of hardware, operating systems, and software to easily adapt to autonomic computing based problem determination technology. Toshiba Solutions’ cluster software, ClusterPerfect EX, now supports Common Base Event and symptom database technology, providing both high availability and quick problem determination capability together with our system management solutions.”

The IBM Build to Manage Toolkit for Problem Determination will be available in the fourth quarter of 2006.