Programming: ‘Self-healing’ software diagnoses own problems


In what is no surprise to programmers, IT analyst firm Enterprise Management Associates estimates that determining the cause of a software glitch can take 50 to 80% of an IT staff's time, while only 15 to 20% of that time goes to repairing it. To address that imbalance, IBM has announced new software that helps developers and solution providers build self-healing capabilities into their applications, features that could save programmers up to 80% of the time previously spent resolving issues manually. Created jointly by IBM research and development laboratories in India, Japan, Toronto, and the U.S., the software helps recognize warning signs in order to head off system crashes and performance bottlenecks.

The software, based on open industry standards, helps developers capture and pinpoint the root cause of problems. Developers can then create a customized catalog of problem symptoms, so that recurring problems can be fixed based on historical knowledge. This symptoms catalog is essentially an automated "cheat-sheet" that operations staff can consult if the same problems come up when deploying and running the application, saving time and money. Additional symptoms and solutions can be added as more is learned about the causes of problems, so the catalog becomes more comprehensive and useful over time.
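The idea of a symptoms catalog can be illustrated with a short sketch. The Python below is not IBM's implementation and does not use the toolkit's actual catalog format or APIs; the class names, the regular-expression matching, and the sample symptoms are all hypothetical, chosen only to show how recorded symptoms might be matched against a new event message and paired with a remedy learned from earlier incidents.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Symptom:
    """One hypothetical catalog entry: a recognizable pattern and its known fix."""
    name: str
    pattern: str         # regular expression matched against an event message
    recommendation: str  # remedy recorded from earlier incidents


@dataclass
class SymptomCatalog:
    """Illustrative catalog that grows as new symptoms are learned."""
    symptoms: list = field(default_factory=list)

    def add(self, symptom: Symptom) -> None:
        # New knowledge is appended, so later lookups can benefit from it.
        self.symptoms.append(symptom)

    def diagnose(self, message: str) -> list:
        # Return every known symptom whose pattern appears in the message.
        return [s for s in self.symptoms if re.search(s.pattern, message)]


catalog = SymptomCatalog()
catalog.add(Symptom(
    name="connection-pool-exhausted",
    pattern=r"pool .* exhausted",
    recommendation="Increase the pool size or look for leaked connections.",
))
catalog.add(Symptom(
    name="disk-nearly-full",
    pattern=r"disk usage (9\d|100)%",
    recommendation="Archive or rotate old log files.",
))

event = "WARN db: connection pool 'orders' exhausted after 30s"
for match in catalog.diagnose(event):
    print(f"{match.name}: {match.recommendation}")
```

In this sketch, operations staff would call `diagnose` with a message taken from a log or event stream, and developers would call `add` whenever a new root cause is understood. A real catalog would match on structured event fields rather than free text.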

The software is part of the IBM Build to Manage Toolkit for Problem Determination, which also contains tools, tutorials, and support to help developers quickly build problem determination management capabilities into their applications, without being management experts. Problem determination components found in the toolkit are drawn from IBM's Tivoli, WebSphere, and Rational software portfolios.

The toolkit is based on the OASIS Web Services Distributed Management Event Format (WSDM WEF) industry standard. In addition to making technology easier to manage, WSDM helps companies build out service-oriented architecture, or SOA, which is a way of reusing a company's existing technology to align it more closely with business goals, resulting in greater efficiency, cost savings, and productivity. IBM has contributed several components of the toolkit, including the new symptom catalog authoring tools and WSDM WEF software libraries, to the Eclipse Test and Performance Tools Platform and the Apache Muse open source project.

The software is part of IBM's cross-industry autonomic computing initiative, which has worked over the past five years to radically simplify IT management and the underlying infrastructure by automating processes and building intelligence into the systems themselves. More than 475 self-managing autonomic features are built into 75 IBM products. IBM's Autonomic Computing Technology Center in Yamato, Japan, helped pioneer the development of the Build to Manage Toolkit for Problem Determination. Opened in July 2005, the Yamato center employs 50 engineers and is part of IBM's network of 55 Software Development and Research Labs worldwide.

According to Akira Bannai, Chief Fellow of Toshiba Solutions Corp., "We believe IBM's new software enables a variety of hardware, operating systems, and software to easily adapt to autonomic computing based problem determination technology. Toshiba Solutions' cluster software, ClusterPerfect EX, now supports Common Base Event and symptom database technology, providing both high availability and quick problem determination capability together with our system management solutions."

The IBM Build to Manage Toolkit for Problem Determination will be available in the fourth quarter of 2006.
