Troubleshooting with artificial intelligence

Stan Matwin is using artificial intelligence tools to detect the risk of cascading breakdowns across multiple technologies vital to everyday life.

by Tony Martins

Zap, your power goes out. Pfft, your cell phone network goes down with it. Zzzt, your Internet service shuts down. Clunk, electric-powered gas pumps stop pumping. You’re in the dark, feeling isolated and wondering what might fail next.

You are experiencing a “cascading impact” systemic breakdown, a risk to which our communities are increasingly exposed in an age of interconnected technologies that support the day-to-day systems and services we rely on.

How severe are such risks and how can we mitigate them? Stan Matwin and his colleagues are using artificial intelligence (AI) tools to explore these questions, aiming to eventually develop an automated risk management system.

“Our job is to build a system based on AI technologies that will detect dependencies and interrelationships between enmeshed systems and run various failure scenarios to predict what could go wrong,” explains Matwin.

“Enmeshed systems” are prevalent in our current environment, notes Matwin. “There is a lot of implicit interdependency between systems we all rely on,” he explains. “They are tightly interconnected, often in a feedback way. If one such system fails, even in part, other systems will be impacted and may fail as well, further deteriorating or knocking down the system that started it all. It’s kind of a snowball effect of failures.”
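
To make that snowball effect concrete, here is a minimal sketch of how a failure might propagate through a web of interdependent systems. This is not the project’s actual software; the systems, their dependencies and the all-or-nothing propagation rule are illustrative assumptions.

```python
# Minimal sketch of cascading failure in a dependency graph.
# The systems and their dependencies are hypothetical examples.
# Note the feedback loop: "power" and "telecom" depend on each
# other, which is what turns one outage into a snowball.
DEPENDS_ON = {
    "power":      {"telecom"},           # grid control traffic rides the network
    "telecom":    {"power"},             # cell towers need electricity
    "internet":   {"power", "telecom"},
    "gas_pumps":  {"power"},
    "flashlight": set(),                 # standalone: no dependencies
}

def cascade(initial_failures):
    """Propagate failures to a fixed point: a system fails as soon as
    any of its dependencies has failed (a deliberately pessimistic rule)."""
    failed = set(initial_failures)
    changed = True
    while changed:
        changed = False
        for system, deps in DEPENDS_ON.items():
            if system not in failed and deps & failed:
                failed.add(system)
                changed = True
    return failed

# One failure scenario: the power grid goes down first.
print(sorted(cascade({"power"})))
# -> ['gas_pumps', 'internet', 'power', 'telecom']  (the flashlight survives)
```

A real risk management system would use partial and probabilistic failure models rather than this all-or-nothing rule, but even the toy version shows how one outage can take nearly everything else with it.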

A professor of computer science at the School of Information Technology and Engineering (SITE), Matwin leads the University of Ottawa contingent on the Bell Canada ARMS initiative sponsored by Defence Research and Development Canada (DRDC). Also contributing to the project from uOttawa is Dr. Jelber Sayyad Shirabad, a senior research associate at SITE.

Although the study is preliminary, “the conclusions will be used to bid next year for a much larger project, where the results will be expected to go into deployment,” says Matwin.

For a real-world example of a domino-effect breakdown in an enmeshed system, Matwin points to the August 2003 power blackout in northeastern North America, in which an estimated 10 million Ontarians and 45 million Americans in eight states were plunged into darkness.

“It seems that not much has been done to fix the systemic problems,” Matwin says of our integrated North American power grid. “Lack of redundancy is the number one vulnerability; of course, it would cost billions to build in this redundancy.” Redundancy is a key term in the design and risk analysis of interrelated systems. “Critical parts should be duplicated,” says Matwin, “so that when one of them fails, the ‘double’ takes over the function smoothly.”
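
As a toy illustration of that design principle (again an assumption-laden sketch, not the ARMS project’s code), a duplicated critical component might be modeled like this: when the primary unit fails, its “double” answers instead, and whatever depends on the pair never sees an outage. The unit names are invented for the example.

```python
# Toy model of redundancy: a critical function backed by a primary
# unit and a hot standby. Names and failure behavior are illustrative.
class Unit:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def serve(self):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"served by {self.name}"

class RedundantPair:
    """Duplicate a critical part so the 'double' takes over smoothly."""
    def __init__(self, primary, standby):
        self.primary, self.standby = primary, standby

    def serve(self):
        for unit in (self.primary, self.standby):
            try:
                return unit.serve()
            except RuntimeError:
                continue  # fail over to the next unit
        raise RuntimeError("both units down: no redundancy left")

grid_control = RedundantPair(Unit("controller-A"), Unit("controller-B"))
grid_control.primary.healthy = False   # the primary fails...
print(grid_control.serve())            # ...and the standby answers: "served by controller-B"
```

The pattern is the same from spare tires to grid controllers; what changes is only the cost of keeping the “double” ready.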

“Redundancy was built into the computers running the Apollo moon mission, and it never failed,” continues Matwin. And for a more mundane example, “We all practice redundancy driving around with a spare tire.... So for enmeshed systems, where so much depends on reliability and uninterrupted operation, redundancy is an important factor.”

But redundancy is not the only concern of the study, whose findings will include models of the potential damage that could arise from system design flaws. The AI technologies applied will be evaluated on their ability to model the impact of cascading risk on information, networks, systems and operations. Since detecting system vulnerabilities, risks and interdependencies (“enmeshments”) requires knowledge and understanding, it makes perfect sense to use staple AI methods (knowledge-based techniques, models and knowledge compilation) to detect and analyze such risks and dependencies.
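
One simple way to picture what running failure scenarios could look like, under the same illustrative assumptions as the earlier sketch, is a brute-force sweep: fail each system in turn and rank the systems by the size of the cascade each one triggers. Again, this is a hypothetical example, not the study’s actual method.

```python
# Hypothetical "what could go wrong" sweep: fail each system in turn
# and rank systems by blast radius (how many systems the cascade claims).
DEPENDS_ON = {
    "power":     {"telecom"},
    "telecom":   {"power"},
    "internet":  {"power", "telecom"},
    "gas_pumps": {"power"},
}

def cascade(initial_failures):
    failed = set(initial_failures)
    frontier = set(initial_failures)
    while frontier:
        frontier = {s for s, deps in DEPENDS_ON.items()
                    if s not in failed and deps & failed}
        failed |= frontier
    return failed

for system in sorted(DEPENDS_ON, key=lambda s: -len(cascade({s}))):
    print(f"{system}: failure cascades to {len(cascade({system}))} system(s)")
# power and telecom top the ranking: their mutual dependency makes
# each of them a single point of failure for nearly everything else.
```

In a real enmeshed system the dependency edges are implicit rather than written down, which is precisely where the knowledge-based AI techniques would come in: discovering those edges before any scenario can be run against them.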

Research using AI has fascinated Matwin since his days in graduate school. While the field has failed to live up to some promises made by early researchers (e.g., 10 years ago one of the world’s leading AI labs was predicting it would build an intelligence on par with that of a two-year-old child), Matwin is undeterred. In 2009, he published a position paper offering his key insights from the first 50 years of AI as well as five “theses,” or criteria, for effective AI research: it must be practical, embedded, empirical, mathematically sound and scrutinized.

And according to Matwin, the current study aligns with each of these five theses.

“I view our work on this as totally practical, and whatever we do will be embedded in decision-making systems,” Matwin says. “The analysis and warnings that our solution supplies will, of course, be empirically tested. The methods we use are, for the most part, already based on a solid mathematical foundation. And the very role of our solution will be to scrutinize other systems for the drastically negative social effects their failure would bring about.”

The results of the study will be published in a research report that “defines the elements necessary to move towards a proof-of-concept demonstrator,” says Matwin.

In the meantime, Matwin would likely advise us to keep fresh backup batteries for our flashlights on hand, dig out our grandmother’s old-fashioned landline phone and make sure we have plenty of air in our spare tires. 