CMU-CS-21-146
Computer Science Department
School of Computer Science, Carnegie Mellon University




Crowd-Sourced Evaluation of Explainable AI Techniques with Games

Mayank Jain

M.S. Thesis

December 2021



Keywords: XAI, Machine Learning, LIME, Grad-CAM, GWAP

Image classification is a fairly mature domain in Machine Learning (ML) today. From the automobile industry to retail supply chains, image recognition and classification enable industry processes everywhere. The one big drawback to using ML in many industries is the black-box nature of ML algorithms. Historically, it has been almost impossible to figure out why a neural net assigns a particular image to a given class.

Explainable AI (XAI), on the other hand, is an emerging domain in ML that aims to give people more insight into why an ML algorithm makes a particular decision. This allows for more transparency into AI-made decisions, in turn allowing ML to enter industries like healthcare and criminal justice, where a black box with 99% accuracy is just not enough. Recently, many XAI techniques have been proposed to help explain image classification specifically, but few have been evaluated beyond anecdotal evidence. It usually just comes down to the authors saying that the explanations "look good". Many of these XAI techniques are designed for people with the intuition of a data scientist or ML engineer, and there are very few ways to evaluate them for non-experts.

In this work, we present a novel method for human evaluation of XAI techniques. We do this via a Game With a Purpose (GWAP) called Eye into AI that allows researchers to crowd-source human evaluations of XAI techniques focused on explaining deep learning models trained for image classification. In addition, we use this game to evaluate LIME, Grad-CAM, and Feature Visualizations, the first evaluation of its kind. We find that our game is able to provide a clear ranking of these XAI techniques, and that it provides meaningful insights into the kinds of use cases in which each would be most useful.

41 pages

Thesis Committee:
Adam Perer (Chair)
Kenneth J. Holstein

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

