CMU-CS-23-141 Computer Science Department School of Computer Science, Carnegie Mellon University
Accessible Descriptions for Surprising Clusters in Scatterplots Ihita Mandal M.S. Thesis December 2023
CMU-CS-23-141.pdf
The inaccessibility of data visualizations is a growing concern for those who create them. Some attempts have been made to improve resources for practitioners or to build more accessible visualization tooling and techniques, for example through screen readers. Despite these attempts, fundamental accessibility issues remain. The most common approach to addressing them is to use NLP and machine learning to automatically describe charts or produce analytical insights. While this growing body of automated description work holds promise, many issues in the process remain outstanding. In this project, I aim to provide insight into certain aspects of a chart, particularly clusters, that may lend themselves to a useful description. Specifically, I provide details on extracting information about surprising clusters, those that deviate from the overall trend of the points, from scatterplots, a popular form of representing data. My proposed system outputs a description outlining the general trend of a chart as well as the specific location of a cluster in it, providing a novel approach and an improvement over existing forms of automatically generated descriptions. I also analyze how to effectively represent such data in a description, based on factors such as cluster location and the proximity of the points within the cluster, and propose potential improvements to the system in areas where it currently does not perform well.
50 pages
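The abstract does not specify how the system finds a surprising cluster or words its description, so the following is only an illustrative sketch of the general idea, not the thesis's method. It assumes a least-squares line as the "overall trend", flags points whose residual exceeds a multiple of the median absolute residual, treats all flagged points as a single cluster, and names the cluster's location by thirds of the plotted range. The function names (`fit_line`, `surprising_points`, `describe`), the threshold of 3.0, and the wording of the output are all hypothetical.

```python
from statistics import mean, median


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept.

    Assumes the x values are not all identical.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def surprising_points(points, threshold=3.0):
    """Return the points whose residual from the trend line is unusually large.

    The scale is the median absolute residual (an assumed, robust choice).
    """
    slope, intercept = fit_line(points)
    residuals = [y - (slope * x + intercept) for x, y in points]
    scale = median([abs(r) for r in residuals]) or 1e-9
    return [p for p, r in zip(points, residuals) if abs(r) > threshold * scale]


def band(value, low, high, labels):
    """Map a value to one of three labels by thirds of the range [low, high]."""
    if high == low:
        return labels[1]
    fraction = (value - low) / (high - low)
    return labels[min(int(fraction * 3), 2)]


def describe(points, threshold=3.0):
    """Build a one- or two-sentence description of the trend and any cluster."""
    slope, _ = fit_line(points)
    direction = "increasing" if slope > 0 else "decreasing" if slope < 0 else "flat"
    text = f"The points follow a generally {direction} trend."

    # Simplification: every flagged point is treated as part of one cluster.
    cluster = surprising_points(points, threshold)
    if not cluster:
        return text

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx = mean([x for x, _ in cluster])
    cy = mean([y for _, y in cluster])
    horizontal = band(cx, min(xs), max(xs), ("left", "center", "right"))
    vertical = band(cy, min(ys), max(ys), ("lower", "middle", "upper"))
    return (
        f"{text} A cluster of {len(cluster)} points in the "
        f"{vertical} {horizontal} of the chart deviates from it."
    )
```

For example, given twenty points on the line y = x (x from 0 to 19) plus three points near (2.5, 18), `describe` reports an increasing trend and a cluster of 3 points in the upper left. A real system would likely need a proper clustering step and a trend estimate that the cluster itself cannot skew.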
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department