ChartSense: Interactive Data Extraction from Chart Images

Charts are commonly used to present data in digital documents such as web pages, research papers, or presentation slides. When the underlying data is not available, it is necessary to extract the data from a chart image to utilize the data for further analysis or improve the chart for more accurate perception. In this paper, we present ChartSense, an interactive chart data extraction system. ChartSense first determines the chart type of a given chart image using a deep learning-based classifier, and then extracts underlying data from the chart image using semiautomatic, interactive extraction algorithms optimized for each chart type. To evaluate chart type classification accuracy, we compared ChartSense with ReVision, a system with the state-of-the-art chart type classifier. We found that ChartSense was more accurate than ReVision. In addition, to evaluate data extraction performance, we conducted a user study, comparing ChartSense with WebPlotDigitizer, one of the most effective chart data extraction tools among publicly accessible ones. Our results showed that ChartSense was better than WebPlotDigitizer in terms of task completion time, error rate, and subjective preference.

Focus: Tool
Source: CHI 2017
Readability: Expert
Type: PDF Article
Open Source: Yes
Keywords: Chart recognition, Data extraction, Chart classification, Deep learning, Mixed-initiative interaction
Learn Tags: Data Tools, Design/Methods, Machine Learning
Summary: This paper presents ChartSense, an interactive chart data extraction system that determines the chart type of a given chart image using a deep learning-based classifier and then extracts the underlying data using extraction algorithms optimized for each chart type.