Wednesday, November 19, 2014
Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams
Bibliographical Information:
Bhat, Akshay, and Tracy Hammond. "Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams." In IJCAI, vol. 9, pp. 1395-1400. 2009.
URL:
http://www.aaai.org/ocs/index.php/IJCAI/IJCAI-09/paper/download/592/906
This paper, written by the Sketch Recognition Lab, distinguishes hand-written text from shapes in free-hand diagrams using only one feature: the zero-order entropy rate. The paper found that relying solely on this entropy rate yields correct classification between text and shape strokes 92.06% of the time. The feature also performed favorably in domains for which no training examples were provided.
The main motivation behind this paper is that many sketching domains mix hand-written text with shape-based diagrams. Examples include engineering diagrams in which the author annotates and labels components with text, military course-of-action diagrams, and UML sketches, all of which contain both text and shapes as considerable components. Separating drawn shapes from hand-written text is necessary for accurate sketch recognition in these domains. Because recognizing the two types of content requires very different procedures and produces different final classifications, and because both appear frequently in everyday sketching, there is strong motivation to separate them successfully.
This paper uses the zero-order "entropy rate" to classify strokes as shapes or text. Entropy is essentially a measure of "the degree of randomness from an information source". In sketched strokes, this randomness is far higher in hand-written text than in drawn shapes. The paper develops an entropy-model "alphabet" to characterize how strokes tend to be drawn: the range of possible angles within a stroke is divided into discrete bins, and each bin is assigned a symbol (the range from 0 to pi/6, for instance, is given the symbol "A", the remaining ranges correspond uniquely to B, C, D, E, and F, and stroke end-points are given the symbol X).
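To make the idea concrete, here is a minimal Python sketch of such an angle-to-symbol mapping, based only on the description above. The pi/6 bin width, the letters A-F, and the end-point symbol X follow the alphabet as summarized here; the point format, function names, and the choice to measure the angle between consecutive segments are my own assumptions rather than the paper's exact formulation.

```python
import math

SYMBOLS = "ABCDEF"
BIN_WIDTH = math.pi / 6  # six bins of width pi/6 cover angles in [0, pi]

def angle_at(p_prev, p, p_next):
    """Angle (in [0, pi]) between the two segments meeting at point p."""
    v1 = (p[0] - p_prev[0], p[1] - p_prev[1])
    v2 = (p_next[0] - p[0], p_next[1] - p[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0
    cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.acos(max(-1.0, min(1.0, cos_a)))

def stroke_to_symbols(points):
    """Map a stroke (list of (x, y) points) to its symbol string."""
    symbols = ["X"]  # start point marked as an end-point symbol
    for i in range(1, len(points) - 1):
        a = angle_at(points[i - 1], points[i], points[i + 1])
        idx = min(int(a / BIN_WIDTH), len(SYMBOLS) - 1)
        symbols.append(SYMBOLS[idx])
    symbols.append("X")  # end point
    return "".join(symbols)
```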
With this alphabet, strokes are grouped and converted into symbol strings representing their angles. Text strokes yield far more varied symbol strings than shape strokes, and that variation is what the "entropy" measure captures in this instance. Stroke groups are labeled "shape", "text", or "unclassified", the last of which covers groups whose entropy falls above the threshold for "shape" but below the threshold for "text". The results are promising, with 92.06% correct classification when the system is required to produce a label. Performance also held up on military course-of-action diagrams and other sketches from domains not covered by the training examples.
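Continuing the sketch above, the zero-order entropy of a symbol string can be computed as the Shannon entropy of its symbol frequencies, with two cut-offs separating "shape", "unclassified", and "text". The threshold values below are placeholders for illustration only, not the values used in the paper.

```python
from collections import Counter
import math

SHAPE_THRESHOLD = 1.5  # hypothetical: at or below this -> "shape"
TEXT_THRESHOLD = 2.2   # hypothetical: at or above this -> "text"

def zero_order_entropy(symbol_string):
    """Shannon entropy (bits per symbol) of the symbol frequency distribution."""
    if not symbol_string:
        return 0.0
    counts = Counter(symbol_string)
    total = len(symbol_string)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def classify(symbol_string):
    h = zero_order_entropy(symbol_string)
    if h <= SHAPE_THRESHOLD:
        return "shape"
    if h >= TEXT_THRESHOLD:
        return "text"
    return "unclassified"
```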
I think that this paper is both well motivated and well implemented, and the idea of using "entropy" is a particularly clever one that ends up yielding very favorable results. The paper's explanation of how the classification works is easy to understand and well elaborated, with a surprising amount of intuition behind how the technique operates under the hood. Using higher-order entropy models to see whether they improve accuracy would be an interesting continuation of this study.