Saturday, November 15, 2014

Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes


Bibliographical information:
Jacob O. Wobbrock, Andrew D. Wilson, and Yang Li. 2007. Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes. In Proceedings of the 20th annual ACM symposium on User interface software and technology (UIST '07). ACM, New York, NY, USA, 159-168. DOI=10.1145/1294211.1294238 http://doi.acm.org.lib-ezproxy.tamu.edu:2048/10.1145/1294211.1294238

URL:
http://dl.acm.org.lib-ezproxy.tamu.edu:2048/citation.cfm?id=1294238

The $1 recognizer was designed to be small, fast, and easily implementable with under 100 lines of code. With only 1 loaded template, $1 had 97% accuracy, and with 3+ loaded templates achieved as high as 99%. Although this paper is one of the most referenced in the field of sketch recognition, at the time of publication none of the algorithms involved were particularly novel. However, due to the fact that nobody had ever actually published this information in a paper, this paper was in fact the first to fully document this well-known recognition technique. It can be credited with standardizing this procedure as well as providing a base from which other researchers developed recognizers such as $2 and $n.

The algorithm works as follows. The sketches are divided into two sets of points: the "candidate points" C given by the user sketch, and "template points" T that matches the candidate points most. The candidate points are sampled by a combination of hardware and software. Because a simple template comparison would rarely provide a match, there are several more steps that must be taken to successfully make such a comparison:

  1. Resampling the point path. The paper describes that the number of samples for each shape to resampled to varies in optimal use depending on the shape, but as long as N number of samples is between 32<N<256, the algorithm can recognize the best. The optimal number of sample points was found to be N=64
  2. Rotate Once based on the Indicative Angle. The recognizer searches "over the space of possible angles" to identify the optimal angle through which both the candidate and template shapes will be compared against.
  3. Scale and Translate. The candidate shape is rotated about its centroid based on the "reference square" that gets determined. The shape is then translated to a reference point to match the two shapes as closely as possible.
  4. Find the Optimal Angle for the Best Score. The "score" is determined based on the distance between the template and corresponding candidate points, and the size of the points between corresponding template and candidate points as well.
The paper identifies several limitations of this recognizer. First, because the recognizer is meant to be flexible with regards to rotation and scale, any shapes whose identifying features include a specific rotation (such as the difference between a rhombus and a square only being the rotation of the shape) will not be considered a separate shape. Non-uniform scaling can also abuse horizontal or vertical line, so shapes purely consisting of one of these lines can have trouble being recognized. Third, because the system does not have any time-based recognition, any shapes that rely on drawing time or speed will not be recognized.

The $1 recognizer, as the paper initially claims, is used today mostly to get a recognizer easily integrated into a system. This recognizer can be used easily with simple shapes such as the letters of the alphabet. However, due to the significant limitations of this particular algorithm, any significant research in sketch recognition and most modern formats do not use something as simple as $1 by itself. Most modern systems use a combination of these along with several other kinds of algorithms such as feature-based ones, to generate the most accurate and useful recognition abilities for specific applications.

No comments:

Post a Comment