Tuesday, September 30, 2014

Sketch System Principles and Mechanix Feedback

What a Sketch System Should Do:

  1. Pressure-based line thickness. Though this is a recent development and is highly dependent on hardware, it greatly enhances the feeling of writing with a real pen on paper.
  2. Use a stylus as its primary input method. Most touch systems today use finger touch, but finger painting is an extremely limited subset of paper sketching and painting. Stylus sketching is far more reflective of the type of sketching most people do on paper.
  3. Beautify strokes to appear as seamless as possible. Work to remove any jarring angular changes in sketches resulting from the touchscreen hardware. The more pixelated a sketch looks, the more it breaks the immersion of writing as one would on paper.
  4. Stroke deletion. Ensure that the system supports deleting entire strokes at a time. Although this does not reflect the realism of writing on actual paper, deleting entire strokes at a time is extremely convenient for anything other than drawing applications. It should at least be an option.
  5. Zoom in/out, translate. Since touch surfaces are typically smaller than regular pieces of paper (especially true of tablets), the user should be able to zoom in, zoom out, and translate across the zoomed page to suit his or her preferred sketch size.
  6. Have a non-intrusive interface. Interfaces for sketching are unlike those of any other kind of system. Features cannot be buried under menus, nor can every feature be listed at once. Features should be smartly presented and removed so that they interfere as little as possible with the actual sketching.
  7. As big a sketch area as possible. Sketch areas should not stop strictly at the edge of the window. They should be easily and intuitively resizable. Avoid using the "Windows resize arrow" style of resizing, as it does not work well with a stylus or touchscreen.
  8. Have a grid, unless it is a drawing application. Nearly every kind of paper, from regular writing paper to architectural sketching paper to mathematical notebooks, has lines or grids to aid the user's writing task. Completely blank pages are only suitable for art drawing and should only be the default if the application is intended for that domain.
  9. Provide ample stroke size and color options. Users are more creative when given stroke size and color choices, which are both fun in recreational use and genuinely helpful when sketching for homework or productivity tasks.
  10. Simulate the on-paper experience as much as possible. The most successful sketch systems are the ones where a user acclimated to the system "forgets" he or she is writing on a digital surface when sketching or writing. All sketching systems should strive for this kind of user experience.
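Principle 3 (stroke beautification) can be approximated with something as simple as a moving average over the sampled points. This is only a minimal sketch of the idea; the window size is an arbitrary assumption, and a production system would likely use a more sophisticated curve fit:

```python
def smooth_stroke(points, window=3):
    """Smooth a stroke with a simple moving average over (x, y) samples.

    `points` is a list of (x, y) tuples captured from the touchscreen.
    Averaging each point with its neighbors softens the jagged angular
    changes introduced by coarse hardware sampling.
    """
    if len(points) < window:
        return list(points)
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        smoothed.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return smoothed
```

The endpoints are averaged over a shrunken window so the stroke still starts and ends where the user put the pen down and lifted it.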


What a Sketch System Should Not Do:

  1. Aggressively auto-correct. While some beautification is expected, snapping lines to make them perfectly straight and at predetermined angles 100% of the time is jarring and removes a user's sense of agency.
  2. Be laggy. A sketch system should have little to no lag between user input and the stroke appearing. Significant lag breaks immersion at best and renders the entire application useless at worst. A lag-free system is vital to capturing the illusion of writing on a regular piece of paper.
  3. Be "heavy". "Heavy" systems, whether in terms of computing power, or in terms of file size when saving sketches, is vastly inefficient and adds a layer of technology-borne burden that otherwise would not exist on regular paper.
  4. Use or preserve a "mouse-based" interface. Do not assume that users have access to a mouse to complement their sketching needs, and thus do not use small interface items optimized for mice. Assume that the user will want to reach every last item of the UI through stylus or finger alone.
  5. Be strict in sketch order. Especially with sketches that are to be analyzed by a system, expecting users to know the order in which sketches should be drawn is highly unintuitive and should be completely avoided.
  6. Fault the user for misinterpreted shapes. Nearly everyone is sure that the shapes they draw are "correct". If a stroke is misinterpreted, do not say "you have drawn this incorrectly", or something to that effect, as it is overly antagonistic and isn't in the spirit of free-hand sketching.
  7. Have the user spend more time on the UI than sketching. Digitized applications can always come with a plethora of options and features, but with sketching, less is more. Sketch systems should not see their users spend more time navigating and "figuring out" the UI than actually sketching.
  8. Mix stylus with touch sketching at the same time. If the user is sketching with a stylus, disable all touch-based input and vice versa. Most people rest their hands on the paper when writing, so they do not expect the side of their hand to actually be involved with the writing process.
  9. Use pre-determined shapes as a primary method of sketching. Paper sketching does not afford dragging and dropping items as a primary input method. These tasks are far more suited for more traditional, mouse-based systems.
  10. Use non-paper backgrounds or patterns. Do not visually model the drawing surface as anything other than what a person would reasonably expect to be a writing surface. Avoid unusual colors or patterns that do not in any way resemble a drawing surface. This does not mean all interfaces should be skeuomorphic; it simply means they should not be so far removed from a writing experience that it is jarring to users.

Improvements for Mechanix:
  1. Optimize idle CPU load. The CPU is always being "used" in Mechanix, even during idle time. This greatly drains batteries and runs hardware hot, which in a more tablet-based industry is quickly becoming unacceptable for users.
  2. Save student progress to the database. Maintaining saved, partially completed homework assignments locally is not portable. Since the MCX system is online-only, there shouldn't be "I left my homework at home" scenarios of the offline world.
  3. Re-think the items displayed on the drop-down menus. Some features are experimental, some antiquated, some buggy. For the user build, we should comb through what is shown in these menus and remove the items that aren't intended for students to use in their homework assignments.
  4. Alter color palette. Some of the contrasting colors can be rethought to better suit the homework style for MCX. Green-on-white for the description of solved problems, for instance, strains the eyes too much.
  5. UI scaling. Although this is a big task, the wide range of display resolutions across computers means Mechanix should implement UI scaling to preserve the user experience. Users with lower resolutions currently contend with considerably smaller sketching areas.

Sunday, September 28, 2014

Visual Similarity of Pen Gestures

Bibliographical Information:
Long, Landay, Rowe, Michiels. "Visual Similarity of Pen Gestures." CHI '00 Proceedings of the SIGCHI conference on Human Factors in Computing Systems. Pages 360-367. ACM New York, NY, USA

URL:
http://dl.acm.org.lib-ezproxy.tamu.edu:2048/citation.cfm?id=332458

This paper provides somewhat of an early look into computational interpretation of sketches. Specifically, the authors gear this research toward one-stroke gestures intended to be used as commands, as was common the year this paper was published. PDA devices with smaller screens relied on one-stroke gestures to perform commands, and while these proved convenient, many users felt frustration and confusion when the system failed to interpret their gestures as intended. The paper presents two experiments intended to provide better insight into how humans perceive "similarity" between two gestures. Indeed, one of the more common complaints when a user is presented with a gesture recognition failure is "but my gesture really does look like a triangle!" Providing insight into why and how a human perceives similarity leads to the paper's main motivation: identifying what the paper calls "perceptual similarity", and determining whether this similarity can be computed empirically. Both experiments involve presenting users with similar shapes and having them select the shape in the set that is the least similar. Computationally, the authors used these responses to identify the set of gesture features that played the most vital role in determining whether gestures are perceived as "similar" or "different". The first experiment involved the user picking between sets of three gestures, called triads. All triads were seen exactly once. The second experiment involved three new gesture sets of nine gestures each, with those gestures being the point of comparison. The paper presents a computable model of this perceptual similarity, which the authors found to correlate with reported similarity with a coefficient of 0.56. The authors found these results encouraging and significant, as a computational model was now available for what could previously only be a subjective analysis.
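As a rough illustration of the idea (not the paper's actual model, which was fit to the experimental data), perceptual dissimilarity can be treated as a distance between gesture feature vectors, and the "odd one out" of a triad as the gesture farthest from the other two:

```python
import math

def feature_distance(f1, f2):
    """Euclidean distance between two gesture feature vectors.

    Each gesture is assumed to be reduced to numeric features
    (e.g. aspect ratio, total angle, length); dissimilarity is
    then just the distance between those vectors.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def odd_one_out(triad_features):
    """Pick the gesture in a triad that is farthest from the other two,
    mimicking the triad task given to the experiment's participants."""
    scores = []
    for i, f in enumerate(triad_features):
        others = [g for j, g in enumerate(triad_features) if j != i]
        scores.append(sum(feature_distance(f, g) for g in others))
    return max(range(len(triad_features)), key=lambda i: scores[i])
```

The features and the plain Euclidean metric are assumptions for illustration; the paper derived a weighted combination of features from the human judgments it collected.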

The paper provides a good foundation for the concept of perception and what would eventually become geometric recognition. I did find some issues with the fact that none of the gestures compared were drawn by users; rather, they were picked as shapes from a menu of choices. While the resulting reduction in experiment time was important, the task being observed here is one of simple image comparison, which is very different from the actual practice of a user drawing a gesture and how a computer would perceive it. Additionally, even though subjectivity is important to capture, relying on purely subjective comparisons between individual users to generate a computational formula would, at best, only reflect the perceptual differences of that particular group. However, this paper still provides a significant first step toward computationally determining similarity as it would be reported by a human being.

Wednesday, September 24, 2014

Specifying Gestures by Example

Bibliographical Information:
Rubine. "Specifying Gestures by Example." SIGGRAPH '91 Proceedings of the 18th annual conference on Computer graphics and interactive techniques. Pages 329-337. ACM New York, NY, USA.

URL:
http://dl.acm.org/citation.cfm?id=122753

This paper focuses on GRANDMA (Gesture Recognizers Automated in a Novel Direct Manipulation Architecture), a toolkit for rapidly adding gesture recognition to an interface. At the time of the paper's publication, most gesture templates for recognition required careful hand coding and maintenance as more gestures were added. This tool instead lets the gesture curator keep adding new gestures by example. By implementing gesture recognition at the curator level, gestures are saved as templates that can then be used to identify input from users. The algorithm is considered lightweight enough to serve as part of the back end of a larger sketching system. GRANDMA uses a set of 13 features, computed from the input template sketches, to create classifiers on its own. These features include the cosine and sine of the stroke's initial angle, the bounding box, the stroke length, the duration of the gesture, and the "sharpness" of the gesture (to help differentiate between, for example, the letters "U" and "V"), among others. These become the attributes associated with each template.
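A few of these features can be sketched as follows. The subset shown and the dictionary keys are my own simplification; Rubine's paper defines 13 features precisely, including the convention of using the third sample point to stabilize the initial-angle estimate:

```python
import math

def rubine_features(points, timestamps):
    """Compute a handful of Rubine-style gesture features.

    `points` is a list of (x, y) samples for one stroke and
    `timestamps` the matching sample times. Shown here: initial-angle
    cosine/sine, bounding-box diagonal, stroke length, and duration.
    """
    x0, y0 = points[0]
    # Use the third point (when available) for a stabler initial angle.
    x2, y2 = points[2] if len(points) > 2 else points[-1]
    d = math.hypot(x2 - x0, y2 - y0) or 1.0
    cos0 = (x2 - x0) / d                  # cosine of initial angle
    sin0 = (y2 - y0) / d                  # sine of initial angle
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    bbox_diag = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    length = sum(math.hypot(points[i + 1][0] - points[i][0],
                            points[i + 1][1] - points[i][1])
                 for i in range(len(points) - 1))
    duration = timestamps[-1] - timestamps[0]
    return {"cos0": cos0, "sin0": sin0, "bbox_diag": bbox_diag,
            "length": length, "duration": duration}
```

In GRANDMA, feature vectors like this one, computed over a handful of example strokes per gesture, are what the linear classifier is trained on.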

This paper presents valuable insight into the on-the-fly generation of templates and their associated features. I found the inclusion and simple explanation of each of the mathematical formulas used to compute the features useful in conveying the simplicity of the system; indeed, one of the bigger selling points of the system is that it is lightweight. However, the system was not implemented in a case study, nor was it shown to work in an end application as it was designed to be. The paper also discusses design decisions such as whether to reject ambiguous user input that is likely to be mislabeled, but since the application was never field tested, it is unclear whether these decisions were useful or grounded in user data.


Tuesday, September 23, 2014

Who Dotted That ‘i’? : Context Free User Differentiation through Pressure and Tilt Pen Data

Bibliographical Info:
Eoff, Hammond. "Who Dotted That ‘i’? : Context Free User Differentiation through Pressure and Tilt Pen Data." Graphics Interface (GI 2009), Kelowna, British Columbia, Canada, May 25-27, 2009. Pages 149-156.

URL:
http://srl.tamu.edu/srlng_media/content/objects/object-1236962325-cefe7476d664dc727f969660eac672cc/bde-GI-FinalVersion.pdf

This paper explored differentiating between writers solely through analysis of the strokes each one writes. This is decidedly different from systems where the user's identity is preserved through distinct physical pens or calibration of any kind. The algorithms presented in this paper aim at on-the-fly user identification to make the writing experience more intuitive for the user. The paper's studies revolved around collecting extensive usage metrics from different people and running them through different classifiers to establish thresholds separating one user's writing data from another's. In the first experiment, a number of t-tests were calculated to aid classification based on the X- and Y-tilt of the user's pen. The second experiment collected additional user data and evaluated several different classifiers, including Linear, Quadratic, Naive Bayes, Decision Trees, and Neural Networks. The identification rate for two collaborating users was found to be around 97.5%.
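As a toy stand-in for the classifiers the paper evaluates (not the authors' actual models), a nearest-mean classifier over tilt features shows the basic shape of the approach: attribute a new stroke to the user whose average tilt vector is closest.

```python
def nearest_mean_classify(train, sample):
    """Classify a pen sample by nearest per-user mean tilt vector.

    `train` maps user id -> list of (x_tilt, y_tilt) samples;
    `sample` is one new (x_tilt, y_tilt) observation. A deliberately
    simplified illustration of threshold-based user differentiation.
    """
    def mean(vectors):
        n = len(vectors)
        return tuple(sum(v[i] for v in vectors) / n
                     for i in range(len(vectors[0])))

    def dist_sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    means = {user: mean(samples) for user, samples in train.items()}
    return min(means, key=lambda user: dist_sq(means[user], sample))
```

The paper's real classifiers use richer feature sets (pressure as well as tilt) and properly trained decision boundaries, but the core idea of separating users in feature space is the same.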

The value of this paper lies mostly in its showcase of the power of data classifiers. Although identifying users collaborating on a single surface is a problem that has been solved over the years with a variety of peripherals and complicated user schemes, the solution presented here is a powerful one generated exclusively through the analysis of user metrics, without requiring users to change any of their writing habits. One thing to note, however, is that pen tilt is one of the more important aspects of the data, and not every digital pen can provide it. Per-stroke pressure data is likewise not universally available, and finger-touch interfaces were not explored. Nevertheless, this analysis is a perfect example of how running data through classifiers can be the first big step in solving complex problems such as this one.

Wednesday, September 17, 2014

Mechanix: A Sketch-Based Tutoring and Grading System for Free-Body Diagrams

Mechanix is a tutoring and homework system aimed specifically at supporting free-body diagrams for both students and instructors. Due to the increasing size of introductory-level physics, mechanical, and civil engineering courses, providing useful and timely feedback to students on free-body diagrams is becoming increasingly difficult for instructors. Mechanix automates most of the process of comparing and saving information from students and instructors via extensive use of geometric recognition. The student draws free-response solutions in a sketch area, where the points that comprise each stroke are analyzed for different patterns. These patterns include rule sets that make up most basic geometric shapes, such as triangles, circles, squares, and lines of varying length and angle. These are then compared against other immediately adjacent shapes to determine whether a shape is part of a larger, more complex one. For instance, the letter "X" consists of two strokes. First, the system determines that one set of points comprises a line (recognizing " / "), and the LADDER-based system then recognizes the other (" \ "). The system then detects that the two lines intersect at their midpoints, and thus Mechanix recognizes that an "X" has been drawn. The same concept carries over to all the other shapes that Mechanix recognizes, including complex trusses. The system then uses the input provided by the instructor to determine, via the same geometric recognition, whether the students' answers match the instructor's. Piecemeal feedback is given to students to guide them to the correct answer with neither too many nor too few hints. Mechanix uses a centralized server to perform sketch recognition on answers and to prevent cheating by never saving the instructors' answers locally.
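The "X" example above can be sketched as a simple geometric test: given two strokes already recognized as lines, check whether they cross near each other's midpoints. This is my own simplified illustration, not Mechanix's actual rule set, and the tolerance value is a made-up assumption:

```python
def is_x_shape(line1, line2, tol=0.25):
    """Decide whether two recognized line strokes form an "X".

    Each line is ((x1, y1), (x2, y2)). Following the review's example,
    an "X" is approximated as two lines whose midpoints nearly coincide;
    `tol` is the allowed gap as a fraction of the shorter line's length.
    """
    def midpoint(line):
        (x1, y1), (x2, y2) = line
        return ((x1 + x2) / 2, (y1 + y2) / 2)

    def length(line):
        (x1, y1), (x2, y2) = line
        return ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5

    (mx1, my1) = midpoint(line1)
    (mx2, my2) = midpoint(line2)
    gap = ((mx2 - mx1) ** 2 + (my2 - my1) ** 2) ** 0.5
    return gap <= tol * min(length(line1), length(line2))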

I believe Mechanix's true potential lies in the fact that it can be expanded to a very large variety of educational domains. It could even be used, without much extra work, in a language-learning environment, where a student learning Japanese Kanji or Arabic writing could receive useful feedback on the correct use and composition of particular characters. Additionally, the system provides ample room for a crowd-sourcing element to the learning experience, where a limited version of the instructor mode could be made available to students to quiz each other and provide further tutoring options. One potential drawback, however, is that instructor input is useful only within the context of Mechanix. An export/import system could easily be added to let instructors reuse the questions they write in written materials, exams, or other classes that do not integrate Mechanix into the classroom.

Monday, September 15, 2014

K-Sketch: A "Kinetic" Sketch Pad for Novice Animators

Bibliography Information:

Davis, Landay. "K-Sketch: A "Kinetic" Sketch Pad for Novice Animators." 2008. CHI Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. Pages 413-422. ACM New York, NY, USA ©2008

Link to Article: http://dl.acm.org/citation.cfm?id=1357122


K-Sketch is a system developed to present an animation platform that makes animation interfaces easy to manipulate, motivated largely by the need to animate quickly in the middle of meetings, classes, and other functions. Many K-Sketch features use one or two simple forms of interaction, such as Alt-Click, to support multiple major animation operations at once. The supported animation operations are Translate, Scale, Rotate, Set Timing, Move Relative, Appear, Disappear, Trace, Copy Motion, and Orient to Path. Another major design consideration is that the pen is the main method of interaction. Edits to the sketch are "recorded" as actions that can then be replayed as an animation. Erasing a sketch, for instance, is saved as an "action" and displayed visually as a "tick" in the time slider bar at the bottom of the interface. The time slider can be moved backwards to support animations running in parallel. For a user study, researchers had testers compare the interface experience with that of the PowerPoint animation tool. Many users reported an easier time using this interface over the PowerPoint tool, as evidenced by the fact that making similar animations was shown to take longer with PowerPoint. Users also expressed that they had an easier time learning the interface.
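The "edits recorded as replayable actions" idea can be modeled with a tiny data structure. This is a toy illustration under my own assumptions (the class and field names are invented), not K-Sketch's internal design:

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    """One recorded edit, replayable as part of an animation."""
    time: float    # position on the time slider (the "tick")
    kind: str      # e.g. "draw", "erase", "translate"
    payload: dict  # whatever data the edit needs to be replayed

@dataclass
class Timeline:
    """Toy model of action recording on a time slider.

    Each edit is stored as a timestamped action; replay simply walks
    the actions in time order up to the slider position. Allowing
    out-of-order recording loosely mirrors moving the slider backwards
    to author parallel animations.
    """
    actions: list = field(default_factory=list)

    def record(self, action: Action):
        self.actions.append(action)
        self.actions.sort(key=lambda a: a.time)

    def replay(self, up_to: float):
        return [a for a in self.actions if a.time <= up_to]
```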

I think easier animation tools, especially for use in education settings, have been in demand for a significant period of time. However, I believe some of the design decisions of the interface itself could be altered slightly. For instance, the button bar at the top could be made slightly hierarchical, where tapping and holding a pen color would expand into a wheel letting the user choose among eight or so colors. Cut, Copy, and Paste could also be placed under a single category. The main reason I believe this change should be made is that modern interfaces have moved away from a large array of buttons displayed at once, since analyzing each button up front is detrimental to the experience. For instance, a spontaneous need to sketch an animation in the middle of a meeting would lead a new user to quickly scan the toolbar. A user who only needs to sketch, without any need for additional pen colors, would waste valuable time and cognitive load identifying pen colors she knows she will not need. I believe these changes can further improve usage time.

Thursday, September 11, 2014

iCanDraw? Using Sketch Recognition and Corrective Feedback to Assist a User in Drawing Human Faces

Bibliography Information:

Dixon, Prasad, Hammond. "iCanDraw? Using Sketch Recognition and Corrective Feedback to Assist a User in Drawing Human Faces." 2010. CHI Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. Pages 897-906. ACM New York, NY, USA ©2010

Link to Article: http://dl.acm.org/citation.cfm?id=1753459


iCanDraw? is a system that uses the principles of sketch recognition to teach participants basic techniques for sketching portraits. The researchers use a facial recognition library that provides 40 data points, including the contours of the mouth, the wings of the nose, and the contour of each eyebrow, and make manual corrections where necessary. The same points are then derived from a user's input sketch of the same image, and bounding boxes and line end points are used to judge the "correctness" of the sketch. Correctness, in this instance, is defined by the sketch's similarity to the presented portrait in the basic contour composition of major facial features like the eyes, mouth, and nose. Five users who identified themselves as unable to draw well participated in this study, and they recognized the program's ability to discern the correctness of their sketches' composition. They also appreciated contextual feedback based on areas of improvement in their sketches.
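The comparison step can be illustrated with a simplified scoring function of my own devising (not the paper's actual algorithm): normalize both point sets to their bounding boxes so scale doesn't matter, then score by mean distance between corresponding feature points.

```python
def sketch_correctness(template_pts, user_pts):
    """Toy "correctness" score between template and user feature points.

    Both point lists are assumed to contain corresponding facial
    feature points (e.g. the 40 points from the face recognizer).
    Each set is normalized to its bounding box, then scored by the
    mean distance between corresponding points (lower is better).
    """
    def normalize(pts):
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        w = (max(xs) - min(xs)) or 1.0
        h = (max(ys) - min(ys)) or 1.0
        return [((x - min(xs)) / w, (y - min(ys)) / h) for x, y in pts]

    t, u = normalize(template_pts), normalize(user_pts)
    dists = [((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
             for a, b in zip(t, u)]
    return sum(dists) / len(dists)
```

Because of the bounding-box normalization, a user sketch that is a uniformly scaled copy of the template scores a perfect 0.0.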

I think this paper shows a creative and productive use of facial recognition algorithms beyond the obvious identification of individuals. I appreciated the inclusion of the drawing instructor's evaluation of the program and his thoughts on its approach to teaching. Some of the methods of evaluation could have been discussed in more detail, such as how the researchers gauged the system's ability to improve the participants' skill. Additionally, I found it interesting that several of the participants' initial freehand drawings focused on the subject's clothing. In the case of the baby, individual lines were even drawn to indicate the grid-like pattern of the baby's shirt. However, in the second sketch, made after the participant had used iCanDraw?, we see considerably less focus on non-facial features, presumably due to the program's focus on facial features. This could be applied in educational settings by targeting such algorithms at areas where a participant is known to have skill gaps, since this shift clearly shows that users can be encouraged to place more attention on some areas over others. Further evaluations of shifts in focus as a result of using this system could yield interesting results.