sketch recognition: 2015

Saturday, December 12, 2015

Who dotted that'i'?: context free user differentiation through pressure and tilt pen data.

Paper
Eoff, Brian David, and Tracy Hammond. "Who dotted that'i'?: context free user differentiation through pressure and tilt pen data." Proceedings of Graphics Interface 2009. Canadian Information Processing Society, 2009.
Publication Link: http://dl.acm.org/citation.cfm?id=1555916

Summary

The goal of this paper is user identification in a multi-user sketch platform. Akin to telling people apart by their handwriting style in the physical world, the system looks for features that distinguish one user from another independently of what is being sketched. The authors hypothesized that features such as pen tilt, pressure, and drawing speed can be used to accurately identify the author of a sketch. Their results indicate approximately 84% accuracy on a 10-user data set.

Discussion

Pros

It's a novel but simplistic idea. The results of their experiments lead one to believe that the users in multi-user sketches are identifiable in real time.

Cons.

Their original thrust was multi-user sketch environments. I'm not sure their second experiment was of this nature, or how their results would differ if users had drawn on the same sketch and the test results were based on that.

SketchREAD: a multi-domain sketch recognition engine

Paper

Alvarado, Christine, and Randall Davis. "SketchREAD: a multi-domain sketch recognition engine." Proceedings of the 17th annual ACM symposium on User interface software and technology. ACM, 2004.
Publication Link: http://dl.acm.org/citation.cfm?id=1029637

Summary

The paper describes the design and implementation of a sketch recognition system that can be used in multiple domains to recognize hand-drawn diagrammatic sketches. SketchREAD uses a context-guided sketch recognition approach and utilizes Bayesian networks for interpretation and verification of recognized strokes.
With their approach, the system is able to detect and recover from low-level recognition errors.
They evaluated the performance of SketchREAD against a basic bottom-up recognizer in two different domains: family tree diagrams and circuit diagrams. They also evaluated the runtime performance of their system, noting that worst-case running times occurred frequently for strokes drawn in close proximity.

Discussion

Pros

A novel use of Bayesian networks that is reminiscent of uses in speech recognition and NLP.

Cons.
The description of the BayesNET algorithm wasn't very clear and could have been enhanced with a simpler diagram. Some of the inferences drawn from their figures were also very unclear.

Thursday, December 3, 2015

HMM-based efficient sketch recognition

Paper

Sezgin, Tevfik Metin, and Randall Davis. "HMM-based efficient sketch recognition." Proceedings of the 10th international conference on Intelligent user interfaces. ACM, 2005.
Publication Link: http://dl.acm.org/citation.cfm?id=1040899

Summary

This work proposes a more efficient approach to encoding sketches, using Hidden Markov Models (HMMs), while maintaining high recognition accuracy. The authors base their approach on the observation that people are idiosyncratic but consistent in the way they draw sketches. Building on this, the authors use HMMs to encode these consistent trends and develop a fairly efficient recognition algorithm.

Discussion

Pros

A very simple but effective solution. The use of the graph algorithm for search was very elegant.

Cons.

None, although the paper's brevity made it a little more difficult to comprehend.

Wednesday, December 2, 2015

LADDER, a sketching language for user interface developers

Paper

Hammond, Tracy, and Randall Davis. "LADDER, a sketching language for user interface developers." Computers & Graphics 29.4 (2005): 518-532.

Publication Link: http://www.sciencedirect.com/science/article/pii/S0097849305000865

Summary

The paper gives an overview of the implementation of a development tool for designers of sketch recognition interfaces. LADDER is described as a simple language that allows non-developer designers to easily create a sketch recognition interface for a given domain. The major points in the LADDER implementation are how to define a shape (using a combination of constraints) and shape recognition, which applies a bottom-up approach. The recognition approach in LADDER allows domain-specific shape recognition by defining a per-domain collection of Jess rules.

Discussion

Pros

The tool appears to be a very useful attempt to simplify the sketch recognition interface design.

Cons.

Although the work sounds quite novel and very useful if implemented correctly, I'm concerned about its effectiveness because of its generic nature. The authors do provide some base-level domain descriptors that may help with this and other challenges, but it still feels like a complicated solution.

Recognizing sketched multistroke primitives

Paper
Hammond, Tracy, and Brandon Paulson. "Recognizing sketched multistroke primitives." ACM Transactions on Interactive Intelligent Systems (TiiS) 1.1 (2011): 4.
Publication Link: http://dl.acm.org/citation.cfm?id=2030369

Summary

The motivation behind this paper is to develop an algorithm that handles one of the major complications in the way humans draw sketches. The authors use the term "multistroke" to describe the process of modifying a sketch using an additional stroke. This, they highlight, causes a number of issues for high-level recognizers, which typically identify the two strokes as independent sketches.

The authors propose a novel technique that groups strokes together using a graph-building and linear search algorithm, identifying potential objects as composite strokes formed from strongly connected components.

Discussion

Pros

Their approach to using strongly connected components is very interesting and creative. I think that making the graph building and search algorithm suitable for real time applications was interesting.

Cons.

What would the impact of adding time to their algorithm be? How much cost is associated with using a time threshold in cases when a person goes back much much later to add additional strokes? Perhaps combining time and proximity might help solve the arrow issue?

Tuesday, November 17, 2015

Recognizing text through sound alone.

Paper
Li, Wenzhe, and Tracy Anne Hammond. "Recognizing text through sound alone." Twenty-Fifth AAAI Conference on Artificial Intelligence. 2011.
Direct Link: http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/download/3791/4119

Summary

This paper introduces a novel approach to sketch recognition using the sound profile of a sketch drawn by scratching on a surface. After pre-processing, the authors combine time-domain features (mean amplitude) with frequency-domain features (mel-frequency cepstral coefficients, MFCCs) from the sound profile to achieve 80% accuracy in recognizing letters of the alphabet sketched in a constrained manner.

Discussion

Pros

The work is quite novel. I believe GoogleX came up with an idea that is somewhat similar to this one, using similar properties of surface interaction as input for Android.

Cons.

The input is constrained to a given surface and a given user.

The input is also constrained to letters drawn out a specific way.

Some of these constraints can be overcome. The authors did not provide metrics on performance without these constraints, so the reader cannot get a sense of how much improvement was gained as a result of so many constraints.

Monday, November 16, 2015

PaleoSketch: accurate primitive sketch recognition and beautification

Paper
Paulson, Brandon, and Tracy Hammond. "PaleoSketch: accurate primitive sketch recognition and beautification." Proceedings of the 13th international conference on Intelligent user interfaces. ACM, 2008.
Publication Link: http://dl.acm.org/citation.cfm?id=1378775

Summary

The work presents a primitive sketch recognition and beautification system known as PaleoSketch. The idea behind PaleoSketch is to recognize sketches with a bottom-up approach, identifying low-level primitive shapes as components that combine to form a recognizable high-level shape. The second stage of this system is to return a beautified version of the recognized shape.
To achieve this, they develop two new features in the pre-recognition stage: the normalized distance between direction extremes (NDDE) and the direction change ratio (DCR). The former computes the stroke length between the point with the highest direction value (i.e. dy/dx) and the point with the lowest, normalized by the total stroke length. This feature distinguishes curved shapes (high NDDE values) from polylines, which have lower NDDE values. The latter, DCR, is computed as the maximum change in direction divided by the average change. This value is higher for a polyline, whereas curves have a much lower value in comparison.
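
To make the two features concrete, here is a minimal Python sketch of how NDDE and DCR could be computed from a list of (x, y) points. It is my own illustration, not the authors' code: the function names are hypothetical, direction is taken as the `atan2` angle of each segment without any angle unwrapping, and no part of the stroke is trimmed before computing DCR.

```python
import math

def directions(points):
    """Direction (angle in radians) of each segment between consecutive points."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def path_length(points):
    """Total length of the polyline through the given points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def ndde(points):
    """Stroke length between the direction extremes, divided by total length."""
    d = directions(points)
    i_max = max(range(len(d)), key=d.__getitem__)
    i_min = min(range(len(d)), key=d.__getitem__)
    lo, hi = sorted((i_min, i_max))
    total = path_length(points)
    # Assumption: measure from the start of one extreme segment
    # to the start of the other.
    return path_length(points[lo:hi + 1]) / total if total else 0.0

def dcr(points):
    """Maximum change in direction divided by the average change."""
    d = directions(points)
    changes = [abs(b - a) for a, b in zip(d, d[1:])]
    mean = sum(changes) / len(changes)
    return max(changes) / mean if mean else 0.0
```

On a smooth half-circle, the direction extremes sit near the two ends of the stroke, so NDDE comes out close to 1 and DCR close to 1. On a polyline, the direction changes only at the corners, so DCR spikes.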

Discussion

Pros

The work is very thorough in presenting the details involved in the implementation.
They introduce two novel features for sketch recognition.

Cons.

A lot of thresholds are used, which are based on training data; that seems like a lot of tuning. I would like to see how their results change as these different parameters are adjusted.

Friday, November 6, 2015

Combining corners from multiple segmenters.

Paper
Wolin, Aaron, Martin Field, and Tracy Hammond. "Combining corners from multiple segmenters." Proceedings of the Eighth Eurographics Symposium on Sketch-Based Interfaces and Modeling. ACM, 2011.
Publication Link: http://dl.acm.org/citation.cfm?id=2021185

Summary

This work proposes a very interesting meta-classifier for corner finding. By treating the outputs of an ensemble of corner-finding classifiers as features (candidates for corners in a given sketch), the authors develop a corner subset selection process using a weighted error metric to select the best subset of features (which ideally should be the corners contained in the sketch).
They test and compare the performance of their classifier with existing methods to show an improvement in all-or-nothing accuracy.

Discussion

Pros

I like the approach of treating corner candidates as features and selecting the best set.

Cons.

I wonder how this system will work with n-1 corner finding classifiers. I either missed that or it isn't included in the work. The authors motivate the advantage of each classifier, but can they perform just as well or better (or worse) with one less classifier?

Monday, November 2, 2015

Revisiting ShortStraw: improving corner finding in sketch-based interfaces

Paper
Xiong, Yiyan, and Joseph J. LaViola Jr. "Revisiting ShortStraw: improving corner finding in sketch-based interfaces." Proceedings of the 6th Eurographics Symposium on Sketch-Based Interfaces and Modeling. ACM, 2009.
Publication Link: http://dl.acm.org/citation.cfm?id=1572759

Summary

This paper proposes a set of improvements to a previous corner-finding algorithm called ShortStraw. In this work, the authors address the following shortcomings of the original ShortStraw:
1. A rigid resampling rate.
2. The occurrence and consequences of false corners. The authors address this with a two-stage corner-finding pass that uses an aggressive threshold followed by a relaxed one.
3. Missed corners, which are avoided by using a dynamic threshold based on the length of the line segment.
4. Sharp noise resulting from resampling. They devise an elegant solution that uses the behavior of angles around a point on an arc versus a corner to discern the difference between the two.

Discussion

Pros

Addresses many of the shortcomings of the ShortStraw paper.
The work is well presented. Their solution to sharp noise is very elegant.

Cons.

It makes the ShortStraw solution a little more complicated to implement.
I think their results should have focused more on the specific errors they were trying to address and not the overall performance metrics. For example, isolate cases where ShortStraw falsely detects a corner for reason x from the improvement list, then show how many of those are correctly resolved in the new algorithm.

Saturday, October 31, 2015

ShortStraw: A Simple and Effective Corner Finder for Polylines

Paper
Wolin, Aaron, Brian Eoff, and Tracy Hammond. "ShortStraw: A Simple and Effective Corner Finder for Polylines." SBM. 2008.
Direct Link: http://www.researchgate.net/profile/Tracy_Hammond/publication/220772398_ShortStraw_A_Simple_and_Effective_Corner_Finder_for_Polylines/links/0deec529f76e58d523000000.pdf

Summary

This paper introduces a simplistic approach to corner finding using the concept of straws. A straw, as defined in the paper, is the distance between the endpoints of a small window of resampled points centered on a given point. Straws are short near corners, so a threshold on straw length is used to discover them. The system implements both bottom-up and top-down approaches. After discovering candidate corners, it further filters them by checking consecutive candidates for straightness, i.e. if they are true corners, then they shouldn't form a straight line.
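
The bottom-up pass can be sketched in a few lines of Python. This is a rough illustration rather than the authors' implementation: it assumes the stroke has already been resampled to roughly uniform spacing, uses a window of 3 points on each side and a threshold of 0.95 times the median straw, and leaves out the top-down filtering pass entirely. The function names are my own.

```python
import math
from statistics import median

def straws(points, w=3):
    """Straw at index i: distance between the points w steps before and after i."""
    return {i: math.dist(points[i - w], points[i + w])
            for i in range(w, len(points) - w)}

def candidate_corners(points, w=3, factor=0.95):
    """Indices of candidate corners: local straw minima below the threshold."""
    s = straws(points, w)
    threshold = median(s.values()) * factor
    corners = [0]                      # stroke endpoints always count as corners
    i, n = w, len(points)
    while i < n - w:
        if s[i] < threshold:
            best = i
            # Walk through the run of short straws and keep the shortest one.
            while i < n - w and s[i] < threshold:
                if s[i] < s[best]:
                    best = i
                i += 1
            corners.append(best)
        else:
            i += 1
    corners.append(n - 1)
    return corners
```

For an L-shaped stroke sampled at unit spacing, the straws along the straight parts all have length 6, while the straw centered on the bend is the shortest, so the bend is the only interior corner reported.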

Discussion

Pros

Simple yet effective approach.
They present a thorough comparison of their work and existing work along with insights on limitations of the work.

Cons.

If anything, I would have liked to see data on the effects of window size on performance. Also, some of the suggested improvements could have easily been tested and presented.

Monday, October 26, 2015

A domain-independent system for sketch recognition.

Paper
Yu, Bo, and Shijie Cai. "A domain-independent system for sketch recognition." Proceedings of the 1st international conference on Computer graphics and interactive techniques in Australasia and South East Asia. ACM, 2003.
Publication Link: http://dl.acm.org/citation.cfm?id=604499

Summary

In this work, Yu and Cai developed a user interface for recognizing low-level and high-level sketches without relying on prior domain knowledge as a source of performance-improving constraints. Their system is able to recognize smooth curves, hybrid shapes and polylines.

Discussion

Pros

One interesting merit is their ability to use only low-level geometric content of the strokes to achieve object recognition.
They make very good use of diagram illustrations that help to understand some of the poorly described algorithms used in the paper.
The equations presented for direction and curvature are interesting. Since they are not cited, I assume the authors developed these metrics, which proved to be quite useful.
Overall I think the work is impressive.

Cons.

I feel that there are too many hard-coded rules used in this work to make it truly 'generic'. The authors themselves attest to this fact.
They also do not provide any useful data indicating where the system fails and why, which I feel is important.

Thursday, October 15, 2015

Sketch based interfaces: early processing for sketch understanding

Paper
Sezgin, Tevfik Metin, Thomas Stahovich, and Randall Davis. "Sketch based interfaces: early processing for sketch understanding." ACM SIGGRAPH 2006 Courses. ACM, 2006.
Publication Link: http://dl.acm.org/citation.cfm?id=1185783

Summary

Sezgin et al. develop an intelligent sketch interface. The system allows free-form sketch input, and subsequently beautifies and identifies the basic components entered. This is accomplished in three stages: a stroke approximation stage, a beautification stage, and a final recognition stage. The first stage uses an algorithm based on speed thresholds to identify properties of basic strokes. The second stage makes minor adjustments to formalize sketch properties (such as the straightness of lines and curves) and enforces geometric properties such as perceived parallel or orthogonal relationships. The final stage applies basic object recognition using template matching.
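
As a rough illustration of the speed-threshold idea in the first stage, the Python sketch below marks points where the pen slows down well below its average speed. The details are my assumptions for the example, not taken from the summary above: points are (x, y, t) samples, speed is estimated with a central difference, and the threshold is 90% of the mean speed. The function names are hypothetical.

```python
import math

def speeds(points):
    """Central-difference speed at each interior point of (x, y, t) samples.

    The returned list is offset by one: element k is the speed at point k + 1.
    """
    out = []
    for a, c in zip(points, points[2:]):
        dt = c[2] - a[2]
        out.append(math.dist(a[:2], c[:2]) / dt if dt > 0 else 0.0)
    return out

def slow_vertices(points, factor=0.9):
    """Indices of points where speed dips to a local minimum below the threshold."""
    s = speeds(points)
    threshold = factor * sum(s) / len(s)
    vertices, k = [], 0
    while k < len(s):
        if s[k] < threshold:
            best = k
            # Keep only the slowest point within each run of slow samples.
            while k < len(s) and s[k] < threshold:
                if s[k] < s[best]:
                    best = k
                k += 1
            vertices.append(best + 1)
        else:
            k += 1
    return vertices
```

The intuition is that people slow down when they turn a corner, so speed minima are good candidates for vertices even when the geometry alone is ambiguous.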

Discussion

Pros

The paper was very well written.
The use of motion characteristics of sketches was very interesting.

Cons.

I wonder why they only use speed. I would think acceleration and other properties relating to movement can provide useful information as well.

Wednesday, October 14, 2015

What!?! no Rubine features?: using geometric-based features to produce normalized confidence values for sketch recognition.

Paper
Paulson, Brandon, et al. "What!?! no Rubine features?: using geometric-based features to produce normalized confidence values for sketch recognition." HCC Workshop: Sketch Tools for Diagramming. 2008.
Direct Link: https://www.cs.auckland.ac.nz/research/conferences/skekchws/proceedings/vlhcc_stws_p57.pdf

Summary

This work develops a novel method to provide uniform confidence measurements to geometrically recognized complex shapes. To achieve this, the authors use a combination of geometric and gesture-based features with a quadratic classifier to recognize single-stroke primitives such as lines, ellipses, and helices. They apply feature subset selection to reduce a 44-dimensional feature set to about 9 dimensions and provide a direction for future application of their work.

Discussion

Pros

It is an interesting approach to providing confidence measures to geometric features.
The use of feature selection methods resulted in fairly high accuracy, but more important is the demonstration of useful confidence measures in addition to high accuracy.
The paper was very well written.

Cons.

I find it odd that the authors dedicated a full page or more to describing the sequential forward selection (SFS) technique for feature subset selection. To me, it diverted attention away from the primary objective of the work, which was developing accurate geometric-based features with reliable, uniform confidence measures.

They also failed to elaborate on performance. I mean, the primary purpose was not really recognizing but giving confidence estimates. This ability is not at all discussed or shown in their results section. They spent way too much time talking about SFS...

Tuesday, October 13, 2015

Visual similarity of pen gestures

Paper
Long Jr, A. Chris, et al. "Visual similarity of pen gestures." Proceedings of the SIGCHI conference on Human Factors in Computing Systems. ACM, 2000.
Publication Link: http://dl.acm.org/citation.cfm?id=332458

Summary

This research tries to measure similarity between pen/sketch gestures as perceived by human beings. The authors hypothesize that a measure of perceived similarity will serve as a good basis for feature sets that quantify gesture characteristics. These features, they claim, will better differentiate between gestures in a way that can be captured computationally.

Discussion

Pros

The idea sounds quite novel.
They performed two experiments that were able to validate claims through multiple differing observations.
The method is a simplistic approach that seems to be effective for capturing differentiators that were otherwise not so intuitive (example: the aspect vs. log(aspect) observation).

Cons.

The paper was not easy to read or follow.
They do not do a good job of plainly expressing exactly what they did and what their results were. It felt like pulling teeth, to be honest. But it was fairly good research notwithstanding.

Friday, October 2, 2015

Specifying gestures by example.

Paper
Rubine, Dean. Specifying gestures by example. Vol. 25. No. 4. ACM, 1991.
Publication Link: http://dl.acm.org/citation.cfm?id=122753

Summary

In this paper, Rubine introduces a gesture-recognition-based framework for user interfaces. The primary driver for the recognition used by the GRANDMA system is a set of 13 features inspired by classical physics. These features capture the dynamics of a stroke. In doing so, they try to differentiate gestures by speed, angular velocity, shape, length and other similar characteristics.
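
To give a flavor of what such features look like, here is a small Python sketch that computes a handful of Rubine-style features from (x, y, t) samples. It covers only a subset of the 13, the dictionary keys are my own labels, and the sign convention for the turning angle is an assumption of this example rather than something taken from the paper.

```python
import math

def rubine_subset(points):
    """A few Rubine-style features for a stroke given as (x, y, t) samples."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0, t0 = points[0]
    x2, y2, _ = points[2]              # the initial angle uses the third point
    d02 = math.hypot(x2 - x0, y2 - y0)

    length = total_angle = total_abs_angle = max_speed_sq = 0.0
    prev = None
    for (xa, ya, ta), (xb, yb, tb) in zip(points, points[1:]):
        dx, dy, dt = xb - xa, yb - ya, tb - ta
        length += math.hypot(dx, dy)
        if dt > 0:
            max_speed_sq = max(max_speed_sq, (dx * dx + dy * dy) / (dt * dt))
        if prev is not None:
            pdx, pdy = prev
            # Signed turning angle between consecutive segments.
            turn = math.atan2(pdx * dy - pdy * dx, pdx * dx + pdy * dy)
            total_angle += turn
            total_abs_angle += abs(turn)
        prev = (dx, dy)

    return {
        "cos_initial": (x2 - x0) / d02 if d02 else 0.0,
        "sin_initial": (y2 - y0) / d02 if d02 else 0.0,
        "bbox_diagonal": math.hypot(max(xs) - min(xs), max(ys) - min(ys)),
        "endpoint_distance": math.hypot(xs[-1] - x0, ys[-1] - y0),
        "path_length": length,
        "total_angle": total_angle,
        "total_abs_angle": total_abs_angle,
        "max_speed_sq": max_speed_sq,
        "duration": points[-1][2] - t0,
    }
```

Each stroke becomes a fixed-length vector of numbers like these, which is what makes it possible to train a simple statistical classifier from a few examples per gesture.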

Discussion

Pros

The method used to develop these features is very simplistic but appears to be effective based on the results delivered in the paper. I'd be interested in seeing the full dissertation for this work to see what was left out.
The use of graphics and pictorial illustration was masterful. Many of the concepts used in the paper are captured and described beautifully in the text.

Cons.

I find none. This was a very thorough first step at using features to describe gestures (which intuitively are just odd shapes). And on that note, I think a fine motivation for the features would have been tying these features to classical geometry, physics, and psychology. How does the human brain recognize the difference between a circle and a square? Or better: If one were asked to traverse a path (physically) while blindfolded, how does one mentally picture the shape of the path? The mechanics involved in this (how the brain intuitively solves this problem) are not very different from Rubine's approach.

Thursday, October 1, 2015

Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes

Paper
Wobbrock, Jacob O., Andrew D. Wilson, and Yang Li. "Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes." Proceedings of the 20th annual ACM symposium on User interface software and technology. ACM, 2007.
Publication Link: http://dl.acm.org/citation.cfm?id=1294238

Summary

The $1 recognizer paper describes a 'simple' template matching algorithm that handles scaling, rotation and translation. The authors describe the algorithm, along with implementation details. They also compare their algorithm with two other well-known algorithms used in gesture recognition (Rubine, and dynamic time warping).
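
The normalization steps are easy to sketch in Python. The version below is a simplified illustration under my own assumptions: it resamples to 64 points, rotates once so that the angle from the centroid to the first point is zero, scales to a 250-unit square, and compares with an average point-to-point distance. It deliberately omits the search over nearby rotation angles, and all function names are hypothetical.

```python
import math

def resample(pts, n=64):
    """Resample a stroke to n points evenly spaced along its path."""
    pts = list(pts)
    interval = sum(math.dist(p, q) for p, q in zip(pts, pts[1:])) / (n - 1)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= interval:
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)           # q becomes the start of the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:                # rounding can drop the final point
        out.append(pts[-1])
    return out[:n]

def normalize(pts, n=64, size=250.0):
    """Resample, rotate to a canonical angle, scale to a square, center at origin."""
    pts = resample(pts, n)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    c, s = math.cos(-theta), math.sin(-theta)
    # Rotating about the centroid also leaves the stroke centered on the origin.
    pts = [((x - cx) * c - (y - cy) * s, (x - cx) * s + (y - cy) * c)
           for x, y in pts]
    w = max(x for x, _ in pts) - min(x for x, _ in pts)
    h = max(y for _, y in pts) - min(y for _, y in pts)
    return [(x * size / w if w else x, y * size / h if h else y) for x, y in pts]

def path_distance(a, b):
    """Average distance between corresponding points of two normalized strokes."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognize(stroke, templates):
    """Name of the template closest to the stroke after normalization."""
    cand = normalize(stroke)
    return min(templates,
               key=lambda name: path_distance(cand, normalize(templates[name])))
```

Because every stroke is reduced to the same number of points in the same canonical pose, a rotated, scaled and shifted copy of a template ends up almost on top of it, and matching is just a point-by-point comparison.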

Discussion

Pros

It is a simple algorithm to implement.
The heuristic approach to reducing the iterations for rotation alignment was very creative and interesting.

Cons.

The paper was difficult to follow.
I would like to know if their heuristic for rotation and alignment has some theoretical grounding in psychology. Do people tend to draw or start a drawing based on some mental orientation or projection of the object? Maybe I missed the motivation behind this heuristic but I feel it is a key contribution that should have been better highlighted.
Its simplicity makes it very limited in its capability. I suspect that the size of the 'template' library will grow significantly given its high sensitivity to variation in shapes.

I'd also like to see the running time for this algorithm and how it compares with others.

Wednesday, September 30, 2015

Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams

Paper
Bhat, Akshay, and Tracy Hammond. "Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams." IJCAI. Vol. 9. 2009.
Direct Link: http://ijcai.org/papers09/Papers/IJCAI09-234.pdf

Summary

In this paper, the authors use entropy as a single measure to distinguish between text and shapes in sketches.
The end product is a fairly accurate system that also outputs confidence for classification.
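
One way to picture an entropy measure over a stroke is to turn the angle at each point into a symbol and measure how unpredictable the symbol sequence is. The Python sketch below does exactly that. It is only an illustration of the general idea: the number of angle bins (6) and the absence of any normalization are my assumptions, not the paper's parameters, and the function names are hypothetical.

```python
import math
from collections import Counter

def turning_symbols(points, bins=6):
    """Map the angle formed at each interior point to one of `bins` symbols."""
    symbols = []
    for a, b, c in zip(points, points[1:], points[2:]):
        v1 = (a[0] - b[0], a[1] - b[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue                   # skip repeated points
        cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        # The angle lies in [0, pi]; pi means the three points are collinear.
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
        symbols.append(min(int(angle / math.pi * bins), bins - 1))
    return symbols

def shannon_entropy(symbols):
    """Zero-order Shannon entropy (in bits) of a symbol sequence."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum(k / n * math.log2(k / n) for k in counts.values()) if n else 0.0
```

A straight line produces the same symbol at every point and therefore zero entropy, while handwriting, with its frequent and varied changes of direction, spreads across many symbols and scores higher.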

Discussion

Pros

A very elegant solution. I like the approach of encoding angular changes between smoothed stroke points.

Cons.

I can't think of any. Perhaps elaborating on some of the empirical decisions such as value for 'k' and other parameters and the effects of using alternative parameters would have been interesting.

Monday, September 28, 2015

An image-based, trainable symbol recognizer for hand-drawn sketches

Paper
Kara, Levent Burak, and Thomas F. Stahovich. "An image-based, trainable symbol recognizer for hand-drawn sketches." Computers & Graphics 29.4 (2005): 501-517.
Publication Link: http://www.sciencedirect.com/science/article/pii/S0097849305000853

Summary

This paper takes a signal processing approach to sketch recognition, treating sketches in the same way as a digital image block and processing accordingly. In this approach they address the typical problems with using a template matching approach (which was one of many approaches they could have used) such as scaling, rotation and translation.

Discussion

Pros

They also give a more in depth description of the workings of their algorithms along with discussions on limitations of their work.
I thought their analysis was excellent and satisfactorily thorough in a way that is missing in many of the other papers I've read.

Cons.

The authors highlight, in their discussion of the modified Hausdorff distance, that using the mean rather than the median statistic in this domain results in a graceful rather than steep decay. Since this was a key contribution of their work, I believe they should show that this is the case for their specific domain. I don't think the cited argument was sufficient.
Also, what is ink length? This is very vaguely described in a way that I do not understand.
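
For reference, the difference between the classic Hausdorff distance and the mean-based modified version can be sketched in a few lines of Python. This is a generic illustration on point sets, not the paper's image-based implementation, and the function names are my own.

```python
import math

def nearest_distances(a, b):
    """For each point in a, the distance to its nearest neighbor in b."""
    return [min(math.dist(p, q) for q in b) for p in a]

def hausdorff(a, b):
    """Classic Hausdorff distance: the single worst nearest-neighbor distance."""
    return max(max(nearest_distances(a, b)), max(nearest_distances(b, a)))

def modified_hausdorff(a, b):
    """Modified Hausdorff distance: average the nearest-neighbor distances."""
    def directed(u, v):
        d = nearest_distances(u, v)
        return sum(d) / len(d)
    return max(directed(a, b), directed(b, a))
```

With a = [(0, 0), (1, 0), (2, 0)] and b equal to a plus one stray point at (2, 10), the classic distance is 10 while the modified distance is 2.5. A single outlier dominates the classic measure, whereas averaging lets the score degrade gradually.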

Sunday, September 20, 2015

K-sketch: a'kinetic'sketch pad for novice animators

Paper
Davis, Richard C., Brien Colwell, and James A. Landay. "K-sketch: a'kinetic'sketch pad for novice animators." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2008.
Publication Link: http://dl.acm.org/citation.cfm?id=1357122

Summary

This paper discusses the development and assessment of an animation tool for novices and experts. The paper covers requirements gathering and discusses major themes that intersect the needs of expert animators and novices (people who want to animate but haven't yet delved into it). The work is compared with Flash and more thoroughly with Microsoft PowerPoint.

Discussion

Pros

A very thorough user requirements phase and a sizable user evaluation. The authors give a clear description of the final features that were incorporated into the tool and how and why they were selected.

Cons.

I am not sure that either comparison (Flash or Microsoft PowerPoint) was appropriate. Flash, as the authors indicate, is very complicated for a novice, while PowerPoint is not primarily an animation tool, so one would expect it to perform poorly when compared with a tool that's primarily designed for animation.

iCanDraw: using sketch recognition and corrective feedback to assist a user in drawing human faces

Paper
Dixon, Daniel, Manoj Prasad, and Tracy Hammond. "iCanDraw: using sketch recognition and corrective feedback to assist a user in drawing human faces." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2010.
Publication Link: http://dl.acm.org/citation.cfm?id=1753459

Summary

This paper applies sketch recognition methods to build an automated tool for learning how to draw. The paper describes two iterations of the work and discusses what worked and what didn't, along with a brief description of the template matching algorithms used in the interface.

Discussion

Pros

I thought it was an interesting and intuitive domain for applying sketch recognition.

Cons.

I thought the evaluations were simplistic and had a small sample size. They also do not compare their work with any existing work to form some type of baseline for their claims.

Sketch recognition algorithms for comparing complex and unpredictable shapes

Paper
Field, Martin, et al. "Sketch recognition algorithms for comparing complex and unpredictable shapes." IJCAI. 2011.
Direct Link: http://www.researchgate.net/profile/Tracy_Hammond/publication/220812144_Sketch_Recognition_Algorithms_for_Comparing_Complex_and_Unpredictable_Shapes/links/0deec529f76e510edf000000.pdf

Summary

This work describes the algorithms used in a previously discussed paper, which itself discusses a sketch-based tutoring system, Mechanix. In this work, the authors describe the algorithms used and provide some detail on the accuracy and results for a two class (match vs. no match) classification problem.

Discussion

Pros

Paper is well written and organized. Authors provide sufficient motivation for their work, as well as a good review of previous related work.

Cons.

I found it difficult to understand the use of BFS as described in the paper (even though I understand in principle how and why their approach works, this is not well communicated in the paper).

While the two-class problem breakdown works well in the intended application, it would have been interesting to understand where the system fails (false positives and false negatives) and why, and to see suggestions for future work.

Mechanix: a sketch-based tutoring and grading system for free-body diagrams

Paper
Valentine, Stephanie, et al. "Mechanix: a sketch-based tutoring and grading system for free-body diagrams." AI Magazine 34.1 (2012): 55.
Publication Link: http://www.aaai.org/ojs/index.php/aimagazine/article/view/2437

Summary

This paper describes the implementation and deployment test results for a domain specific, sketch-based tutoring system, Mechanix. The system, unlike other systems, offers machine intelligence as well as free hand sketch input as solutions to instructor provided truss problems, which are a major feature in some large introductory engineering courses where student enrollment is high, and individual feedback is important but limited.

Discussion

Pros

The paper is very well organized and provides a very thorough motivation for the work and a fairly detailed literature review. The authors conducted some experiments testing the overall function of the system to get a sense of impact on student performance, and enrollment. They also provide performance metrics on recognition accuracy of the system.

Cons.

I feel that, since the major advantage of said system is in grading and providing feedback, the authors should have spent a little more time in the paper discussing the system performance on these specific functions vs. the overall system performance. What are the problem areas / challenges in recognition that they encountered during deployment? Providing some idea on how the feedback mechanism was used, and metrics collected to evaluate its performance would be useful to read and as fodder for future work in system improvement.