sketch recognition

Saturday, December 12, 2015

Who dotted that 'i'?: context free user differentiation through pressure and tilt pen data.

Paper
Eoff, Brian David, and Tracy Hammond. "Who dotted that 'i'?: context free user differentiation through pressure and tilt pen data." Proceedings of Graphics Interface 2009. Canadian Information Processing Society, 2009.
Publication Link: http://dl.acm.org/citation.cfm?id=1555916

Summary

The paper addresses user identification in a multi-user sketch platform. Much as handwriting style distinguishes writers in the physical world, the system looks for features that distinguish one user from another independently of what is being sketched. The authors hypothesize that features such as pen tilt, pressure, and drawing speed can be used to accurately identify the author of a sketch. Their results indicate approximately 84% accuracy on a 10-user data set.

Discussion

Pros

It's a novel but simple idea. The results of their experiments suggest that the authors of strokes in multi-user sketches can be identified in real time.

Cons.

Their original motivation was multi-user sketch environments. I'm not sure their second experiment reflected that setting, or how their results would differ if users had drawn on the same sketch and the tests had been based on that.

SketchREAD: a multi-domain sketch recognition engine

Paper

Alvarado, Christine, and Randall Davis. "SketchREAD: a multi-domain sketch recognition engine." Proceedings of the 17th annual ACM symposium on User interface software and technology. ACM, 2004.
Publication Link: http://dl.acm.org/citation.cfm?id=1029637

Summary

The paper describes the design and implementation of a sketch recognition system that can be used in multiple domains to recognize hand-drawn diagrammatic sketches. SketchREAD uses a context-guided sketch recognition approach and utilizes Bayesian networks for interpretation and verification of recognized strokes.
With their approach, the system is able to detect and recover from low-level recognition errors.
They evaluated the performance of SketchREAD in comparison with a basic bottom-up recognizer on two differing domains: recognition of family tree diagrams and recognition of circuit diagrams. They also evaluated the runtime performance of their system, noting that worst-case running times occurred frequently for strokes drawn in close proximity.

Discussion

Pros

A novel use of Bayesian networks that is reminiscent of the probabilistic models used in speech recognition and NLP.

Cons.
The description of the Bayesian network algorithm wasn't very clear and could have been enhanced with a simpler diagram. Some of the inferences drawn from their figures were also very unclear.

Thursday, December 3, 2015

HMM-based efficient sketch recognition

Paper

Sezgin, Tevfik Metin, and Randall Davis. "HMM-based efficient sketch recognition." Proceedings of the 10th international conference on Intelligent user interfaces. ACM, 2005.
Publication Link: http://dl.acm.org/citation.cfm?id=1040899

Summary

This work proposes a more efficient approach to encoding sketches, using Hidden Markov Models (HMMs), while maintaining high recognition accuracy. The authors base their approach on the observation that people are individually consistent in the order in which they draw the strokes of a sketch. Building on this, the authors use HMMs to encode these consistent trends and develop a fairly efficient recognition algorithm.
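To make the scoring idea concrete, here is a minimal sketch of how an HMM could score a sequence of strokes, assuming each stroke has already been reduced to a symbol ("h" horizontal, "v" vertical, "d" diagonal). The two models, their probabilities, and the names forward_likelihood and classify are all hypothetical and chosen for illustration; they are not taken from the paper, which also handles segmenting a continuous stream of strokes into objects.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) using the forward algorithm."""
    states = range(len(start))
    # Initialise with the start distribution and the first observation.
    alpha = [start[s] * emit[s].get(obs[0], 0.0) for s in states]
    for symbol in obs[1:]:
        # Propagate one step through the transitions, then emit the symbol.
        alpha = [
            sum(alpha[p] * trans[p][s] for p in states) * emit[s].get(symbol, 0.0)
            for s in states
        ]
    return sum(alpha)


def classify(obs, models):
    """Pick the model name whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical left-to-right models: one state per expected stroke.
H = {"h": 0.8, "v": 0.1, "d": 0.1}
V = {"h": 0.1, "v": 0.8, "d": 0.1}
D = {"h": 0.1, "v": 0.1, "d": 0.8}

MODELS = {
    # (start probabilities, transition matrix, per-state emission tables)
    "square": (
        [1.0, 0.0, 0.0, 0.0],
        [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1]],
        [H, V, H, V],
    ),
    "triangle": (
        [1.0, 0.0, 0.0],
        [[0, 1, 0], [0, 0, 1], [0, 0, 1]],
        [D, D, H],
    ),
}

print(classify(["h", "v", "h", "v"], MODELS))
print(classify(["d", "d", "h"], MODELS))
```

The point of the sketch is only that a consistent drawing order turns into a high likelihood under the matching model and a low one under the others.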

Discussion

Pros

A very simple but effective solution. The use of the graph algorithm for search was very elegant.

Cons.

None, although the short length of the paper led to a brevity that made it a little more difficult to comprehend.

Wednesday, December 2, 2015

LADDER, a sketching language for user interface developers

Paper

Hammond, Tracy, and Randall Davis. "LADDER, a sketching language for user interface developers." Computers & Graphics 29.4 (2005): 518-532.

Publication Link: http://www.sciencedirect.com/science/article/pii/S0097849305000865

Summary

The paper gives an overview of the implementation of a development tool for designers of sketch recognition interfaces. LADDER is described as a simple language that allows designers who are not developers to easily create a sketch recognition interface for a given domain. The major points in the LADDER implementation are how to define a shape (using a combination of constraints) and shape recognition, which applies a bottom-up approach. The recognition approach in LADDER allows domain-specific shape recognition by defining a per-domain collection of Jess rules.
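The idea of defining a shape as components plus constraints can be sketched roughly as follows. This is plain Python rather than LADDER syntax, and the arrow definition, the tolerances, and the names coincident, equal_length, ARROW, and matches are assumptions made for illustration. It only checks constraints for lines that have already been assigned to components; it does not reproduce the rule-based recognition described in the paper.

```python
import math


def coincident(p, q, tol=3.0):
    """Two points count as coincident if they are within tol units."""
    return math.dist(p, q) <= tol


def length(line):
    """Length of a line given as a pair of (x, y) endpoints."""
    return math.dist(line[0], line[1])


def equal_length(a, b, tol=0.2):
    """Lengths count as equal if they differ by at most tol (relative)."""
    return abs(length(a) - length(b)) <= tol * max(length(a), length(b))


# Hypothetical shape description: named components plus constraints on them.
ARROW = {
    "components": ["shaft", "head1", "head2"],
    "constraints": [
        lambda s: coincident(s["shaft"][1], s["head1"][0]),
        lambda s: coincident(s["shaft"][1], s["head2"][0]),
        lambda s: equal_length(s["head1"], s["head2"]),
    ],
}


def matches(shape, lines):
    """True if lines supplies every component and satisfies every constraint."""
    if set(lines) != set(shape["components"]):
        return False
    return all(constraint(lines) for constraint in shape["constraints"])


candidate = {
    "shaft": ((0, 0), (10, 0)),
    "head1": ((10, 0), (8, 2)),
    "head2": ((10, 0), (8, -2)),
}
print(matches(ARROW, candidate))
```

A description like this is what makes the approach domain independent: a new domain only needs new shape definitions, not new recognition code.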

Discussion

Pros

The tool appears to be a very useful attempt to simplify the sketch recognition interface design.

Cons.

Although the work sounds quite novel and very useful if implemented correctly, I'm concerned about its effectiveness because of its generic nature. The authors do provide some base-level domain descriptors that may help with this and other challenges, but it still feels like a complicated solution.

Recognizing sketched multistroke primitives

Paper
Hammond, Tracy, and Brandon Paulson. "Recognizing sketched multistroke primitives." ACM Transactions on Interactive Intelligent Systems (TiiS) 1.1 (2011): 4.
Publication Link: http://dl.acm.org/citation.cfm?id=2030369

Summary

The motivation behind this paper is to develop an algorithm that handles one of the major complications in the way humans draw sketches. The authors use the term "multistroke" to describe the process of modifying a sketch using an additional stroke. This, they highlight, causes a number of issues for high-level recognizers, which typically identify the two strokes as independent sketches.

The authors propose a novel technique that groups strokes together using a graph-building step and a linear search algorithm, which identifies potential objects as composite strokes formed by strongly connected components.
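The grouping step can be illustrated with a minimal sketch, assuming each stroke is a node and two strokes are linked whenever any pair of their endpoints lies within a distance threshold. Because that relation is symmetric, the strongly connected components reduce to ordinary connected components, found here with a depth-first search. The threshold and the function names are hypothetical, and the authors' actual graph construction may differ.

```python
import math


def endpoints(stroke):
    """First and last point of a stroke given as a list of (x, y) points."""
    return stroke[0], stroke[-1]


def build_graph(strokes, threshold):
    """Link stroke indices whose endpoints come within threshold of each other."""
    graph = {i: set() for i in range(len(strokes))}
    for i in range(len(strokes)):
        for j in range(i + 1, len(strokes)):
            if any(
                math.dist(p, q) <= threshold
                for p in endpoints(strokes[i])
                for q in endpoints(strokes[j])
            ):
                graph[i].add(j)
                graph[j].add(i)
    return graph


def group_strokes(strokes, threshold):
    """Return lists of stroke indices, one list per connected component."""
    graph = build_graph(strokes, threshold)
    seen, groups = set(), []
    for start in range(len(strokes)):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:  # iterative depth-first search
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(component))
    return groups


strokes = [
    [(0, 0), (10, 0)],      # horizontal line
    [(10, 1), (10, 10)],    # vertical line starting near the first one's end
    [(50, 50), (60, 60)],   # unrelated stroke far away
]
print(group_strokes(strokes, threshold=2.0))
```

The search itself is linear in the number of nodes and edges; the pairwise endpoint comparison in this sketch is quadratic in the number of strokes and is kept only for simplicity.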

Discussion

Pros

Their approach to using strongly connected components is very interesting and creative. I also liked that the graph-building and search algorithm was made suitable for real-time applications.

Cons.

What would the impact of adding time to their algorithm be? How much cost is associated with using a time threshold in cases where a person goes back much later to add additional strokes? Perhaps combining time and proximity might help solve the arrow issue?

Tuesday, November 17, 2015

Recognizing text through sound alone.

Paper
Li, Wenzhe, and Tracy Anne Hammond. "Recognizing text through sound alone." Twenty-Fifth AAAI Conference on Artificial Intelligence. 2011.
Direct Link: http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/download/3791/4119

Summary

This paper introduces a novel approach to sketch recognition using the sound profile of a sketch drawn by scratching on a surface. After pre-processing, the authors combine time-domain features (mean amplitude) with frequency-domain features (mel-frequency cepstral coefficients, MFCCs) from the sound profile to achieve 80% accuracy in recognizing letters of the alphabet sketched in a constrained manner.

Discussion

Pros

The work is quite novel. I believe Google X came up with a somewhat similar idea, using similar properties of surface interaction as input for Android.

Cons.

The input is constrained to a given surface and a given user.

The input is also constrained to letters drawn in a specific way.

Some of these constraints can be overcome. The authors did not report performance without these constraints, so the reader cannot tell how much of the accuracy is owed to them.

Monday, November 16, 2015

PaleoSketch: accurate primitive sketch recognition and beautification

Paper
Paulson, Brandon, and Tracy Hammond. "PaleoSketch: accurate primitive sketch recognition and beautification." Proceedings of the 13th international conference on Intelligent user interfaces. ACM, 2008.
Publication Link: http://dl.acm.org/citation.cfm?id=1378775

Summary

The work presents a primitive sketch recognition and beautification system known as PaleoSketch. The idea behind PaleoSketch is to recognize sketches with a bottom-up approach, identifying low-level primitive shapes as components that combine to form a recognizable high-level shape. The second stage of this system is to return a beautified version of the recognized shape.
To achieve this, they introduce two new features in the pre-recognition stage: the normalized distance between direction extremes (NDDE) and the direction change ratio (DCR). The former is the stroke length between the point with the highest direction value (i.e., derived from dy/dx) and the point with the lowest, normalized by the total stroke length. This feature separates curved shapes, which have high NDDE values, from polylines, which have lower NDDE values. The latter, DCR, is the maximum change in direction divided by the average change in direction. This value is higher for a polyline, whereas curves have a much lower value in comparison.
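The two features can be sketched in a few lines, assuming a stroke is a list of (x, y) points and that the direction of each segment is the angle returned by atan2, unwrapped so it does not jump by a full turn. The function names are my own, and details such as smoothing or trimming the stroke ends before computing DCR are left out, so this is an approximation of the idea rather than the authors' implementation.

```python
import math


def directions(points):
    """Unwrapped direction angle (radians) of each segment of the stroke."""
    raw = [
        math.atan2(y2 - y1, x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]
    unwrapped = [raw[0]]
    for angle in raw[1:]:
        # Shift by multiples of 2*pi so consecutive angles stay close.
        while angle - unwrapped[-1] > math.pi:
            angle -= 2 * math.pi
        while angle - unwrapped[-1] < -math.pi:
            angle += 2 * math.pi
        unwrapped.append(angle)
    return unwrapped


def cumulative_lengths(points):
    """Stroke length from the first point up to each point."""
    lengths = [0.0]
    for p, q in zip(points, points[1:]):
        lengths.append(lengths[-1] + math.dist(p, q))
    return lengths


def ndde(points):
    """Stroke length between the direction extremes over total stroke length."""
    dirs = directions(points)
    lengths = cumulative_lengths(points)
    hi = max(range(len(dirs)), key=dirs.__getitem__)
    lo = min(range(len(dirs)), key=dirs.__getitem__)
    return abs(lengths[hi] - lengths[lo]) / lengths[-1]


def dcr(points):
    """Maximum change in direction divided by the average change."""
    dirs = directions(points)
    changes = [abs(b - a) for a, b in zip(dirs, dirs[1:])]
    mean = sum(changes) / len(changes)
    return max(changes) / mean if mean > 0 else 0.0


# A half circle (curve) versus an L-shaped polyline with one sharp corner.
arc = [(math.cos(i * math.pi / 40), math.sin(i * math.pi / 40)) for i in range(41)]
polyline = [(i, 0) for i in range(11)] + [(10, j) for j in range(1, 11)]

print(ndde(arc), dcr(arc))
print(ndde(polyline), dcr(polyline))
```

On these two synthetic strokes the arc gives a high NDDE and a DCR close to 1, while the polyline gives a lower NDDE and a much larger DCR, which matches the behaviour described above.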

Discussion

Pros

The work is very thorough in presenting the details involved in the implementation.
They introduce two novel features for sketch recognition.

Cons.

A lot of thresholds are used, all based on training data, which seems like a lot of tuning. I would like to see how their results change as these different parameters are adjusted.