[paper] Perception and Painting: A Search for Effective, Engaging Visualizations

Christopher G. Healey and James T. Enns  [PDF]

Scientific visualization represents information as images that allow us to explore, discover, analyze, and validate large collections of data. Much of the research in this area is dedicated to the design of effective visualizations that support specific analysis needs. Recently, we have become interested in a new idea: Is a visualization beautiful? Can a visualization be considered a work of art?

One might expect answers to these questions to vary widely depending on the individual and their interpretation of what it means to be artistic. We believe that the issues of effectiveness and aesthetics may not be as independent as they might seem at first glance, however. Much can be learned from a study of two related disciplines: human psychophysics, and art theory and history. Perception teaches us how we “see” the world around us. Art history shows us how artistic masters captured our attention by designing works that evoke an emotional response. The common interest in visual attention provides an important bridge between these domains. We are using this bridge to produce visualizations that are both effective and engaging. This article describes our research, and discusses some of the lessons we have learned along the way.

Multidimensional Visualization

Work in our laboratory has studied various issues in scientific visualization for much of the last ten years. A large part of our effort has focused on multidimensional visualization, the need to visualize multiple layers of overlapping information simultaneously in a common display or image. We often divide this problem into two steps: (1) the design of a data-feature mapping M, a function that defines visual features (e.g., color, texture, or motion) to represent the data, and (2) an analysis of a viewer’s interpretation of the images M produces. An effective M generates visualizations that allow viewers to rapidly, accurately, and effortlessly explore their data.

One promising technique we have discovered is the use of results from human perception to predict the performance of a particular M. The low-level visual system identifies certain properties of what we see very quickly, often in only a few tenths of a second or less. Perhaps more importantly, this ability is display size insensitive; visual tasks are completed in a fixed length of time that is independent of the amount of information being displayed. Obviously, these findings are very attractive in a multidimensional visualization context. Different visual features can be combined to represent multiple data attributes. Large numbers of these “multidimensional data elements” can be packed into an image. Sequences of images are then rapidly analyzed by a viewer in a movie-like fashion.



Figure 1: Two examples of visualizing weather conditions: (a) traditional visualizations for each attribute composited into a single image; (b) simulated brush strokes that vary their color and texture to visualize the data


Fig. 1 shows two example visualizations of multidimensional weather data. The first image was constructed by taking traditional visualizations of each attribute, then compositing them together. Hue represents temperature (yellow for hot, green for cold), luminance represents pressure (bright for high, dark for low), directed contours represent wind direction, and Doppler radar traces represent precipitation. The second image was built using simulated brush strokes that vary their perceptual color and texture properties to visualize the data. Here, color represents temperature (bright pink for hot, dark green for cold), density represents pressure (denser for lower pressure), stroke orientation represents wind direction, and size represents precipitation (larger strokes for more rainfall). Although viewers often gravitate towards the first image due to its familiarity, any attempt to perform real analysis tasks leads to a rapid appreciation of the careful selection of colors and textures used in the second image. Experiments showed that viewers prefer the second image for the vast majority of the tasks we tested.

The use of perceptual guidelines can dramatically increase the amount of information we can visualize. We cannot take advantage of these strengths with an ad-hoc choice of M, however. Certain combinations of visual features actively mask information by interfering with our ability to see important properties of an image. A key goal, therefore, is to build guidelines on how to design effective visualizations, and to present these findings in a way that makes them accessible to other visualization researchers and practitioners.

An image that is seen as interesting or beautiful can encourage viewers to study it in detail.

Nonphotorealistic Visualization

To pursue this idea, we explored in two directions:

  • nonphotorealistic rendering in computer graphics, and
  • art history and art theory discussions of known painterly styles.

We observed that many of the painterly styles we discovered seemed to have a close correspondence to visual features from our perceptual visualizations. For example, color and lighting in Impressionism have a direct relationship to the use of hue and luminance in visualization. Other styles like path, density, and length have partners like orientation, contrast, and size in perception. This suggested the following strategy to produce a visualization that is both effective and aesthetic:

  1. Produce a data-feature mapping M that uses the perceptual color and texture patterns that best represent a particular dataset and associated analysis tasks.
  2. Swap each visual feature in M with its corresponding painterly style.
  3. M now defines a mapping from data to painterly styles that control the visual appearance of computer-generated brush strokes; apply this mapping to produce a painted representation of the underlying dataset.
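To make this strategy concrete, here is a minimal sketch of steps 1 and 2 as a pair of lookup tables, using the feature-style correspondences named above; the code is illustrative, not the authors’ implementation:

    # A minimal sketch of steps 1-2 (illustrative names, not the authors' code).

    # Step 1: a perceptual data-feature mapping M: data attribute -> visual feature.
    M = {
        "temperature":    "hue",
        "pressure":       "luminance",
        "wind_direction": "orientation",
        "precipitation":  "size",
    }

    # Correspondences between perceptual features and painterly styles,
    # as described above (hue-color, luminance-lighting, orientation-path, ...).
    FEATURE_TO_STYLE = {
        "hue":         "color",
        "luminance":   "lighting",
        "orientation": "path",
        "contrast":    "density",
        "size":        "length",
    }

    # Step 2: swap each visual feature in M for its painterly partner.
    M_painterly = {attr: FEATURE_TO_STYLE[feat] for attr, feat in M.items()}
    print(M_painterly)
    # {'temperature': 'color', 'pressure': 'lighting',
    #  'wind_direction': 'path', 'precipitation': 'length'}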

Aesthetics

Although our initial experiments showed that our painterly visualizations are effective, we still had no evidence of their aesthetic merit. We ran a new set of experiments designed to investigate this property. These experiments studied three important questions:

  1. How artistic do viewers judge our painterly visualizations, relative to paintings by artistic masters?
  2. Can we identify any fundamental emotional factors that predict when viewers will perceive an image as artistic?
  3. Can we categorize individual viewers as preferring different types of art (e.g., realism or abstractionism), and how do these preferences impact the emotional responses that predict artistic rankings?



Figure 5. Example displays from the aesthetic judgement experiment: (a) a painterly visualization of weather conditions; (b) a nonphotorealistic rendering of a photograph of Lake Moraine in Banff, Canada

Our experiments asked viewers to order 28 images on a scale from 1 (lowest) to 7 (highest). We presented seven images from each of four categories: master Impressionists (impressionism), master Abstractionists (abstractionism), nonphotorealistic renderings (nonphotorealism), and painterly visualizations (visualization).

An example of the painterly visualizations we tested is shown in Fig. 5a. Although real weather conditions are being visualized (temperature is represented by color, wind speed by coverage, pressure by size, and precipitation by orientation), no explanations were provided to our viewers about what was being depicted. We were careful to zoom in to a point where viewers would not interpret the image as part of a map. These images were classified as abstract in nature, since they had no obvious relationship to a real-world object or scene. They were paired against seven real paintings by master Abstractionists: one painting each by de Kooning, Johns, Malevich, Mondrian, and Pollock, and two by Kline.

via http://www.csc.ncsu.edu/faculty/healey/HTML_papers/NPV/NPV.html


Engagement Mobile Media

relevant + connected + involved

While only a small slice wants to blog, a far larger swath is eager to make friends and contacts, to exchange pictures and music, to share activities and ideas.  — 2008

Consumers can no longer be considered ‘the audience’ – they are simultaneously readers, editors and marketers, especially the younger demographics.   — IBM Institute for Business Value, 2009



How do we consume multimedia?

The mobile phone has transcended its original role as a means of communication by serving a multitude of purposes.

  • How do you currently use your mobile?
  • What do you like about it? What do you hate about it?

  • Has the mobile changed your life? If so, how?
  • Have you ever switched your provider and, if so, why?
  • What are the imperfections or disadvantages of the mobile lifestyle?
  • How do you spend your commuting time? If you use your phone while commuting, how do you use it?
  • What would you like to be able to do with your phone that you currently cannot?

Current Usage
Typical usage comprises voice conversation and short messaging. The most popular ‘nontraditional’ uses are gaming, listening to radio, multimedia messaging and Internet access.

Staying in touch with their social network is their prime concern.

Ubiquitous availability is challenged primarily by cost, imperfect coverage (“Bad reception defeats the point of a mobile”), short battery life, and losing the phone.

70 percent of UK participants have pay-as-you-go (non-contract) phones. Many chose their provider according to their social circle’s preferences in order to control and minimise costs.

Changes in lifestyle

Mobile Users’ Needs and Expectations of Future Multimedia Services [pdf]

Challenges in delivering multimedia content in mobile environments 1999 [pdf]

3 Es of social media

  • Educate
  • Engage
  • Entertain

Social Media Marketing

The act of using social influencers, social media platforms and online communities for marketing, public relations and customer service.


 

【paper】Rendering Parametric Surfaces in Pen and Ink

rendering parametric free-form surfaces in pen and ink:

This paper extends the pen-and-ink illustration algorithms of Winkenbach et al. 1994 to parametric surfaces. The main difference between this approach and the previous one is that texture and tone can no longer be defined on a planar map, but must instead account for curved surfaces. Controlled density hatching is introduced to maintain tone levels across parametric surfaces by adjusting the thickness of the strokes used to create the tone and texture. In addition, the methods for creating outline strokes and shadows are modified to work with free-form surfaces.
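As a rough illustration of the controlled density hatching idea, here is a minimal sketch, assuming tone is approximated by ink coverage (stroke width over on-surface stroke spacing); the function and parameter names are hypothetical:

    # Sketch: pick a stroke width that holds the apparent tone constant as the
    # parametric mapping stretches the hatching apart (hypothetical names).
    def stroke_width(target_tone, base_spacing, stretch):
        """Width such that width / (base_spacing * stretch) == target_tone.

        target_tone  -- desired ink coverage in [0, 1] (0 = white, 1 = solid)
        base_spacing -- distance between hatch lines in parameter space
        stretch      -- local stretch factor of the parametrization
        """
        return target_tone * base_spacing * stretch

    # Where the surface is stretched twice as much, strokes sit twice as far
    # apart, so each stroke doubles in width to preserve the tone:
    for s in (0.5, 1.0, 2.0):
        print(s, stroke_width(target_tone=0.3, base_spacing=1.0, stretch=s))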

Art and Cultural Heritage
Content-based search of art image database (1995-1999, 2002- )

reference

Georges Winkenbach and David H. Salesin. “Rendering Parametric Surfaces in Pen and Ink”. In Proceedings of ACM SIGGRAPH 1996, ACM Press / ACM SIGGRAPH, pp. 469–476, 1996. [html]

[paper] Fast Multiresolution Image Querying

Users may paint on a “canvas” of any aspect ratio. However, the painted query is internally rescaled to a square aspect ratio and searched against a database in which all images have been similarly rescaled as a preprocess.

Open question: how might a user-specified aspect ratio also be used to improve the match?

To evaluate our image querying algorithm, we collected three types of query data:

  • The first set, called “scanned queries,” were obtained by printing out small 1/2 × 1/2 thumbnails of our database images: 270 in total, of which 100 were reserved for evaluating our metric and the other 170 were used as a training set.
  • The second set, called “painted queries,” were obtained by asking 20 subjects, most of whom were first-time users of the system, to paint complete image queries, in the non-interactive mode, while looking at thumbnail-sized versions of the images they were attempting to retrieve: 270 queries in total.
  • The third set, called “memory queries,” were painted from memory, without the subject being able to see the image directly: 100 queries, used only for evaluation.

To see if painting from memory would affect retrieval time, 3 subjects were tested:

  • Each subject was then asked to paint the 10 images from the first set while looking at a thumbnail of the image, and the 10 images from the second set from memory.
  • The median query time increased from 18 seconds when the subjects were looking at the thumbnails to 22 seconds when the queries were painted from memory.

Users will typically be able to sketch all the information they know about an image in a minute or less, whether they are looking at a thumbnail or painting from memory.

  1. If the query fails to bring up the intended target within a minute or so, users will typically try adding some random details, which sometimes help in bringing up the image.
  2. If this tactic fails, users will simply give up and, in a real system, would presumably fall back on some other method of searching for the image.

benefits of painting queries interactively:

  1. the time to retrieve an image is generally reduced because the user simply paints until the target image appears, rather than painting until the query image seems finished.
  2. the interactive mode subtly helps “train” the user to find images more efficiently, because the application is always providing feedback about the relative effectiveness of an unfinished query while it is being painted.

Multiresolution Query by Image Content

The key to the algorithm is the establishment of an effective and efficient metric capable of computing the distance between a query image Q and a potential target image T. The chosen metric uses the YIQ color space and the Haar wavelet.


Wavelet decomposition:

  • a few coefficients provide a good approximation of the original image, retaining information from existing edges;
  • it is relatively invariant to resolution changes;
  • it is fast to compute, running in linear time in the size of the image;
  • it provides spatial localization of the frequencies.

For each image, compute the Haar wavelet transform, then truncate and quantize its coefficients. The remaining coefficients represent the image signature.
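A minimal sketch of this signature computation, using the standard 2-D Haar decomposition; the paper’s exact normalization and weighting details differ, and the overall average coefficient, which the paper handles separately, is not treated specially here:

    import numpy as np

    def haar_1d(v):
        """Full 1-D Haar transform (averages left, differences right)."""
        v = v.astype(float).copy()
        n = len(v)
        while n > 1:
            half = n // 2
            avg = (v[0:n:2] + v[1:n:2]) / 2.0   # averages (normalization simplified)
            dif = (v[0:n:2] - v[1:n:2]) / 2.0   # differences
            v[:half], v[half:n] = avg, dif
            n = half
        return v

    def haar_2d(img):
        """Standard 2-D decomposition: transform every row, then every column."""
        w = np.apply_along_axis(haar_1d, 1, img.astype(float))
        return np.apply_along_axis(haar_1d, 0, w)

    def signature(img, m=60):
        """Keep the m largest-magnitude coefficients, quantized to +/-1."""
        w = haar_2d(img)
        thresh = np.partition(np.abs(w).ravel(), -m)[-m]
        keep = np.abs(w) >= thresh
        sig = np.zeros(w.shape, dtype=np.int8)
        sig[keep & (w > 0)] = 1
        sig[keep & (w < 0)] = -1
        return sig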


Figure: a painting; its truncated and quantized wavelet decomposition with 2000 coefficients (Y color channel); and the actual decomposition used, with 60 coefficients (Y color channel).

Improvement:

Use a low-resolution database, and also use a low-resolution version of the image provided for querying. This speeds up the process substantially and gives essentially the same results, since the continuous wavelet transform is invariant under change of scale and almost all of the largest m coefficients are located in the low resolution.

We implemented an estimated perceptual measure of approximation between the query image and the images in the database.

  • We also tried another Haar decomposition, which is faster than the one used by Salesin et al. The results were good, despite using the same weights as the Haar decomposition initially used.

Stack Overflow: I implemented a very reliable algorithm for this called Fast Multiresolution Image Querying. My (ancient, unmaintained) code for that is here.

What Fast Multiresolution Image Querying does is:

  • Split the image into 3 pieces based on the YIQ colorspace (better for matching differences than RGB).
  • Then the image is essentially compressed using a wavelet algorithm until only the most prominent features from each colorspace are available.
  • These points are stored in a data structure.
  • Query images go through the same process, and the prominent features in the query image are matched against those in the stored database.
  • The more matches, the more likely the images are similar.

The algorithm is often used for “query by sketch” functionality. My software only allowed entering query images via URL, so there was no user interface. However, I found it worked exceptionally well for matching thumbnails to the large version of that image.

Much more impressive than my software is retrievr which lets you try out the FMIQ algorithm using Flickr images as the source. Very cool! Try it out via sketch or using a source image, and you can see how well it works.

http://cnx.org/content/m11694/latest/?collection=col10223/latest

Mona Lisa (monalisa.jpg)

A person has a coarse-scale idea of what the Mona Lisa, for instance, looks like. This information should be fairly useful for finding an actual image of the Mona Lisa, but given current techniques, searches for visual data break down as effective strategies when the database size increases to even a small fraction of the number of images on the World Wide Web.

Our algorithm should also be well suited to matching coarse-scale versions of images to high detail versions of the same image. Users should be able to sketch an image in a simple drawing application where a lot of detail is not easy to add to the query image. They should also be able to enter images that have been digitized by the use of a scanner, which we assume introduces blurriness and additional noise such as scratches, dust, etc, to the extent that they would find it highly useful to search for a higher-resolution version of the image online.

Ideally, we would also like our algorithm to be able to handle affine transformations, such as translation, rotation, and scaling. It is unreasonable to expect a user to be able to draw parts of an image in exactly the same region that they appear in the original image. While these three transformations are all important components of an image querying system, we made the decision to focus on translation because it seems like the most likely type of error that a user would make.

The primary drawback of FMIQ is that the approach is ineffective at detecting shifts of an image, since the separable discrete wavelet basis is not shift-invariant. Therefore, we propose the use of the complex discrete wavelet basis, which possesses a high degree of shift-invariance in its magnitude. When coupled appropriately with the two-dimensional Discrete Fourier Transform, the two-dimensional Complex Discrete Wavelet Transform allows us to match shifted versions of an image with a significantly higher degree of certainty than does the approach of Jacobs et al.
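A quick, illustrative check of the shift-sensitivity claim, reusing the signature() sketch above; with the separable Haar basis, a one-pixel translation of the same image already changes many of the ±1 feature points:

    # Reusing haar_2d()/signature() from the earlier sketch.
    import numpy as np

    rng = np.random.default_rng(0)
    img = rng.random((128, 128))                 # stand-in test image
    shifted = np.roll(img, shift=1, axis=1)      # translate one pixel right

    s0, s1 = signature(img), signature(shifted)
    agree = int(np.sum((s0 == s1) & (s0 != 0)))  # surviving feature points
    print("feature points still matching after a 1-pixel shift:", agree)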

The signatures are computed as follows:

  1. Compute the discrete wavelet transform of the image.
  2. Set all but the highest magnitude wavelet coefficients to 0.
  3. Of the remaining coefficients, quantize the positive coefficients to +1 and the negative ones to –1.
  4. These +1’s and –1’s correspond to the feature points in an image, and basically characterize the image structure. Jacobs et al. concluded, after some experimentation, that on their database, considering the top 60 magnitude coefficients worked well for scanned images, while the top 40 coefficients gave best results for hand-drawn images.
  5. The signatures in our implementation were compared using the generic L1 norm of the difference between signature matrices. Jacobs et al. use the non-intuitive “Lq” norm, which somehow weights the coefficients corresponding to different scales differently. This idea definitely carries some merit, but Jacobs et al. do not provide a very good explanation of this scheme, and we don’t believe that it will improve the performance of their querying algorithm significantly.
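A sketch of how signatures might then be scored, under these notes’ plain-L1 reading of step 5 (helper names are hypothetical; the “Lq” weighting of Jacobs et al. is not reproduced):

    import numpy as np

    def l1_distance(sig_q, sig_t):
        """Plain L1 norm between two 0/+1/-1 signature matrices (step 5)."""
        return int(np.abs(sig_q.astype(int) - sig_t.astype(int)).sum())

    def shortlist(sig_q, db_sigs, k=10):
        """Indices of the k database signatures closest to the query."""
        dists = [l1_distance(sig_q, s) for s in db_sigs]
        return sorted(range(len(dists)), key=dists.__getitem__)[:k]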

【paper】Detecting and Sketching the Common

 

Given a few images containing a common object, detect the common object and provide a simple and compact visual representation of that object, depicted by a binary sketch.

hand-drawings as model

hand-drawings are simpler, less informative 

class variability: variations among instances within an object class

Chamfer matching methods can detect shapes in cluttered images, but they need a large number of templates to handle shape variations, e.g. 1000, and are prone to produce rather high false-positive rates

a powerful point-matching method based on Integer Quadratic Programming exists, but it has high computational complexity

Besides, [1] uses real images as models, so it is unclear how it would perform when given simpler, less informative hand-drawings.

[2]  based on edge patches

Contour Segment Network

  • dealing with highly cluttered images,
  • allowing intra-class shape variations and large scale changes,
  • working from a single example,
  • being robust to broken edges, and
  • being computationally efficient

brittleness of edge detection – contour is often broken into several edgel-chains

segment the contour chains of the model, giving a set of contour segment chains along the outlines of the object

retains the functionality of pure shape matchers, which take a clean shape as input, while supporting matching to cluttered test images

The method simply decomposes the hand-drawing into PAS, then uses these PAS for the Hough voting stage, and the hand-drawing itself for the shape matching stage.

representative shape context

  • shape context of which point should represent the image?
  • pixel density based sampling – promote point with higher or lower density?
  • a uniformly sampled shape context of the shape may contain redundant information

  • Mori et al. [3] tested the representative shape context method on the Snodgrass and Vanderwart line drawings.

    • Queries were distorted versions of the original
    • Embed objects into some clutter:

      • find the outline of the object, construct a binary mask for it, and use logical operations (AND/OR) to copy the clutter around the object.

      • Finding the outline of objects is done using a method similar to flood-fill.
    • Pseudocode for original Representative Shape Context
    • PRE-PROCESSING:
        % Compute shape contexts for known shapes
      PRUNING:
        SC_query = shape contexts for r random points
        foreach known shape S_i
          for j = 1 : r
            dist(S_query, S_i) += min_u( χ²(SC_query_j, SC_i_u) )
        % Sort dist and truncate to return a shortlist
A query of a hand-drawn shape is successful if the corresponding known shape is included in the set of retrieved candidate shapes.
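A runnable sketch of this pruning stage, assuming shape contexts are precomputed histograms and using the standard χ² histogram distance (all names are illustrative):

    import numpy as np

    def chi2(h1, h2, eps=1e-10):
        """Chi-square distance between two shape-context histograms."""
        return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

    def prune(query_scs, known_shapes, shortlist_size=10):
        """query_scs: r representative shape contexts (r x bins array).
        known_shapes: list of (points x bins) histogram arrays, one per shape.
        Returns indices of the best-matching known shapes (the shortlist)."""
        dists = []
        for shape_scs in known_shapes:
            d = sum(min(chi2(q, sc) for sc in shape_scs) for q in query_scs)
            dists.append(d)
        return sorted(range(len(dists)), key=dists.__getitem__)[:shortlist_size]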

 

‘Shape Context and Chamfer Matching in Cluttered Scenes’ – only a single template shape

reference

From Images to Shape Models for Object Detection [pdf]

Hierarchical Matching of Deformable Shapes [pdf]

[3] Mori, G., Belongie, S., & Malik, J., (2001) Shape Contexts Enable Efficient Retrieval of Similar Shapes, CVPR.[pdf]

[4] Recognizing hand-drawn images using shape context [pdf]

Shape Template Scaling and Rotation

When the user draws the sketch that will be used as a template, it is in an arbitrary scale and, in general, has an unknown relation with the scale of the objects it has to match.

If we cover the image with a coordinate system (x, y), each interesting object can be identified by its minimum enclosing rectangle (MER), with sides parallel to the coordinate axes and lower-left and upper-right corners {(x1, y1), (x2, y2)}. We consider the aspect ratio of the rectangle:

p = (y2 - y1)/(x2 - x1)

The sketch is similarly enclosed in its MER, with extrema {(X1, Y1), (X2, Y2)}, which has an aspect ratio:

P = (Y2 - Y1)/(X2 - X1)

We can assume that the user, while making a query, draws an object approximately with the same aspect ratio of the object he wants to retrieve.

  • For this reason, we can mark as nonmatched all those objects in the image whose aspect ratio is not such that:

1/k < P/p < k (where k is a fixed threshold)

  • All the interesting rectangles that pass this sieve are candidates for matching.
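A minimal sketch of this sieve (the MER layout and the threshold k = 2 used later follow the text; the helper name is hypothetical):

    def passes_sieve(sketch_mer, object_mer, k=2.0):
        """Keep an object only if 1/k < P/p < k (MERs as ((x1, y1), (x2, y2)))."""
        (X1, Y1), (X2, Y2) = sketch_mer
        (x1, y1), (x2, y2) = object_mer
        P = (Y2 - Y1) / (X2 - X1)   # sketch aspect ratio
        p = (y2 - y1) / (x2 - x1)   # object aspect ratio
        return 1.0 / k < P / p < k

    # A tall sketch (P = 2) rejects a wide object (p = 0.25): P/p = 8 > k.
    print(passes_sieve(((0, 0), (1, 2)), ((0, 0), (4, 1))))   # False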

To speed up this checking, aspect ratios are organized into a binary tree index structure.

  • Each node of the tree includes pointers to image rectangles with that aspect ratio.
  • Matching is improved if we normalize the sizes of both the template in the sketch and the shape in the image.


 

In practice, it is almost impossible for the user to reproduce object mutual orientations exactly as they are in the searched image.

To cope with this inherent imprecision of the user query, given an object oi, its orientation with respect to oj was evaluated by considering the position of the oi centroid with respect to the oj boundaries.

In the very general case of sketches composed of multiple templates, a candidate image is retrieved if and only if:
1) it has two—or more—areas of interest in the same spatial relationships as the templates drawn on the screen;
2) the shapes contained in the areas of interest match the templates of the sketch within a certain degree.

Elastic matching is applied only to images that pass a composite filtering mechanism, based on spatial relationships matching (for multiple templates) and aspect ratio checking (for each template).

  • A threshold k = 2 has been used for the aspect ratio filtering.
  • The average number of steps of the deformation process depends on how similar the image and the sketch shapes are.
  • After 20 steps, the match parameter M is compared with a fixed threshold.
  • The neural network that derives the similarity ratings was a three-layered 5–12–1 back-propagation net.

Figure 1: different steps of the deformation process of a sketched template roughly representing a horse, over one of the two horse shapes in the blurred edge image. Graphs on the right report the values of strain energy, bend energy, and match during the deformation process.

The effects of increasing the values of a and b during the deformation process:

  • The template starts to warp in a somewhat irregular manner, in order to adjust itself to the horse boundary.
  • Deformations that would require too large an expense of strain and bend energy, such as the adaptation to the rider contour or to the horse legs, are not exploited.
  • In the final steps, higher values of a and b force the template to regularize its deformation over the horse shape and, as a consequence, the values of strain and bend energy decrease.

After a template has reached convergence over an image shape, we need to measure how similar the two are.

Similarity is a fuzzy concept, and to measure it we need to take into account a number of things:

  1. A first thing to be taken into account is the degree of overlapping M between the deformed template and the gradient of the image.
  2. Another factor to be considered is how much the template had to warp to achieve that match in terms of strain energy S and bend energy B.
  3. Parameters S, B, and M alone are not enough to operate a good discrimination between different shapes:
    • the values of S and B depend somewhat on the nature of the template shape:

      For each example, the template, the original image, and the original image with the deformed template superimposed are shown.

      It can be noticed that the deformation of the horse template over the horse shape image (Fig. 2a) is characterized by values of S and B which are nearly the same as those corresponding to the deformation of the circular template over the coffee-pot image (Fig. 2c).

    • A good match of a complex shape can require high values for S and B, while a simple shape can reach a good match with very low values of elastic deformation energy.
    • A reliable solution is to consider a measure of the template shape complexity, in addition to the parameters of the deformation process.
      • the complexity of the template is measured as the number N of zeroes of the curvature function associated with its contour.
      • When N is low, as in the case of the circular template, we expect low values of S and B for a correct deformation,
      • while if N is high, as in the case of the horse template, we also accept as good deformations values of S and B that would otherwise be discarded.
  4. S and B give only a quantitative measure of the template deformation,
  5. while to estimate the similarity between the template and the image shape we must also give a qualitative measure of the deformation:
    • the correlation C between the curvature function associated with the original template and that associated with the deformed one.

All these five parameters (S, B, M, N, C) are classified by a back-propagation neural network subject to appropriate training.

  • For each input array, the neural classifier gives one output value ranging from 0 to 1, which represents the similarity between the shape in the image and the shape of the template.
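As a sketch of what the classifier computes, here is a forward pass through a three-layered 5–12–1 network with sigmoid units; the weights below are random placeholders standing in for the trained ones:

    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(12, 5)), np.zeros(12)   # 5 inputs -> 12 hidden
    W2, b2 = rng.normal(size=12), 0.0                 # 12 hidden -> 1 output

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def similarity(S, B, M, N, C):
        """Similarity rating in [0, 1] from the five deformation parameters."""
        x = np.array([S, B, M, N, C], dtype=float)
        h = sigmoid(W1 @ x + b1)            # hidden layer, 12 units
        return float(sigmoid(W2 @ h + b2))  # single output unit

    print(similarity(S=0.2, B=0.1, M=0.8, N=14, C=0.9))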

 

【paper】Perceptual Distance and Effective Indexing for shape similarity

via this paper in 2000

The use of global attributes presents limitations in modelling perceptual aspects of shapes, and yields poor performance when computing similarity with partially occluded shapes.

More effective solutions have employed local features, such as edges and corners of boundary segments.

  • They are based on the partition of the shape boundary into perceptually significant tokens.
    • Petrakis and Milios have approximated shapes as a sequence of convex/concave segments between two consecutive inflection points; these are indexed using an R-tree.
    • Grosky and Mehrotra have approximated shapes as polygonal curves:
      • for each vertex, a local feature is defined by considering the internal angle at the vertex, the distance from the adjacent vertex, and the vertex coordinates.
      • A fixed number of these local features is extracted from each shape.

      • A shape is thus represented by an attributed string. 
      • Similarity between two shapes is computed as the editing distance between the two strings of boundary features.
    • Mehrotra and Gary have developed a retrieval technique known as Feature Index-Based Similar Shape Retrieval.

      • a polygonal approximation of the shape boundary is represented by an ordered finite collection of boundary features; each collection represents one segment of the shape boundary.

      • segments are defined with a fixed number of adjacent vertices
      • Boundary feature vectors are organized in a kdB-tree
      • The matching of one or more features doesn’t guarantee full shape matching 
      • once shapes with similar features are retrieved, shape similarity is checked by overlaying each retrieved shape on the query shape and evaluating the amount of overlap between them

Similarity based on polygonal approximation and Euclidean distance between boundary features has little relation to perceptual similarity and cannot be employed for generic shapes; it mainly suits the polygonal shapes of man-made objects.

Sensitivity to small deviations of feature values is critical to retrieval.

If the features at the root of two subtrees placed on the same level of the index have a very small difference, and a slightly different version of one of them is compared with both, the wrong path might be chosen, thus leading to incorrect retrieval.

To cope with both these requirements two distinct distance measures have been defined: a token distance and a shape distance.

  • The token distance is a metric distance and is used to provide a measure of the similarity between two tokens.
  • The shape distance is a nonmetric distance, defined as a combination of token distances, and is used to derive a global measure of shape similarity which fits humans’ perception.
  • Shape distance does not depend on the ordering of tokens along the curve. This can lead to the retrieval of false positives.

Indexing is performed at the token level, by exploiting the metric properties of token distance.

SHAPE INDEXING USING A MODIFIED M-TREE

Since a generic token is modelled as a point in the two-dimensional (2-D) feature space of curvature and orientation, the representation of a generic shape turns out to be a set of points in this space.

Shape tokens are stored in an M-tree index.

The M-tree organizes tokens as a hierarchical set of clusters, each of which is identified by a routing object (i.e., the center of the cluster) and a covering radius, which determines the maximum distance between the routing object and each token included in the cluster.
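The payoff of the metric token distance is triangle-inequality pruning. A minimal sketch of an M-tree-style range search over one level of clusters, with Euclidean distance in the (curvature, orientation) space standing in for the paper’s token distance:

    import math

    def token_dist(a, b):
        """Euclidean distance in (curvature, orientation) space (stand-in)."""
        return math.dist(a, b)

    class Cluster:
        def __init__(self, routing_object, covering_radius, tokens):
            self.routing_object = routing_object    # center of the cluster
            self.covering_radius = covering_radius  # max center-to-member dist
            self.tokens = tokens

    def range_search(clusters, query, radius):
        """All tokens within `radius` of `query`; whole clusters are skipped
        when the triangle inequality proves they cannot contain a match."""
        hits = []
        for c in clusters:
            if token_dist(query, c.routing_object) - c.covering_radius > radius:
                continue            # cluster ball cannot intersect query ball
            hits += [t for t in c.tokens if token_dist(query, t) <= radius]
        return hits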


【paper】Elastic Matching of User Sketches

via this paper

Retrieval by shape similarity, given a user-sketched template, is particularly challenging, owing to the difficulty of deriving a similarity measure that closely conforms to the common perception of similarity by humans.

  • Matching by shape is complicated by the fact that a shape does not have a mathematical definition that exactly matches what the user perceives as a shape.
  • Well-known distance measures commonly used in mathematics are not suitable for representing shape similarity as perceived by humans.
  • Human perception is not a mere interpretation of a retinal patch, but an active interaction between the retinal patch and a representation of our knowledge about objects.

related work

  • QVE system: evaluates the correlation between a linear sketch and edge images in the database (high correlation values require the user-drawn shape to be close to the shapes in the database)
  • shapes are represented as an ordered set of boundary features. Each boundary is coded as an ordered sequence of vertices of its polygonal approximation.

    • Features are collections of a fixed number of vertices.
    • Similarity is roughly evaluated as the distance between the boundary feature vector of the query and those associated with the target images.
    • Boundary features of objects in database images are organized into a quite complex index tree structure.

  • QBIC system:
    • shape representation based on global features such as area, circularity, eccentricity, major axis orientation and moment invariants
    • shape similarity is evaluated as the weighted Euclidean distance in a low-dimensional feature space.

There is no guarantee that our notion of perceptual closeness is mapped into topological closeness in the feature space.

Elastic matching promises to approximate human ways of perceiving similarity and to possess a remarkable robustness to shape distortion.

  • the sketch is deformed to adjust itself to the shapes of the objects in the images.
  • The match between the deformed sketch and the imaged object, as well as the elastic deformation energy spent in the warping, are used to evaluate the similarity between the sketch and the image.
  • The elastic matching is integrated with arrangements to provide scale and partial rotation invariance, and with filtering mechanisms to prune the database.

THE ELASTIC APPROACH TO SHAPE MATCHING

Given an image I, with its luminance at every point normalized to [0,1], we search for a contour with a shape similar to that of the sketched template.

in general, the image will contain no contour exactly equal to the template.

It is not just a matter of noisy images, which we can, to a limited extent, model and cope with. The image and the template can be different to begin with. This makes traditional template matching brittle.

To make a robust match even in the presence of deformations, we must allow the template to warp. The warping must balance opposing requirements:

  1. it must follow as closely as possible the edges of the image Ie;
  2. it must limit the deformation of the template, measured as an elastic deformation energy that depends only on the first and second derivatives of the deformation;
  3. to discover the similarity between the original shape of the template and the shape of the edge areas in the image, we must also set some constraints on the deformation, as in the sketch following this list.
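One plausible way to write the functional being minimized, assuming a standard first-derivative (strain) and second-derivative (bend) regularizer over the deformation field (u, v); the paper’s exact formulation may differ:

    E(u,v) = \alpha \iint \left( u_x^2 + u_y^2 + v_x^2 + v_y^2 \right) dx\,dy   % strain energy S
           + \beta  \iint \left( u_{xx}^2 + 2u_{xy}^2 + u_{yy}^2
                                + v_{xx}^2 + 2v_{xy}^2 + v_{yy}^2 \right) dx\,dy % bend energy B
           - \iint I_e(x+u,\, y+v)\, dx\,dy                                      % match M

Minimizing E pulls the template toward the edges of Ie, while the first two terms charge for the strain and bend spent in the warp; these correspond to the S, B, and M quantities used in the similarity measure discussed earlier.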

Wikipedia:

Elastic matching is one of the pattern recognition techniques in computer science.

Elastic matching (EM) is also known as deformable template, flexible matching, or nonlinear template matching.

Elastic matching can be defined as an optimization problem of two-dimensional warping specifying corresponding pixels between subjected images.

Application (signature recognition):

  • Two categories of methods and tools for the analysis of dynamic signatures are presented.
  1. The first, measure analysis, is introduced and used to show how imitations can be differentiated unambiguously from genuine examples.
  2. The second category of elastic matching of signatures is believed to follow the type of mechanism which our visual cortex might use when we examine a pair of signatures.
    • Imagine the reference signature traced onto a transparent elastic sheet. If this is then placed over the signature being examined and stretched, the two superimpose. A feature specifying the degree of similarity is then the elastic energy contained within the elastic sheet.
    • A complementary feature measures the degree to which the two signatures overlap after the stretching process, the local correlation. The elastic matching method works from these two features.
The linkages between points on two authentic signatures found by elastic matching.
The extracted circular arcs from two static authentic signatures and their linkages chosen by elastic matching.
  • Face recognition using Elastic Bunch Graph Matching (EBGM); cf. Fisherfaces

Reference

Object recognition by elastic graph matching (dynamic link matching): L. Wiskott, J.-M. Fellous, N. Krueger, and C. von der Malsburg (1997). Face recognition by elastic bunch graph matching. IEEE PAMI 19: 775–779.

C. Eckes, J. Triesch, and C. von der Malsburg (2006). Analysis of cluttered scenes using an elastic matching approach for stereo images. Neural Comput. 18(6): 1441–1471.

FBI: Face Recognition: An Introduction
