Gap between 【shape matching】 and 【object detection】

Classic non-rigid shape matchers produce point-to-point correspondences, but need clean pre-segmented shapes as models.

Most object detection methods train without annotated object segmentations, but localize objects in test images only up to a bounding box rather than delineating their outlines, because of

    • the nature of the proposed models, which are typically composed of rather sparse collections of contour fragments with a loose layer of spatial organization on top;
    • the difficulty of learning models from real images rather than from hand-segmented shapes.

A few researchers even go to the extreme of using individual edgels as modeling units, while others use an explicit shape model formed by continuous connected curves completely covering the object outlines.

Learn complete shape models directly from images

Match the learned model to cluttered test images automatically, thereby localizing novel class instances up to their boundaries.

Given image windows containing training instances, learn the prototypical shape of an object class, and a statistical model of intra-class deformations

The challenge:

    • to determine which contour points belong to the class boundaries,
    • while discarding background clutter and details specific to individual instances;
    • to do this when the signal-to-noise ratio is poor, since the majority of contour points do not lie on the object;
    • to cope with intra-class variability: the shape of the object boundary varies across instances.

Matching the learned model to test images:

  1. Automatic initialization of the object's location and scale by a Hough-style voting scheme. This works even in cluttered test images where object boundaries cover only a small fraction of the contour points.
  2. The shape matcher searches only over transformations compatible with the learned, class-specific deformation model. This ensures that the output shape is similar to class members, which improves accuracy and helps avoid local minima.
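A minimal sketch of Hough-style voting for location and scale, as in step 1 above. It assumes that every image feature has already been matched to a model part through a label and that every vote has unit weight, which is a simplification. The names `model_parts`, `image_features`, `pos_bin` and `scale_bin` are hypothetical and not taken from the original method.

```python
from collections import Counter


def hough_vote(model_parts, image_features, pos_bin=10.0, scale_bin=0.25):
    """Vote for the object centre and scale (illustrative sketch only).

    model_parts:    {label: (dx, dy, size)}, offset of the part from the
                    object centre and the part size, both at model scale.
    image_features: list of (label, x, y, size) detected in the test image.
    Returns (cx, cy, scale, votes) for the strongest accumulator bin,
    or None if no feature matched a model part.
    """
    votes = Counter()
    for label, x, y, size in image_features:
        if label not in model_parts:
            continue  # feature does not correspond to any model part
        dx, dy, model_size = model_parts[label]
        scale = size / model_size            # scale implied by this match
        cx, cy = x - scale * dx, y - scale * dy  # centre implied by this match
        # Quantise the hypothesis so that consistent matches share a bin.
        key = (round(cx / pos_bin), round(cy / pos_bin), round(scale / scale_bin))
        votes[key] += 1
    if not votes:
        return None
    (bx, by, bs), count = votes.most_common(1)[0]
    return bx * pos_bin, by * pos_bin, bs * scale_bin, count
```

Clutter features cast votes too, but their votes scatter over the accumulator, while features on the object agree on one bin. That is why the peak can be found even when object boundaries cover only a small fraction of the contour points.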

Shape description and matching for modeling

  • earlier work on shape description is based on silhouettes
    • silhouettes are limited: they ignore internal contours and are difficult to extract from cluttered images
  • more recent work uses loose collections of 2D points or other 2D features
  • other work proposes more informative structures (rather than individual points) to simplify matching, e.g.
    • Shape Context (a semi-local representation): captures the spatial distribution of all other points relative to one point on the shape, and is used to establish point-to-point correspondences between shapes
    • encoding relations between all pairs of edgels, going beyond individual edgels
    • pairwise spatial relations between landmark points
    • a family of scale-invariant local shape features formed by short chains of connected contour segments, which can cleanly encode pure fragments of an object boundary
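To make the Shape Context idea concrete, here is a small sketch of a log-polar histogram computed at one point of a shape. The bin counts, the radius limits and the normalisation by mean pairwise distance are common choices assumed here for illustration; the function name `shape_context` and its parameters are hypothetical.

```python
import math


def shape_context(points, index, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of all other points relative to points[index].

    Assumes at least two distinct points. Radii are divided by the mean
    pairwise distance, which makes the descriptor invariant to scale.
    """
    pair_dists = [math.hypot(ax - bx, ay - by)
                  for i, (ax, ay) in enumerate(points)
                  for bx, by in points[i + 1:]]
    mean_dist = sum(pair_dists) / len(pair_dists)

    # Upper edges of the log-spaced radial bins; the last edge is r_outer.
    ratio = r_outer / r_inner
    upper_edges = [r_inner * ratio ** ((k + 1) / n_r) for k in range(n_r)]

    px, py = points[index]
    hist = [[0] * n_theta for _ in range(n_r)]
    for i, (x, y) in enumerate(points):
        if i == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy) / mean_dist
        if r > r_outer:
            continue  # too far away to contribute to this descriptor
        # Points closer than r_inner fall into the innermost bin.
        r_bin = next((k for k, edge in enumerate(upper_edges) if r <= edge),
                     n_r - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
        hist[r_bin][t_bin] += 1
    return hist
```

Two points on different shapes can then be compared through their histograms (for example with a chi-squared distance), and point-to-point correspondences follow from solving an assignment problem over those costs.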

Object recognition

Using shape to recognize objects can be cast as finding correspondences between model and image features.

Accepting sub-optimal matching solutions:

    • when the shape is not deformable, or when we are not interested in recovering the deformation but only in localizing the object up to translation and scale, simple strategies can be applied:
      • Geometric Hashing
      • Hough Transform
      • exhaustive search (typically combined with Chamfer Matching or classifiers)

Localization of object classes

The overall shape model can be divided into:

(a) a global geometric organization of edge fragments

The global organization can handle deformations.

      • Regularized Thin Plate Splines are a generic deformation model that can quantify the dissimilarity between any two shapes, but cannot model shape variations within a specific class.
      • Alternatively, learn the intra-class deformation modes of an elastic material from clean training shapes.
      • Active Shape Models: the shape model in novel images is constrained to vary only in ways seen during training.
      • A few principal deformation modes, accounting for most of the total variability over the training set, are learnt using PCA.
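The sketch below illustrates how a principal deformation mode could be learnt and then used to constrain a shape in the spirit of Active Shape Models. It assumes the training shapes are already aligned and stored as flat vectors `[x1, y1, ..., xn, yn]`. To stay dependency-free it extracts only the first mode, using power iteration in place of a full PCA eigendecomposition. All function names are hypothetical.

```python
import math


def first_deformation_mode(shapes, iters=200):
    """Return (mean, mode, variance) for the dominant deformation mode.

    Uses power iteration on the sample covariance (a stand-in for full PCA).
    Assumes the start vector is not orthogonal to the dominant mode.
    """
    n, dim = len(shapes), len(shapes[0])
    mean = [sum(s[k] for s in shapes) / n for k in range(dim)]
    centred = [[v - m for v, m in zip(s, mean)] for s in shapes]
    mode = [1.0 / math.sqrt(dim)] * dim
    variance = 0.0
    for _ in range(iters):
        # w = C * mode, computed without forming the covariance matrix C.
        w = [0.0] * dim
        for x in centred:
            proj = sum(a * b for a, b in zip(x, mode))
            for k in range(dim):
                w[k] += proj * x[k] / n
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:
            break  # no variation along the current direction
        mode = [a / norm for a in w]
        variance = norm  # converges to the largest eigenvalue
    return mean, mode, variance


def constrain(shape, mean, mode, variance, k=3.0):
    """Project a shape onto the mode and clamp it to +/- k standard deviations."""
    b = sum((s - m) * p for s, m, p in zip(shape, mean, mode))
    limit = k * math.sqrt(variance)
    b = max(-limit, min(limit, b))
    return [m + b * p for m, p in zip(mean, mode)]
```

Clamping the mode coefficient is what keeps a matched shape within the range of variation seen during training. A real model would keep several modes and clamp each coefficient in the same way.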

(b) an ensemble of pairwise constraints between point features

Modeling of intra-class shape deformations

For some classes of objects the local appearance contains very little information, but they are easily recognised by the shape of their contour.

Common to all methods is to define a distance measure between shapes and then to search for minima of this distance. For example, Chamfer matching defines the distance as the average distance from points on the template shape to the nearest points on the image shape. However, Chamfer matching does not cope well with clutter and shape deformations.
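A brute-force sketch of the Chamfer distance described above, with an exhaustive search over candidate translations. Practical implementations usually precompute a distance transform of the edge map instead of searching for the nearest point each time. The function names and the `offsets` parameter are hypothetical.

```python
import math


def chamfer_distance(template, image_points):
    """Average distance from each template point to its nearest image point."""
    if not template or not image_points:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for tx, ty in template:
        total += min(math.hypot(tx - ix, ty - iy) for ix, iy in image_points)
    return total / len(template)


def best_translation(template, image_points, offsets):
    """Try every candidate offset; return (distance, (dx, dy)) of the best one."""
    return min(
        (chamfer_distance([(x + dx, y + dy) for x, y in template], image_points),
         (dx, dy))
        for dx, dy in offsets
    )
```

Because the measure is one-directional (template to image), a cluttered image offers many nearby edge points almost everywhere, so low distances can occur at wrong locations. This is one way to see why clutter is a problem for Chamfer matching.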

The first step towards shape-based object detection is to extract potential object contour points from the input image, which are then compared to a shape template.

Deformable template matching provides a way of measuring the distance between two shapes, but such methods cannot be regarded as object detection methods: they assume that there are not many spurious edges. Hence they require either clean images of the objects without clutter, or some way of ensuring that the optimisation is not misled by the clutter.

Objects are assumed to have closed contours, whereas clutter edges normally do not form closed contours.

This calls for an edge map that consists only of closed contours. Pixel-wise edge detection cannot easily achieve this, but segmentation can, since the boundaries of segments are closed by construction.

