Despite the lack of a perfect representation of visual objects, supervised learning has provided a natural and successful framework for studying object recognition.

Some approaches make careful choices about how the object is represented, so as to obtain a rich and meaningful feature descriptor; others put more emphasis on the "learning" aspect of the recognition task and use more sophisticated learning techniques.

Many shape matching methods fall into one of two groups:

**Feature-based methods** extract points from the image (usually edge or corner points) and reduce the problem to point set matching.

- Capturing the structure of the shape using the medial axis transform
- Hausdorff distance
- Geometric hashing, which uses configurations of keypoints to vote for a particular shape

**Brightness-based methods** work with the intensity of the pixels and use the image itself as a feature descriptor.

**Shape Context** works well with **learning techniques**, e.g., nearest neighbor.

- PCA for face recognition
- Bayes classifier and decision tree learning for more general object recognition tasks
- Boosting for feature selection
- SVM for template matching, e.g., recognizing pedestrians
- Neural networks for digit classification

A broad class of **object recognition** problems has benefited from **statistical learning machinery**.

**The basic idea of Shape Context:**

- Take **N samples** from the edge elements on the shape.
  - The points can be on internal or external contours. They need not correspond to keypoints such as maxima of curvature or inflection points.
- Calculate the **vectors** originating from one point to all other points in the shape.
  - The vectors express the appearance of the entire shape relative to the reference point.
  - The set of vectors is a rich description.
  - As N gets large, the representation of the shape becomes exact.
- Express each vector by its **Euclidean distance** r and angle a, then **normalize r** by the median distance and measure the angle relative to the positive x-axis.
- Compute **the log of r**.
  - The histogram should distinguish more finely among differences in nearby pixels, so use a **log-polar coordinate** system.
- For each origin point, count the **number of points** that lie in a given bin.
  - This gives a coarse histogram of the relative coordinates of the remaining points.
  - The reference orientation can be absolute or relative to a given axis, depending on the problem setting.
- Each shape context is a **log-polar histogram** of the coordinates of the N-1 points measured from the origin reference point.
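
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `shape_contexts`, the bin counts (`n_r`, `n_theta`), and the radial limits (`r_inner`, `r_outer`) are assumptions chosen for the example, and the orientation is measured against the absolute positive x-axis.

```python
import math
from statistics import median


def shape_contexts(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Return one flattened log-polar histogram (n_r * n_theta bins) per point.

    The bin counts and radial limits are illustrative assumptions.
    """
    # Median of all pairwise distances, used to normalize r (scale invariance).
    pair_dists = [
        math.hypot(q[0] - p[0], q[1] - p[1])
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]
    scale = median(pair_dists)

    # Radial bins are uniform in log(r) between r_inner and r_outer.
    log_lo, log_hi = math.log(r_inner), math.log(r_outer)

    histograms = []
    for i, (px, py) in enumerate(points):
        hist = [0] * (n_r * n_theta)
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            dx, dy = qx - px, qy - py
            r = math.hypot(dx, dy) / scale
            if r == 0.0 or r >= r_outer:
                continue  # skip coincident points and points beyond the outer ring
            # Radial bin from log(r); anything closer than r_inner goes to bin 0.
            frac = (math.log(r) - log_lo) / (log_hi - log_lo)
            r_bin = max(0, min(int(frac * n_r), n_r - 1))
            # Angle relative to the positive x-axis, mapped to [0, 2*pi).
            theta = math.atan2(dy, dx) % (2.0 * math.pi)
            t_bin = min(int(theta / (2.0 * math.pi) * n_theta), n_theta - 1)
            hist[r_bin * n_theta + t_bin] += 1
        histograms.append(hist)
    return histograms


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = shape_contexts(square)
```

Because the vectors are relative to the reference point and r is divided by the median pairwise distance, translating or uniformly scaling `square` should leave the descriptors unchanged in this sketch.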

Shape Context encodes a description of the density of boundary points at various distances and angles. It may be desirable to include an **additional feature** that accounts for the local appearance of the reference point itself, such as:

- local orientation
- vectors of filter outputs
- color histograms

**Matching Shape Context**

How can we assign the sample points of shape P to correspond to those of shape Q?

- Corresponding points have very similar descriptors.
- Matching cost = weighted shape context distance + weighted local appearance dissimilarity
  - The shape context term is the **distance** between the two normalized histograms.
  - The local appearance term is the **dissimilarity** of the tangent angles.
- The dissimilarity between two shapes can be computed as the **sum of matching errors** between the corresponding points, together with a term measuring the **magnitude of the aligning transform**.
- **Modelling transform:** given a set of correspondences, estimate a transformation that maps the model into the target, e.g., **Euclidean, affine, thin plate spline**, etc.
- The correspondences must be unique (one-to-one), which makes finding them a **bipartite matching** problem.
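
The matching cost and the one-to-one assignment can be illustrated as follows. This is a hedged sketch under several assumptions: the chi-squared statistic is used as the histogram distance, `0.5 * (1 - cos(difference))` as the tangent angle dissimilarity, and `beta` is a hypothetical weight, since the notes give no value. The function names are illustrative. The assignment step is a brute-force search over permutations, swapped in only to keep the example dependency-free; for realistic N a dedicated assignment solver (for example the Hungarian method) would be needed.

```python
import math
from itertools import permutations


def chi2_distance(g, h):
    """Chi-squared distance between two histograms, each normalized to sum to 1."""
    sg, sh = sum(g) or 1, sum(h) or 1  # guard against empty histograms
    total = 0.0
    for a, b in zip(g, h):
        a, b = a / sg, b / sh
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def tangent_dissimilarity(theta_p, theta_q):
    """0 for equal tangent angles, 1 for opposite ones."""
    return 0.5 * (1.0 - math.cos(theta_p - theta_q))


def cost_matrix(hists_p, hists_q, tangents_p, tangents_q, beta=0.1):
    """Weighted sum of the shape context term and the local appearance term.

    beta is a hypothetical weight chosen for illustration.
    """
    return [
        [
            (1.0 - beta) * chi2_distance(hp, hq)
            + beta * tangent_dissimilarity(tp, tq)
            for hq, tq in zip(hists_q, tangents_q)
        ]
        for hp, tp in zip(hists_p, tangents_p)
    ]


def best_assignment(cost):
    """Brute-force one-to-one matching; only feasible for a handful of points."""
    n = len(cost)

    def total(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return list(best), total(best)


# Toy example: shape Q holds the same three descriptors as P in a different order.
hists_p = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
hists_q = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
tangents = [0.0, 0.0, 0.0]
cost = cost_matrix(hists_p, hists_q, tangents, tangents)
assignment, matching_error = best_assignment(cost)
```

Here `assignment[i]` is the index of the point in Q matched to point `i` in P, and `matching_error` is the sum of matching costs that feeds into the shape dissimilarity.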

**Classification based on a fixed shape**

The **nearest neighbor** classifier effectively weighs the information on each sample point equally. Yet usually some parts of the shape are better at telling two classes apart.

Given a dissimilarity measure, a **k-NN technique** can be used for object classification/recognition.
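
As a rough sketch of this step, k-NN only needs a callable that returns the dissimilarity between two shapes. The function name `knn_classify`, the default `k=3`, and the toy one-dimensional "shapes" below are assumptions for illustration; in practice the callable would be the shape dissimilarity described above.

```python
from collections import Counter


def knn_classify(query, labeled_shapes, dissimilarity, k=3):
    """Return the majority label among the k stored shapes closest to query.

    labeled_shapes is a list of (shape, label) pairs; dissimilarity is any
    callable taking two shapes and returning a non-negative number.
    """
    neighbors = sorted(
        labeled_shapes, key=lambda item: dissimilarity(query, item[0])
    )[:k]
    votes = Counter(label for _, label in neighbors)
    # On a tie, Counter keeps insertion order, so the nearer label wins.
    return votes.most_common(1)[0][0]


# Toy stand-in: each "shape" is a single number and the dissimilarity is |a - b|.
training = [(0.0, "a"), (0.1, "a"), (0.2, "a"), (1.0, "b"), (1.1, "b")]


def toy_dissimilarity(a, b):
    return abs(a - b)


label_near_a = knn_classify(0.05, training, toy_dissimilarity)
label_near_b = knn_classify(1.05, training, toy_dissimilarity)
```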

**Advantages**

Incorporates invariance to:

- **Translation**
- **Scale**
- **Rotation**
- **Occlusion**

**Drawbacks**

- Sensitive to local distortion or blurred edges
- Problems with cluttered backgrounds

**Application**

- Digit recognition
- Silhouette similarity based retrieval
- 3D object recognition
- Trademark retrieval

**Database for evaluation**

- **MNIST dataset** of handwritten digits: 60,000 training and 10,000 test digits
- **MPEG-7 shape silhouette:** core experiment CE-Shape-1 part B, 1,400 images in 70 shape classes, 20 images per class
- **COIL-20 database for 3D recognition:** 20 common household objects, each turned in 5-degree steps for a total of 72 views per object
- **Trademark database:** 300 different real-world trademarks