via wikipedia
The shape context is intended to be a way of describing shapes that allows for measuring shape similarity and recovering point correspondences.
The basic idea is to pick n points on the contours of a shape. For each point p_{i} on the shape, consider the n − 1 vectors obtained by connecting p_{i} to all other points. The set of all these vectors is a rich description of the shape localized at that point but is far too detailed.
The key idea is that the distribution over relative positions is a robust, compact, and highly discriminative descriptor. So, for the point p_{i}, the coarse histogram of the relative coordinates of the remaining n − 1 points,
h_{i}(k) = #{ q ≠ p_{i} : (q − p_{i}) ∈ bin(k) },
is defined to be the shape context of p_{i}.
The bins are normally taken to be uniform in log-polar space, which makes the descriptor more sensitive to the positions of nearby sample points than to those of points farther away.
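As an illustration, here is a minimal sketch (Python with NumPy) of computing such log-polar shape context histograms for a set of sampled contour points. The bin counts (5 radial × 12 angular), the inner/outer radii, and the function name are illustrative assumptions; the normalization by the mean pairwise distance anticipates the scale-invariance note below.

```python
import numpy as np

def shape_contexts(points, n_radial=5, n_angular=12, r_inner=0.125, r_outer=2.0):
    """points: (n, 2) array of contour samples. Returns (n, n_radial*n_angular)."""
    n = len(points)
    diff = points[None, :, :] - points[:, None, :]        # vectors p_i -> p_j
    dists = np.linalg.norm(diff, axis=2)
    dists /= dists[dists > 0].mean()                       # scale normalization by mean pairwise distance
    angles = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_radial + 1)
    descriptors = np.zeros((n, n_radial * n_angular))
    for i in range(n):
        mask = np.arange(n) != i                           # histogram over the remaining n - 1 points
        r_bin = np.digitize(dists[i, mask], r_edges) - 1
        a_bin = (angles[i, mask] / (2 * np.pi) * n_angular).astype(int) % n_angular
        valid = (r_bin >= 0) & (r_bin < n_radial)          # drop points outside the outermost ring
        hist = np.zeros((n_radial, n_angular))
        np.add.at(hist, (r_bin[valid], a_bin[valid]), 1)
        descriptors[i] = hist.ravel()
    return descriptors
```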
The fact that the shape context is a rich and discriminative descriptor can be seen in the figure below, in which the shape contexts of two different versions of the letter “A” are shown.
Now in order for a feature descriptor to be useful, it needs to have certain invariances. In particular, it needs to be invariant to translation, scale, small perturbations, and, depending on the application, rotation.
 Translational invariance comes naturally to shape context.
 Scale invariance is obtained by normalizing all radial distances by the mean distance between all point pairs in the shape, although the median distance can also be used.
 Shape contexts are empirically demonstrated to be robust to deformations, noise, and outliers^{[4]} using synthetic point set matching experiments.
One can provide complete rotation invariance in shape contexts.
 One way is to measure angles at each point relative to the direction of the tangent at that point (since the points are chosen on edges).
 This results in a completely rotationally invariant descriptor.
 But of course this is not always desired since some local features lose their discriminative power if not measured relative to the same frame.
 Many applications in fact prohibit rotation invariance, e.g. distinguishing a "6" from a "9".
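A hedged sketch of the tangent-relative measurement mentioned above: angles are taken relative to each point's own tangent direction instead of a fixed x-axis, which yields the rotation-invariant version of the descriptor. The tangent_angles input is assumed to come from the edge/contour extraction step; the resulting angles would be binned exactly as in the function above.

```python
import numpy as np

def relative_angles(points, tangent_angles):
    """tangent_angles: (n,) array of tangent directions, one per sample point (assumed given)."""
    diff = points[None, :, :] - points[:, None, :]
    angles = np.arctan2(diff[..., 1], diff[..., 0])
    # Subtract each point's own tangent direction before binning.
    return (angles - tangent_angles[:, None]) % (2 * np.pi)
```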
A complete system that uses shape contexts for shape matching consists of the following steps:
 Randomly select a set of points that lie on the edges of a known shape and another set of points on an unknown shape. This can be done, for example, by running a Canny edge detector and picking a random set of points from the detected edges (see the sketch below).
 Compute the shape context of each point found in step 1.
 Match each point from the known shape to a point on an unknown shape.
 To minimize the cost of matching, first choose a transformation (e.g. affine, thin plate spline, etc.) that warps the edges of the known shape to the unknown (essentially aligning the two shapes).
 Then select the point on the unknown shape that most closely corresponds to each warped point on the known shape.
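As referenced in the first step above, here is a possible sketch of the point-sampling step, assuming OpenCV is available for the Canny edge detector and that the input is a grayscale uint8 image; the thresholds and the number of sampled points are illustrative choices.

```python
import cv2
import numpy as np

def sample_edge_points(image, n_points=100, low=50, high=150, seed=0):
    edges = cv2.Canny(image, low, high)             # binary edge map
    ys, xs = np.nonzero(edges)                      # coordinates of edge pixels
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(xs), size=min(n_points, len(xs)), replace=False)
    return np.stack([xs[idx], ys[idx]], axis=1)     # (n, 2) array of (x, y) points
```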
 Consider two points p and q that have normalized K-bin histograms (i.e. shape contexts) g(k) and h(k):
 As shape contexts are distributions represented as histograms, it is natural to use the χ^{2} test statistic as the "shape context cost" of matching the two points: C_{S} = (1/2) Σ_{k=1}^{K} [g(k) − h(k)]^{2} / [g(k) + h(k)]. The values of this cost range from 0 to 1.
 An extra cost based on appearance can be added: for instance, it could be a measure of tangent angle dissimilarity (particularly useful in digit recognition), C_{A} = (1/2) ‖ (cos θ_{1}, sin θ_{1}) − (cos θ_{2}, sin θ_{2}) ‖, where θ_{1} and θ_{2} are the tangent angles at p and q.
 Now the total cost of matching the two points could be a weighted sum of the two costs: C = (1 − β)C_{S} + βC_{A}.
 For each point p_{i} on the first shape and each point q_{j} on the second shape, calculate the cost as described above and call it C_{i,j}. This is the cost matrix.
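A minimal sketch of the χ^{2} cost and the resulting cost matrix. The appearance term and the weight β are omitted for brevity, and solving the one-to-one point matching with SciPy's Hungarian-method implementation (linear_sum_assignment) is one common choice rather than something mandated by the method itself.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def chi2_cost(g, h, eps=1e-10):
    """Chi-squared cost between two normalized K-bin histograms (ranges from 0 to 1)."""
    return 0.5 * np.sum((g - h) ** 2 / (g + h + eps))

def cost_matrix(descs_p, descs_q):
    """C[i, j] = cost of matching point i on shape P to point j on shape Q."""
    # Normalize histograms so each sums to 1 before comparing.
    p = descs_p / descs_p.sum(axis=1, keepdims=True)
    q = descs_q / descs_q.sum(axis=1, keepdims=True)
    return np.array([[chi2_cost(pi, qj) for qj in q] for pi in p])

# One-to-one matching that minimizes the total cost:
# rows, cols = linear_sum_assignment(cost_matrix(descs_p, descs_q))
```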
 Calculate the "shape distance" between the two shapes from the matched point pairs.
 Use a weighted sum of the shape context distance, the image appearance distance, and the bending energy (a measure of how much transformation is required to bring the two shapes into alignment).
 Given the correspondences between points, a transformation can be estimated to map any point from one shape to the other. There are several choices for this transformation:

Affine
The affine model is a standard choice: T(p) = Ap + o, with a matrix A and a translation vector o.
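A hedged sketch of the affine case: given point correspondences, A and o can be estimated by ordinary least squares. The function name and interface are illustrative, not part of the method.

```python
import numpy as np

def fit_affine(src, dst):
    """src, dst: (n, 2) arrays of corresponding points. Returns (A, o) with dst ≈ A @ src.T + o."""
    n = len(src)
    X = np.hstack([src, np.ones((n, 1))])           # homogeneous source coordinates
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)
    A, o = params[:2].T, params[2]                  # 2x2 matrix and translation vector
    return A, o

# Apply the estimated transform: T(p) = A @ p + o for each point p.
```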
Thin plate spline
The thin plate spline (TPS) model is the most widely used model for transformations when working with shape contexts. A 2D transformation can be separated into two TPS functions to model the coordinate transform: T(x, y) = (f_{x}(x, y), f_{y}(x, y)).

 Now, a shape distance between two shapes P and Q can be defined. This distance is a weighted sum of three potential terms:
 Shape context distance: this is the symmetric sum of shape context matching costs over best matching points:
D_{sc}(P, Q) = (1/n) Σ_{p∈P} argmin_{q∈Q} C(p, T(q)) + (1/m) Σ_{q∈Q} argmin_{p∈P} C(p, T(q)),
where n and m are the numbers of sample points on P and Q, and T(·) is the estimated TPS transform that maps the points in Q to those in P.
 Appearance cost: After establishing image correspondences and properly warping one image to match the other, one can define an appearance cost as the sum of squared brightness differences in Gaussian windows around corresponding image points:
D_{ac}(P, Q) = (1/n) Σ_{i=1}^{n} Σ_{Δ∈Z^{2}} G(Δ) [I_{P}(p_{i} + Δ) − I_{Q}(T(q_{π(i)}) + Δ)]^{2},
where I_{P} and I_{Q} are the gray-level images (I_{Q} is the image after warping), π(i) is the index of the point on Q matched to p_{i}, and G is a standard Gaussian windowing function.
 Transformation cost: The final cost measures how much transformation is necessary to bring the two images into alignment. In the case of TPS, it is assigned to be the bending energy.
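A minimal sketch of the shape context distance term, assuming a cost matrix C whose entries already compare the points of P against the TPS-warped points of Q; the appearance and bending-energy terms would be added with weights chosen per application.

```python
import numpy as np

def shape_context_distance(C):
    """C: (n, m) matrix of matching costs between points of P and warped points of Q."""
    forward = C.min(axis=1).mean()     # each point of P to its best match in Q
    backward = C.min(axis=0).mean()    # each point of Q to its best match in P
    return forward + backward

# Hypothetical weighted total, with weights a1, a2, a3 chosen per application:
# D = a1 * shape_context_distance(C) + a2 * appearance_cost + a3 * bending_energy
```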
 Now that we have a way of calculating the distance between two shapes, the unknown shape can be identified with a nearest-neighbor (k-NN) classifier, using the shape distance calculated here to compare it against known objects.
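A hedged sketch of this classification step as a simple 1-nearest-neighbor search. The shape_distance argument stands in for the full pipeline described above (sampling, matching, warping, distance), and all names are illustrative.

```python
def classify(unknown_points, known_shapes, shape_distance):
    """known_shapes: list of (points, label) pairs for stored prototype shapes."""
    best_label, best_dist = None, float("inf")
    for points, label in known_shapes:
        d = shape_distance(unknown_points, points)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```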