“First, ‘interest points’ are selected at distinctive locations in the image, such as corners, blobs, and T-junctions. The most *valuable property* of an *interest point detector* is its

**repeatability**, i.e. whether it reliably finds the same interest points under different viewing conditions.

Next, the neighbourhood of every interest point is represented by a feature vector. This **descriptor** has to be distinctive and, at the same time, robust to noise, detection errors, and geometric and photometric deformations.

Finally, the descriptor vectors are *matched* between different images, e.g. based on the Mahalanobis or Euclidean distance. The dimension of this vector has a direct impact on the time this takes, and a lower number of dimensions is therefore desirable.” (Bay, Herbert et al. 2006. SURF: Speeded Up Robust Features)

Match the two vectors by **comparing their distance (Euclidean or Mahalanobis)**.

A **homography** is a **3×3 matrix** that tells us how to map one image (a set of pixels) onto another image. Once we have the key points and descriptors, we can find this matrix.
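To make the mapping concrete, here is a minimal NumPy sketch (the function name `apply_homography` is my own, not from the source): a pixel is lifted to homogeneous coordinates, multiplied by the 3×3 matrix, and divided by the third coordinate to get back to pixels.

```python
import numpy as np

def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H using homogeneous coordinates."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]  # divide by w to return to pixel coordinates

# A pure translation by (5, -2), written as a homography:
H = np.array([[1.0, 0.0,  5.0],
              [0.0, 1.0, -2.0],
              [0.0, 0.0,  1.0]])
print(apply_homography(H, 10.0, 10.0))  # -> (15.0, 8.0)
```

For a general homography the third coordinate is not 1, which is what distinguishes it from an affine map.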

- Use cvExtractSURF to get key points and descriptors from both images.
- Find matching key points by comparing the distance between their descriptors. We will use a **naive nearest neighbor** approach.
- Once the pairs of key points are found, pass them into cvFindHomography to get the homography matrix.
- Use the homography to warp one image to the other.

Matching SIFT Using **Nearest Neighbor**

**Match points** whose distance ratio (best_match_cost / second_best_match_cost) is < 0.7
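The ratio test above can be sketched in a few lines of NumPy (a naive linear scan, not the kd-tree variant discussed below; `ratio_test_matches` is my own name for the helper):

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.7):
    """For each descriptor in desc1, find its two nearest neighbours in desc2
    (Euclidean distance) and keep the match only if best/second_best < ratio."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if second > 0 and best / second < ratio:
            matches.append((i, int(order[0])))
    return matches

# Toy descriptors: row 0 of desc1 clearly matches row 1 of desc2,
# row 1 of desc1 is ambiguous (two near-equal candidates) and is rejected.
desc1 = np.array([[0.0, 0.0], [5.0, 5.0]])
desc2 = np.array([[9.0, 9.0], [0.1, 0.0], [5.0, 4.0], [5.0, 6.0]])
print(ratio_test_matches(desc1, desc2))  # -> [(0, 1)]
```

The point of the ratio test is exactly what the toy example shows: a descriptor with two near-equally-good candidates is probably not a reliable match, even if its best distance is small.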

- A kd-tree is just a way of finding nearby vectors quickly; it has little to do with *what* is being matched (vectors of numbers representing …) or *how* (Euclidean distance).
- kd-trees are very fast for 2d, 3d … up to perhaps 20d, but may be no faster than **a linear scan** of all the data above 20d. So how can a kd-tree work for features in 128d? The main trick is to **quit searching early**.
- The paper by Muja and Lowe, "Fast approximate nearest neighbors with automatic algorithm configuration" (2009), describes **multiple randomized kd-trees** for matching 128d SIFT features. (Lowe is the inventor of SIFT.)
- Using kd-trees for approximate **NN search in higher dimensions**.

**RANSAC (RANdom SAmple Consensus)**

A robust parameter-estimation method proposed by Fischler and Bolles.

The basic idea of RANSAC is not to treat all available input data indiscriminately when estimating parameters. Instead, an objective function is first designed for the specific problem and its parameter values are estimated iteratively. These initial parameter values are used to split the data into **"inliers"** (points consistent with the estimated parameters) and **"outliers"** (points that are not); finally, the function's parameters are **recomputed from all the inliers**.

• Choose a small subset of points uniformly at random

• Fit a model to that subset

• Find all remaining points that are “close” to the model and reject the rest as outliers

• Do this many times and choose the **best model**
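The four steps above can be sketched with a simple NumPy implementation for line fitting (my own illustrative code, using `y = m*x + b` as the model; the threshold and iteration count are arbitrary choices):

```python
import numpy as np

def ransac_line(points, n_iters=200, thresh=0.5, rng=None):
    """RANSAC for a 2D line y = m*x + b:
    sample 2 points, fit a line, count points within `thresh` of it (inliers),
    keep the model with the most inliers, then refit on all of its inliers."""
    rng = rng or np.random.default_rng(0)
    best_inliers = None
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # vertical sample, skip
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        resid = np.abs(points[:, 1] - (m * points[:, 0] + b))
        inliers = resid < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least-squares refit on all inliers of the best model.
    x, y = points[best_inliers, 0], points[best_inliers, 1]
    m, b = np.polyfit(x, y, 1)
    return m, b, best_inliers

# Five points on y = 2x + 1 plus two gross outliers:
pts = np.array([[0, 1], [1, 3], [2, 5], [3, 7], [4, 9],
                [1, 20], [3, -10]], dtype=float)
m, b, inliers = ransac_line(pts)
```

A plain least-squares fit on all seven points would be dragged off by the outliers; RANSAC recovers m ≈ 2, b ≈ 1 because the outliers never collect as many inliers as the true line.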

Transform all the template points to the target image – **Affine Transformation**

Point (x,y) is mapped to (u,v) by the **linear** function:

*u = a x + b y + c*

*v = d x + e y + f*

Matlab code:

t = cp2tform(src_points, target_points, 'affine');
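A rough NumPy equivalent of that fit (my own sketch, not what cp2tform does internally): with at least three point pairs, the six parameters a…f can be solved for by least squares, one linear system per output coordinate.

```python
import numpy as np

def fit_affine(src, dst):
    """Solve for the 6 affine parameters mapping src -> dst
    (u = a*x + b*y + c, v = d*x + e*y + f) by least squares."""
    A = np.hstack([src, np.ones((len(src), 1))])      # rows: [x, y, 1]
    coef_u, *_ = np.linalg.lstsq(A, dst[:, 0], rcond=None)
    coef_v, *_ = np.linalg.lstsq(A, dst[:, 1], rcond=None)
    return coef_u, coef_v                             # (a, b, c) and (d, e, f)

# Ground truth: scale x by 2, then shift by (1, -3)
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
dst = np.column_stack([2 * src[:, 0] + 1, src[:, 1] - 3])
(a, b, c), (d, e, f) = fit_affine(src, dst)
```

Three non-collinear pairs determine the affine map exactly; extra pairs make the least-squares solution robust to small localization noise.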

**Other 2D transformations:**

**Similarity** (translation, scale, rotation)

**Homography**: fitting a plane projective transformation that holds
- between two views of a planar surface;
- between images from two cameras that share the same center

**Hough transform**

We want to find a template defined by its **reference point (center)** and **several distinct types of landmark points** in stable spatial configuration.

**Detecting the template:**

For each feature in a new image, look up that **feature type** in the model and vote for the possible center locations associated with that type in the model.
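This voting step can be sketched as a small generalized-Hough accumulator (my own illustrative code; the feature types, offsets, and grid are made up for the example):

```python
import numpy as np

def vote_for_center(features, model_offsets, grid_shape):
    """Each detected feature looks up its type in the model and votes for the
    candidate center locations given by the stored (feature -> center) offsets.
    The accumulator cell with the most votes is the detected center."""
    acc = np.zeros(grid_shape, dtype=int)
    for ftype, (fx, fy) in features:
        for (ox, oy) in model_offsets.get(ftype, []):
            cx, cy = fx + ox, fy + oy
            if 0 <= cx < grid_shape[0] and 0 <= cy < grid_shape[1]:
                acc[cx, cy] += 1
    return tuple(np.unravel_index(np.argmax(acc), grid_shape)), acc

# Model: a 'corner' lies 2 left of the center, a 'blob' 2 above it.
model_offsets = {'corner': [(2, 0)], 'blob': [(0, 2)]}
# Two consistent features and one spurious one:
features = [('corner', (3, 5)), ('blob', (5, 3)), ('corner', (0, 0))]
center, acc = vote_for_center(features, model_offsets, (10, 10))
print(center)  # -> (5, 5), where two votes agree
```

The spurious feature still votes, but its vote lands in a cell no other feature supports, which is what makes the scheme tolerant of clutter.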

**Application in recognition**

**Implicit shape models: Training**

- Build codebook of patches around extracted interest points using clustering
- Map the patch around each interest point to closest codebook entry
- For each codebook entry, store all positions it was found, relative to object center
- Extract a weighted segmentation mask based on stored masks for the codebook occurrences