The figure above plots each image with all detected positives, colored by confidence.
The method can be summarized as follows:
1. Collect several images that contain no positive samples (e.g. 1000 images).
2. Randomly crop some windows (e.g. 10 windows per image) from these negative images and re-size them to your HOG search window (e.g. 64x128); a sketch of this sampling step follows the list.
3. Use them as negatives to train your system.
4. After training, run the detector over all of your preliminary negative images (the first 1000 images).
5. Any detected window is obviously a false alarm, so add these windows to the previously selected random windows and retrain your detector. This can dramatically boost the detector's performance, but do not repeat the step a second time, because the second round of improvement will be negligible.
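The following is a minimal sketch of steps 1-2, assuming OpenCV and NumPy-style grayscale images; the directory layout, the `.png` extension, and the crop-scale range are illustrative assumptions, not part of the original recipe.

```python
import glob
import random

import cv2

WIN_W, WIN_H = 64, 128      # HOG search window size (width x height)
CROPS_PER_IMAGE = 10        # random windows per negative image

def sample_negative_windows(image_dir):
    """Randomly crop windows from images known to contain no positives."""
    windows = []
    for path in glob.glob(image_dir + "/*.png"):   # assumed layout
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if img is None:
            continue
        h, w = img.shape[:2]
        if w < WIN_W or h < WIN_H:
            continue                               # too small for one window
        for _ in range(CROPS_PER_IMAGE):
            # Crop at a random scale, then re-size down to the search window.
            scale = random.uniform(1.0, min(w / WIN_W, h / WIN_H, 2.0))
            cw, ch = int(WIN_W * scale), int(WIN_H * scale)
            x = random.randint(0, w - cw)
            y = random.randint(0, h - ch)
            patch = img[y:y + ch, x:x + cw]
            windows.append(cv2.resize(patch, (WIN_W, WIN_H)))
    return windows
```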
Below is a rough outline of the training pipeline:
- We sample our initial training data for positive (faces) and negative (non-faces) patches.
- We extract feature vectors from the initial training patches.
- We train a linear SVM on the initial feature vectors. This gives a very rough classifier that we can use for face detection; we will also swap it out for a non-linear RBF SVM for comparison (see the sketch after this list).
- To improve the classifier, we run it on images with no faces. Every detection is a false positive, which we add to the training set to re-train the classifier. We repeat this process a few times.
- We run the classifier on the testing set and evaluate the results.
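The sketch below illustrates the feature-extraction and initial-training steps of this outline, assuming scikit-image's `hog` and scikit-learn's SVMs; the HOG parameters and `C` values are placeholder assumptions, not tuned values from the original experiments.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC, SVC

def extract_features(patches):
    """Compute a HOG descriptor for each 64x128 grayscale patch."""
    return np.array([
        hog(p, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2), block_norm="L2-Hys")
        for p in patches
    ])

def train_initial_classifier(positive_patches, negative_patches, use_rbf=False):
    """Fit a rough classifier on the initial positive/negative patches."""
    X = np.vstack([extract_features(positive_patches),
                   extract_features(negative_patches)])
    y = np.concatenate([np.ones(len(positive_patches)),
                        np.zeros(len(negative_patches))])
    # Linear SVM by default; an RBF SVM can be swapped in for comparison.
    clf = SVC(kernel="rbf", C=1.0) if use_rbf else LinearSVC(C=0.01)
    clf.fit(X, y)
    return clf
```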
Mining Hard Negatives
The goal of this step is to lower the generalization error of the SVM by providing it with additional, harder training examples.
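Below is a hedged sketch of one round of hard-negative mining, reusing the hypothetical `extract_features` and `train_initial_classifier` helpers from the earlier sketch; the sliding-window step size and the retraining call at the end are illustrative assumptions.

```python
import numpy as np

def slide_windows(img, win_w=64, win_h=128, step=16):
    """Yield every detector-sized window across an image (assumed step size)."""
    h, w = img.shape[:2]
    for y in range(0, h - win_h + 1, step):
        for x in range(0, w - win_w + 1, step):
            yield img[y:y + win_h, x:x + win_w]

def mine_hard_negatives(clf, face_free_images):
    """Every positive prediction on a face-free image is a false alarm."""
    hard = []
    for img in face_free_images:
        patches = list(slide_windows(img))
        if not patches:
            continue
        preds = clf.predict(extract_features(patches))
        hard.extend(p for p, label in zip(patches, preds) if label == 1)
    return hard

# One retraining round: append the false alarms to the negative set and refit.
# hard_negatives = mine_hard_negatives(clf, face_free_images)
# negative_patches += hard_negatives
# clf = train_initial_classifier(positive_patches, negative_patches)
```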