OpenCV 2.1 ships with peopledetect.cpp, which uses the default Dalal detector, but it gives no details on how that detector can be trained.
Theory: Following Dalal and Triggs, the HOG must be computed first. After that, an SVM is used to classify people/no people. The false positives it produces are added to the training set and the system is retrained once more.
Accuracy on test set: 98.90% (14447 correct, 161 incorrect, 14608 total)
Precision/recall on test set: 99.04%/94.25%
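As a quick sanity check on the figures above, the accuracy follows directly from the reported counts. The sketch below recomputes it and spells out the usual precision and recall formulas. The true/false positive counts behind the reported precision and recall are not given here, so the counts in the second call are made-up placeholders, and the function names are only illustrative.

```python
def accuracy(correct, total):
    """Fraction of test windows classified correctly."""
    return correct / total


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


# Counts reported for the test set: 14447 correct out of 14608.
acc = accuracy(14447, 14608)
print(f"accuracy = {acc:.2%}")  # accuracy = 98.90%

# Hypothetical counts, only to show how the formulas are applied.
precision, recall = precision_recall(tp=8, fp=2, fn=2)
print(f"precision = {precision:.2%}, recall = {recall:.2%}")
```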
Implementation: With OpenCV's HOGDescriptor::compute, one can compute the HOG of a given image. The requirement is to train the SVM with the Train/negative and Train/Positive image sets. After that, test the SVM model on Test/negative. Those images contain no people, so any detection is a false positive. Add the false positives to the training set and retrain.
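The train, test on negatives, retrain loop can be sketched roughly as follows. This is plain Python, not the OpenCV code: a toy nearest-centroid rule stands in for the SVM trainer, short lists stand in for HOG descriptors, and all function names and data are hypothetical.

```python
def hard_negative_round(train, positives, negatives, negative_pool, threshold=0.0):
    """One retraining round: train, scan person-free samples, retrain."""
    score = train(positives, negatives)  # first-pass model
    # Every sample in negative_pool is person-free, so any score above
    # the threshold is by construction a false positive.
    hard = [x for x in negative_pool if score(x) > threshold]
    if not hard:
        return score, hard
    return train(positives, negatives + hard), hard


def train_centroid(positives, negatives):
    """Toy stand-in for the SVM trainer (NOT an SVM): nearest-centroid rule."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    mp, mn = mean(positives), mean(negatives)
    w = [p - n for p, n in zip(mp, mn)]
    # Bias places the decision boundary halfway between the two centroids.
    b = -sum(wi * (p + n) / 2 for wi, p, n in zip(w, mp, mn))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


# Toy 2-D "descriptors"; real HOG vectors are much longer.
pos = [[2.0, 2.0], [3.0, 3.0]]
neg = [[-3.0, -3.0], [-2.0, -2.0]]
pool = [[0.2, 0.2], [-4.0, -4.0]]  # person-free samples to scan

model, hard = hard_negative_round(train_centroid, pos, neg, pool)
print("false positives fed back into training:", hard)
```

In a real run the scan would go over every detection window of the negative images, and the round can be repeated until few new false positives appear.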
The trained SVM model is a file containing support vectors. These support vectors can be used to predict the people/non-people classification. OpenCV, however, uses only a single vector to detect people.
This vector is the weight vector, shown as w in the picture above.
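For a linear SVM the two views are equivalent: the decision function over the support vectors, f(x) = sum_i alpha_i y_i (x_i . x) + b, equals w . x + b with w = sum_i alpha_i y_i x_i. The sketch below shows this in plain Python. It assumes a linear kernel and that each coefficient already holds alpha_i * y_i; the function names and toy numbers are hypothetical, not taken from a real model file.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def collapse_support_vectors(support_vectors, coefficients):
    """Return w = sum_i c_i * x_i, where c_i = alpha_i * y_i."""
    w = [0.0] * len(support_vectors[0])
    for c, sv in zip(coefficients, support_vectors):
        for j, value in enumerate(sv):
            w[j] += c * value
    return w


def decision_from_support_vectors(x, support_vectors, coefficients, b):
    """Linear-kernel decision value computed from all support vectors."""
    return sum(c * dot(sv, x) for c, sv in zip(coefficients, support_vectors)) + b


def decision_from_weights(x, w, b):
    """Same decision value computed from the single weight vector."""
    return dot(w, x) + b


# Toy model: three 2-D support vectors with signed coefficients.
svs = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
coefs = [0.5, -0.25, 1.0]
bias = -0.1

w = collapse_support_vectors(svs, coefs)
x = [2.0, 1.0]
print(decision_from_support_vectors(x, svs, coefs, bias))
print(decision_from_weights(x, w, bias))
```

Both calls give the same value up to rounding, which is why one vector is enough at detection time.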
1. Use the INRIA person dataset for the positive train and test sets.
2. Prepare 64×128 images from those pictures. Negative images must be 64×128 windows cropped at random from the original images.
3. HOG compute parameters. We will compute the HOG using OpenCV's HOGDescriptor::compute with the following default parameters:
   blockSize = (16,16)
   blockStride = (8,8)
   cellSize = (8,8)
   derivAperture = 0
   gammaCorrection = 0 (bool)
   histogramNormType = 0 (0 means L2Hys)
   L2HysThreshold = 0.2
   nbins = 9
   winSigma = -1
   winSize = (64,128)
5. Compute the HOG, train the SVM, run the trained model on the negative test set, and add the false positives to the training set for retraining (this is called hard training).
6. Compute the weight vector of the hard-trained model and use it in place of the vector returned by getDefaultPeopleDetector(). The trick is that the threshold value from the model must be appended as the last element of the vector.
7. In the detectMultiScale function, change the threshold values (group threshold and hit threshold) until you get a good result.
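To make steps 3 and 6 concrete: with the parameters listed above, a 64×128 window holds 7 × 15 blocks of 2 × 2 cells with 9 bins each, so the descriptor has 3780 values and the detector vector has 3781. The sketch below checks that arithmetic and builds the vector in plain Python. It assumes the detector adds the last element to the dot product, so a model whose decision is w . x minus a threshold would store the negated threshold; check that sign convention against the SVM tool that wrote the model. The helper names are hypothetical.

```python
def hog_descriptor_size(win=(64, 128), block=(16, 16), stride=(8, 8),
                        cell=(8, 8), nbins=9):
    """Number of floats in one HOG descriptor for a single window."""
    blocks_x = (win[0] - block[0]) // stride[0] + 1
    blocks_y = (win[1] - block[1]) // stride[1] + 1
    cells_per_block = (block[0] // cell[0]) * (block[1] // cell[1])
    return blocks_x * blocks_y * cells_per_block * nbins


def make_detector_vector(w, model_threshold):
    """Append the bias term as the last element of the weight vector.

    Assumption: the model scores a window as w . x - model_threshold,
    while the detector computes w . x + last_element.
    """
    return list(w) + [-model_threshold]


def window_score(detector, descriptor):
    """Score one window the way the assumption above describes."""
    weights, bias = detector[:-1], detector[-1]
    return sum(wi * xi for wi, xi in zip(weights, descriptor)) + bias


size = hog_descriptor_size()
print("descriptor size:", size)  # descriptor size: 3780

# Placeholder weights, only to show the layout of the final vector.
detector = make_detector_vector([0.0] * size, model_threshold=0.5)
print("detector vector length:", len(detector))  # detector vector length: 3781
```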
Discussion: Manually changing the threshold values does not seem to be “computer vision”; it is human vision. How can I tune the threshold values automatically and still get a good result?
Download: Here I created a project to host all of the training code above. License: it is free for any use.
My request to readers: I still do not know how the DET graph and the performance measurement can be implemented. Any help will be appreciated.
- MIT Pedestrian Database: http://cbcl.mit.edu/cbcl/software-datasets/PedestrianData.html
- INRIA Toolkit: http://pascal.inrialpes.fr/soft/olt/ ; Dataset: http://pascal.inrialpes.fr/data/human/ (there are links to other image databases)
- More INRIA images: http://lear.inrialpes.fr/data
- Fast alternative site for INRIA images: http://yoshi.cs.ucla.edu/yao/data/PASCAL_human/