Target Parameter Set
It is preferable to define the target state with as few parameters as possible. A small parameter set that still describes the target's dynamics well tends to converge faster, so fewer hypotheses are needed at each step. (From "HoG-Model based Human Tracking and Irregular Behavior Detection".)
The size of the object bounding box may also vary greatly: objects close to the camera occupy more pixels than those farther away. However, the aspect ratio of the bounding box is assumed to be constant, so the size can be described by one parameter (width) rather than two (width and height). The walking appearance in the HoG feature manifold is likewise described by one parameter, the sample index. In each iteration we look for a manifold sample at a chosen distance and direction from the current sample, so a one-dimensional sample index is a sufficient descriptor, provided the neighborhood graph of the HoG features of the training data is known.
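As a rough illustration of such a compact state, the sketch below stores only the width and derives the height from a fixed aspect ratio. The class name `TargetState`, the position fields `x` and `y`, and the ratio value 0.5 are assumptions made for this example; the source only says that the ratio is constant and that the appearance is indexed by a single sample index.

```python
from dataclasses import dataclass

# Assumed width / height ratio; the source only states that it is constant.
ASPECT_RATIO = 0.5


@dataclass
class TargetState:
    x: float           # bounding-box centre, image column (assumed parameter)
    y: float           # bounding-box centre, image row (assumed parameter)
    width: float       # the single size parameter
    sample_index: int  # index of the current sample on the HoG manifold

    @property
    def height(self) -> float:
        # Height is not stored; it follows from the constant aspect ratio.
        return self.width / ASPECT_RATIO

    def bounding_box(self):
        # Returns (left, top, right, bottom) in image coordinates.
        half_w, half_h = self.width / 2, self.height / 2
        return (self.x - half_w, self.y - half_h,
                self.x + half_w, self.y + half_h)


state = TargetState(x=100.0, y=200.0, width=40.0, sample_index=7)
print(state.height, state.bounding_box())
```

With this layout, each hypothesis carries four numbers instead of five, which is the kind of reduction the passage argues for.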
Blocks should preferably vary in size, location, and aspect ratio.
To evaluate all sub-windows of an image, convolve the trained classifier with the HoG feature pyramid at multiple scales. The number of pyramid levels controls the range of window sizes that can be detected in the image.
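A minimal pure-Python sketch of this scanning step is given below, assuming a linear classifier whose weights are laid out as a small grid of per-cell weight vectors. The function names `score_windows` and `detect`, the `(scale, feature_map)` pyramid layout, and the threshold are all hypothetical; building the HoG feature maps themselves is outside the sketch.

```python
def score_windows(feature_map, weights, bias=0.0):
    """Linear classifier score for every sub-window of one pyramid level.

    feature_map: rows x cols grid of per-cell feature vectors (lists of floats)
    weights:     h x w grid of per-cell weight vectors (same vector length)
    """
    rows, cols = len(feature_map), len(feature_map[0])
    h, w = len(weights), len(weights[0])
    scores = []
    for r in range(rows - h + 1):
        row_scores = []
        for c in range(cols - w + 1):
            s = bias
            for dr in range(h):
                for dc in range(w):
                    cell = feature_map[r + dr][c + dc]
                    s += sum(a * b for a, b in zip(cell, weights[dr][dc]))
            row_scores.append(s)
        scores.append(row_scores)
    return scores


def detect(pyramid, weights, bias=0.0, threshold=0.0):
    """Scan every level; pyramid is a list of (scale, feature_map) pairs."""
    detections = []
    for scale, fmap in pyramid:
        for r, row in enumerate(score_windows(fmap, weights, bias)):
            for c, s in enumerate(row):
                if s > threshold:
                    detections.append((scale, r, c, s))
    return detections


# Toy data: one-dimensional cell features and a 1 x 2 cell template.
weights = [[[1.0], [1.0]]]
pyramid = [(1.0, [[[1.0], [2.0], [3.0]]]),
           (0.5, [[[4.0], [0.0]]])]
print(detect(pyramid, weights, threshold=3.5))
```

Because the template has a fixed size in cells, a hit at a level with scale `s` corresponds to a window roughly `1 / s` times larger in the original image. This is why the set of pyramid levels bounds the window sizes that can be detected.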
HOG VS SIFT
As the paper states, the idea behind HOG derives from SIFT and Shape Context: it takes SIFT's sparse feature and computes it densely, so the descriptor captures both the appearance and the shape of the target.
R-HOG blocks look similar to SIFT descriptors, but they differ in how they are computed: R-HOG blocks are computed in dense grids at a single scale without orientation alignment, whereas SIFT descriptors are computed at sparse, scale-invariant key image points and are rotated to align orientation. In addition, R-HOG blocks are used in conjunction to encode spatial form information, whereas SIFT descriptors are used singly.
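The "used in conjunction" point can be sketched as follows: overlapping blocks of cell histograms are each normalized and then concatenated into one window descriptor, so the position of each block within the vector encodes spatial layout. The function names, the 2 x 2 block size, the stride of one cell, and the L2 normalization are assumptions chosen for illustration, and the cell histograms are taken as given.

```python
import math


def block_descriptor(cells, r, c, block=2, eps=1e-6):
    """Concatenate the histograms of a block x block group of cells, then L2-normalize."""
    v = [x
         for dr in range(block)
         for dc in range(block)
         for x in cells[r + dr][c + dc]]
    norm = math.sqrt(sum(x * x for x in v) + eps * eps)
    return [x / norm for x in v]


def window_descriptor(cells, block=2, stride=1):
    """Concatenate all overlapping block descriptors on a dense grid of cells."""
    rows, cols = len(cells), len(cells[0])
    out = []
    for r in range(0, rows - block + 1, stride):
        for c in range(0, cols - block + 1, stride):
            out.extend(block_descriptor(cells, r, c, block))
    return out


# Toy input: a 3 x 3 grid of cells, each with a 9-bin orientation histogram.
cells = [[[1.0] * 9 for _ in range(3)] for _ in range(3)]
descriptor = window_descriptor(cells)
print(len(descriptor))
```

No keypoint detection or rotation step appears here. Every block on the grid contributes at a fixed orientation, which is the contrast with SIFT drawn above.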