Object localization is an important task for the automatic understanding of images as well, e.g., to separate an object
from the background, or to analyze the spatial relations between different objects in an image.
The target is to find a bounding box around the object. It is much easier to provide ground truth annotation for bounding boxes than for pixel-wise segmentations.
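One way to see why a bounding box is the cheaper annotation is that it can always be derived from a pixel-wise segmentation, but not the other way around. The sketch below is only an illustration: the function name `mask_to_bbox` and the inclusive `(x_min, y_min, x_max, y_max)` convention are assumptions, not something these notes prescribe.

```python
def mask_to_bbox(mask):
    """Tightest box around the truthy pixels of a 2D mask (list of rows).

    Returns (x_min, y_min, x_max, y_max) with inclusive pixel coordinates,
    or None if the mask contains no object pixels.
    """
    # Rows that contain at least one object pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Column indices of every object pixel.
    cols = [x for row in mask for x, v in enumerate(row) if v]
    return (min(cols), rows[0], max(cols), rows[-1])


# A small diamond-shaped object: many labeled pixels collapse to four numbers.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
print(mask_to_bbox(mask))  # (1, 1, 3, 3)
```

The box discards the object's shape, which is exactly the information a pixel-wise annotator would have to supply by hand.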
In the field of object localization with bounding boxes, sliding window approaches have been the method of choice
for many years. They rely on
- As positive training examples for the SVM we use the ground truth bounding boxes that are provided with the dataset.
- As negative examples we blindly sample box regions from images with a negative class label and from locations outside of the object region in positively labeled images.
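The negative sampling step in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: boxes are `(x0, y0, x1, y1)` with an exclusive max corner, "outside of the object region" is read as zero overlap with every ground truth box, and the names `overlap_area` and `sample_negative_boxes` as well as the `min_size` and `max_tries` parameters are hypothetical. The notes do not say how box sizes or positions are drawn, so uniform random sampling is used here.

```python
import random


def overlap_area(a, b):
    """Area of the intersection of two boxes (x0, y0, x1, y1), max corner exclusive."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


def sample_negative_boxes(img_w, img_h, gt_boxes, n,
                          min_size=8, max_tries=10000, rng=None):
    """Draw up to n random boxes that do not overlap any ground truth box.

    For an image with a negative class label, pass gt_boxes=[]; then every
    sampled box is accepted.
    """
    rng = rng or random.Random(0)  # fixed seed: an assumption, for repeatability
    negatives = []
    for _ in range(max_tries):
        if len(negatives) == n:
            break
        # Random size first, then a position that keeps the box inside the image.
        w = rng.randint(min_size, img_w)
        h = rng.randint(min_size, img_h)
        x0 = rng.randint(0, img_w - w)
        y0 = rng.randint(0, img_h - h)
        box = (x0, y0, x0 + w, y0 + h)
        if all(overlap_area(box, gt) == 0 for gt in gt_boxes):
            negatives.append(box)
    return negatives


# Positively labeled image: negatives must avoid the annotated object.
negs = sample_negative_boxes(100, 100, [(40, 40, 60, 60)], n=5)
# Negatively labeled image: any box region is a valid negative.
negs_bg = sample_negative_boxes(100, 100, [], n=3)
```

Rejection sampling is simple but can return fewer than `n` boxes when the object fills most of the image, which is why the number of attempts is capped.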
Object classification judges whether an object is present in the image or not. It says nothing about the object's location or scale.
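A sliding window localizer bridges this gap by applying a classification-style score to many candidate boxes and reporting the best one. The sketch below is only a toy illustration of that idea: `sliding_windows`, `localize`, the window sizes, the stride, and the stand-in scoring function are all assumptions. In the setting of these notes, the score would come from the trained SVM.

```python
def sliding_windows(img_w, img_h, sizes, stride):
    """Yield boxes (x0, y0, x1, y1) for every window size on a regular grid."""
    for w, h in sizes:
        for y0 in range(0, img_h - h + 1, stride):
            for x0 in range(0, img_w - w + 1, stride):
                yield (x0, y0, x0 + w, y0 + h)


def localize(img_w, img_h, score_fn, sizes, stride=4):
    """Return the highest-scoring window, or None if no window fits the image."""
    return max(sliding_windows(img_w, img_h, sizes, stride),
               key=score_fn, default=None)


# Stand-in for a classifier score: higher when the window is closer to a
# hidden object box. A real system would score image features instead.
target = (8, 4, 24, 20)


def toy_score(box):
    return -sum(abs(a - b) for a, b in zip(box, target))


best = localize(64, 48, toy_score, sizes=[(16, 16), (32, 32)], stride=4)
print(best)  # (8, 4, 24, 20)
```

Varying the window position answers "where" and varying the window size answers "at what scale", the two things a plain classifier leaves open. The cost is that the score has to be evaluated for every window, 162 of them even in this small example.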
What does it mean, to see? The plain man’s answer would be, to know what is where by looking.
David Marr sums up the holy grail of vision: discovering what is present in the world, and where it is, from unlabeled images.