Max-Margin Matrix Cosine Similarity
See more details and examples in the following paper:
Overview
Local Steering Kernel (LSK) tensor descriptors are projected onto their leading principal components to yield decorrelated, discriminative feature tensors, which are then used in a max-margin training framework with the matrix cosine similarity kernel. Owing to the linearity of the kernel, the support vectors can be combined into a single rigid tensor detector for fast and efficient detection.
Local Steering Kernel (LSK)
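To make the linearity argument from the Overview concrete, here is a minimal sketch in plain Python. It assumes the matrix cosine similarity is the Frobenius inner product divided by the product of the Frobenius norms, and it uses small 2D matrices in place of the feature tensors (a tensor can be flattened and treated the same way). The support matrices, the coefficients (standing in for the SVM's alpha_i * y_i), and the bias are made-up toy values, and all function names are hypothetical, not taken from the authors' code.

```python
import math

def frobenius_norm(m):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(v * v for row in m for v in row))

def frobenius_inner(a, b):
    """Frobenius inner product <A, B>_F of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mcs(a, b):
    """Matrix cosine similarity: <A, B>_F / (||A||_F * ||B||_F)."""
    return frobenius_inner(a, b) / (frobenius_norm(a) * frobenius_norm(b))

def collapse_support_vectors(support, coeffs):
    """Fold sum_i c_i * S_i / ||S_i||_F into one detector matrix W.

    coeffs[i] stands for alpha_i * y_i from a trained SVM (toy values here).
    """
    rows, cols = len(support[0]), len(support[0][0])
    w = [[0.0] * cols for _ in range(rows)]
    for s, c in zip(support, coeffs):
        n = frobenius_norm(s)
        for r in range(rows):
            for k in range(cols):
                w[r][k] += c * s[r][k] / n
    return w

def decision_kernel(x, support, coeffs, bias):
    """Decision value with one kernel evaluation per support vector."""
    return sum(c * mcs(s, x) for s, c in zip(support, coeffs)) + bias

def decision_collapsed(x, w, bias):
    """Same decision value using the single collapsed detector W."""
    return frobenius_inner(w, x) / frobenius_norm(x) + bias

# Toy data: three 2x3 matrices standing in for support vectors.
support = [
    [[1.0, 2.0, 0.5], [0.0, 1.0, 3.0]],
    [[0.2, 0.1, 0.9], [1.5, 0.3, 0.7]],
    [[2.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
]
coeffs = [0.8, -0.5, 0.3]   # hypothetical alpha_i * y_i
bias = -0.1                 # hypothetical SVM bias
query = [[0.4, 1.2, 0.6], [0.9, 0.1, 2.0]]

detector = collapse_support_vectors(support, coeffs)
slow = decision_kernel(query, support, coeffs, bias)
fast = decision_collapsed(query, detector, bias)
```

Here `slow` and `fast` agree up to floating-point rounding. Because the kernel is linear in the normalized support matrices, the weighted sum over support vectors can be computed once in advance, so detection needs a single inner product per window however many support vectors training produced.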
LSK Visualization: The first column displays raw infrared images of pedestrians in different poses. HOG and LSK descriptors are displayed in grayscale (second and third columns, respectively) and as color maps (fourth and fifth columns, respectively). The sixth, seventh, and eighth columns show LSK features after projecting the descriptors onto the three leading principal components.
Fast Detection
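For the fast detection step, one plausible reading is that the decision score at each window is a cross-correlation with the detector divided by the Frobenius norm of the window. The sketch below rests on that assumption: the window energy (sum of squared features) comes from an integral image in four lookups per window, while the correlation is written directly for clarity. An FFT-based correlation would yield the same numerator values. The example is single-channel and uses toy data; the function names are hypothetical and this is not the authors' implementation.

```python
import math

def integral_image(img):
    """Summed-area table with an extra zero row and column for easy indexing."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0.0
        for c in range(w):
            row_sum += img[r][c]
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat

def window_sum(sat, top, left, height, width):
    """Sum over rows top..top+height-1 and columns left..left+width-1."""
    bottom, right = top + height, left + width
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]

def score_map(feature, detector, bias):
    """Decision score <W, X> / ||X||_F + bias at every valid window position."""
    fh, fw = len(feature), len(feature[0])
    dh, dw = len(detector), len(detector[0])
    squared = [[v * v for v in row] for row in feature]
    sat = integral_image(squared)
    scores = []
    for top in range(fh - dh + 1):
        row = []
        for left in range(fw - dw + 1):
            # Numerator: plain cross-correlation, computed directly here.
            corr = sum(detector[r][c] * feature[top + r][left + c]
                       for r in range(dh) for c in range(dw))
            # Denominator: window energy read from the integral image.
            energy = window_sum(sat, top, left, dh, dw)
            if energy > 0:
                row.append(corr / math.sqrt(energy) + bias)
            else:
                row.append(bias)  # empty window: no evidence either way
        scores.append(row)
    return scores

# Toy single-channel feature map and a 2x2 detector (illustrative values).
feature = [
    [1.0, 2.0, 0.0, 1.0],
    [0.5, 1.0, 3.0, 2.0],
    [2.0, 0.0, 1.0, 1.0],
]
detector = [[1.0, 0.0], [0.0, 1.0]]
scores = score_map(feature, detector, bias=0.0)
```

With several feature channels, the numerator would be summed over channels and the energy image would be the per-pixel sum of squared channel values. The normalization cost per window stays constant regardless of the detector size.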
Faster Search: A multichannel Fourier transform and an integral image together facilitate exact acceleration of the decision rule.
Results
OSU Thermal Dataset [1]
Detection scores above the threshold are embedded inside the displayed bounding boxes. The usual color-map convention is followed: a red bounding box indicates the highest confidence and a blue bounding box the lowest.
OSU Color Thermal Dataset [2] (only thermal channels are used)
The top row shows multiscale detection on three frames from the OSU-CT dataset. The best estimated scale is shown as an appropriately sized bounding box centered at the predicted location. The heat maps in the bottom row illustrate the corresponding decision scores (maximum likelihood estimate across all six scales) obtained from the classifier. Blue regions indicate low confidence, and red to reddish black indicates high to very high confidence in detecting pedestrians. Note that we have not used any tracking information or background model.
Annotations
We have annotated the thermal channels, and the color channels as well, for all the sequences. Note that the thermal and color image pairs are not registered, so there are separate annotation files for each. We used Piotr Dollar's Computer Vision Toolbox [4-6] for the annotation. The annotation files will be available soon (meanwhile, please contact the first author if you need them).
LSI thermal infrared dataset [3]
Results show multiscale detection of pedestrians across a wide range of scales. The estimated likelihood of the pedestrian's location, measured across all the scales, is shown under each frame. As before, dark red to reddish black denotes high to very high detector confidence.
References
Code