What is the HOG descriptor's shape?

asked 2016-01-30 23:11:17 -0600

boone

I'm dipping my toe in pedestrian detection with OpenCV using Histogram of Oriented Gradients. I'd like to understand the descriptor better (I banged out a quick visualizer in pyplot), but I'm having trouble figuring out the output data structure. It's a very long 1D array... great for machine learning, not so easy for a human to understand.

Here is my configuration, using OpenCV from Python. "img" is a 64x128 greyscale image.

import cv2

# All sizes are (width, height) in pixels
winSize = (64,128)
blockSize = (16,16)
blockStride = (8,8)
cellSize = (8,8)
nbins = 9
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 0.2
gammaCorrection = True

hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,
                        histogramNormType,L2HysThreshold,gammaCorrection)

# img is the 64x128 greyscale image described above
winStride = (8,8)
padding = (8,8)
locations = ((10,20),)
hist = hog.compute(img, winStride, padding, locations)

I get a vector of length 3780: 7x15 blocks (not 8x16, because of the overlap), 2x2 cells per block, 9 angle bins. Is the shape (7, 15, 2, 2, 9)? Or (2, 2, 7, 15, 9)? Or (14, 30, 9)? Do the angle bins go from 0 to 180 or from 180 to 0? Or is it a 360-degree HOG? Does width come first, or height?
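For reference, the 3780 figure can be checked with a small sketch. The helper name hog_descriptor_length is made up for illustration and is not part of OpenCV; it only repeats the block and cell arithmetic above, with every size given as (width, height).

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Count the values in one window's HOG descriptor (hypothetical helper).

    All sizes are (width, height) tuples in pixels.
    """
    # Blocks slide across the window in steps of block_stride.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block is tiled by non-overlapping cells.
    cells_x = block_size[0] // cell_size[0]
    cells_y = block_size[1] // cell_size[1]
    # One nbins-long histogram per cell, per block.
    return blocks_x * blocks_y * cells_x * cells_y * nbins


length = hog_descriptor_length((64, 128), (16, 16), (8, 8), (8, 8), 9)
print(length)  # 7 * 15 * 2 * 2 * 9 = 3780
```

This confirms the count but says nothing about the order in which those values are stored.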

What is the OpenCV convention?


1 answer


answered 2016-02-01 11:53:54 -0600

LorenaGdL

OpenCV follows the convention established by the INRIA Object Detection and Localization Toolkit (OLT). If you don't want to dive into the source code, I'm sure you'll find plenty of info about OLT.
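As a rough sketch of what that convention could look like for the configuration in the question: the layout below is an assumption, not something stated in this thread. It takes the descriptor to be ordered (block x, block y, cell x, cell y, bin), outermost to innermost, so blocks run down a column before moving to the next column, with 9 unsigned bins covering 0 to 180 degrees by default. The function name unravel_hog_index is hypothetical, and the layout should be checked against the OpenCV source for the version in use.

```python
# ASSUMED layout, outermost to innermost: block x, block y, cell x, cell y, bin.
# This is an assumption for illustration, not confirmed in the thread.
ASSUMED_SHAPE = (7, 15, 2, 2, 9)


def unravel_hog_index(flat_index, shape=ASSUMED_SHAPE):
    """Map a position in the flat descriptor to per-axis coordinates (hypothetical helper)."""
    total = 1
    for dim in shape:
        total *= dim
    if not 0 <= flat_index < total:
        raise IndexError("flat_index out of range for the assumed shape")

    coords = []
    # Peel off the innermost (fastest-varying) axis first.
    for dim in reversed(shape):
        flat_index, pos = divmod(flat_index, dim)
        coords.append(pos)
    return tuple(reversed(coords))


print(unravel_hog_index(0))     # (0, 0, 0, 0, 0)
print(unravel_hog_index(36))    # (0, 1, 0, 0, 0): next block down, same column
print(unravel_hog_index(3779))  # (6, 14, 1, 1, 8)
```

Under the same assumption, hist.reshape(7, 15, 2, 2, 9) in NumPy would give the equivalent five-dimensional view of the vector returned by hog.compute.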


Stats

Asked: 2016-01-30 23:08:47 -0600

Seen: 2,036 times

Last updated: Jan 30 '16