gpu::convolve and gpu::filter2D vs cv::filter2D, opencv 2.4.9
I'm trying to use the OpenCV gpu module to filter an image with Gabor kernels. To check that everything is correct, I'm comparing the result of the CUDA-accelerated filtering with the regular CPU filtering. The code I'm using is available here: https://github.com/juancamilog/gpu_convolve_test.git
Since the function cv::gpu::filter2D is limited to kernels smaller than 16x16, I'm using cv::gpu::convolve for larger kernels. In that case, I use cv::gpu::copyMakeBorder to pad the input so that the filter response has the same size as the original image.
The problem I'm facing is that the result of the cv::gpu::convolve function is different from the result of cv::filter2D. What is the cause of this difference? How do we obtain a GPU filtering response that is the same as the CPU filtering response?
UPDATE: Some example results for the cv::gpu::filter2D case (no difference):
[Images: CPU result, GPU result]
And for the cv::gpu::convolve case:
[Images: CPU result, GPU result]
I'm starting to think that this happens because the CUDA FFT API expects a kernel centered at (0,0): http://developer.download.nvidia.com/compute/cuda/2_2/sdk/website/projects/convolutionFFT2D/doc/convolutionFFT2D.pdf. I can change the code to generate a kernel that has the same size as the image, shifted so that its center is at pixel coordinate (0,0), perform the convolution with gpu::dft and gpu::mulSpectrums, and crop the result. The result is then the expected one, although the computation time is much worse. Maybe the fix is to modify the gpu::convolve code so that each block is convolved with a kernel centered at (0,0)?
Did you ever figure out how to make this work? Seems like a pretty huge issue. If nothing else, it needs better documentation.