
Long delay on cv::gpu::GpuMat::upload after upgrade to GTX970

asked 2015-01-04 07:22:18 -0600

wronglyNeo

updated 2015-01-06 02:58:22 -0600

Hello,

I have been using the gpu module (CUDA) of OpenCV in my program for a while, and it worked fine. I have now upgraded my graphics card to a GTX 970, and the first time I call cv::gpu::GpuMat::upload after launching the program, I get a very long delay. With my old graphics card (GTX 770), this call completed nearly instantly.

Example: I have an image which is 512x600 pixels in size. With this image, the first upload takes 12 s. If I execute the same code again afterwards without closing the program, it completes instantaneously. I know that the first time CUDA code is executed after launching the program, it is compiled for the GPU, so a certain delay is normal. But to me this appears to be inexplicably long, especially because it was much faster with the old card.
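To put numbers on the first-call versus second-call difference described above, a small timing wrapper can help. The following is only a minimal sketch: `elapsedMs` is a hypothetical helper name, not part of OpenCV, and the commented usage assumes an OpenCV 2.4 build with the gpu module enabled.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not an OpenCV function): runs `fn` once and
// returns the elapsed wall-clock time in milliseconds.
double elapsedMs(const std::function<void()>& fn) {
    assert(static_cast<bool>(fn));  // an empty std::function would throw when called
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Intended use (assumes OpenCV 2.4 built with CUDA; `img` is a cv::Mat):
//   cv::gpu::GpuMat d_img;
//   double first  = elapsedMs([&] { d_img.upload(img); });  // includes CUDA start-up cost
//   double second = elapsedMs([&] { d_img.upload(img); });  // steady-state cost
```

Comparing `first` and `second` separates the one-time start-up cost from the actual transfer time.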

Does anyone know what could cause this behaviour? Are there any known issues with the current OpenCV version in connection with GTX 970 cards? The version I am using is 2.4.10, which is the latest one apart from the 3.0 beta. I compiled OpenCV with CUDA when I still had my old graphics card. Could compiling it again with the new one help? (I wouldn't think so.)

EDIT:

I have now discovered that there is a release of the CUDA Toolkit that specifically supports GTX 970 and GTX 980 cards:

https://developer.nvidia.com/cuda-dow...

I downloaded it and compiled OpenCV again with that one. Unfortunately, this didn't solve my problem. Somehow I have got the feeling it takes even longer now.

Is there no one here who has any experience with GTX 900 cards and OpenCV?


1 answer


answered 2015-01-07 01:48:59 -0600

wronglyNeo

OK, I figured it out. You have to tell the nvcc compiler to generate binary code for the new device generation (compute capability 5.2, instead of 3.0 for the old card). When building the OpenCV project with CMake, there is a variable CUDA_ARCH_BIN in the CUDA group that is set to 1.1 1.2 1.3 2.0 2.1(2.0) 3.0 3.5 by default. I added 5.2 to the list, then generated and compiled again. Now it works fine.
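The likely reason for the delay is that, when the build contains no binary code for the card's compute capability, the CUDA driver has to JIT-compile the embedded PTX on first use. As a rough illustration of the check involved, the sketch below tests whether a CUDA_ARCH_BIN-style list names a given compute capability. `hasBinaryFor` is a hypothetical helper written for this post, not an OpenCV or CMake function, and it assumes the list is space-separated as shown above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `archBin` (a space-separated,
// CUDA_ARCH_BIN-style list) names the compute capability `target`,
// e.g. "5.2". An entry such as "2.1(2.0)" is matched on the part
// before the parenthesis.
bool hasBinaryFor(const std::string& archBin, const std::string& target) {
    assert(!target.empty());
    std::istringstream in(archBin);
    std::string entry;
    while (in >> entry) {
        // Drop an optional "(x.y)" suffix; without '(' the whole entry is kept.
        const std::string arch = entry.substr(0, entry.find('('));
        if (arch == target) {
            return true;
        }
    }
    return false;
}
```

With the default list quoted above, `hasBinaryFor(list, "5.2")` is false, and it becomes true once 5.2 is appended. On the CMake command line the variable can be set with something like `-D CUDA_ARCH_BIN="3.0 3.5 5.2"`. At run time, OpenCV 2.4 also offers `cv::gpu::TargetArchs::hasBin(major, minor)` and `cv::gpu::DeviceInfo::isCompatible()` for a similar check against the actual build.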

