Gpu API call error (out of memory) in mallocPitch
Hi, I compiled OpenCV 2.4.9 with CMake and WITH_CUDA=ON, and I get the following error on my Late 2009 MacBook Pro (GeForce 9400M) running Snow Leopard. GPU-Z does not find my CUDA GPU... but OpenCV does! OpenCV detects the GeForce 9400M, yet it will not run my test code, even though I have found posts from people who got this working. What am I doing wrong? I have been trying to solve this problem for days now. Any help would be much appreciated!
#### error ###############
OpenCV Error: Gpu API call (out of memory) in mallocPitch, file /Users/michael/Documents/OpenCV-2.4.2/modules/core/src/gpumat.cpp, line 1276
terminate called after throwing an instance of 'cv::Exception'
what(): /Users/michael/Documents/OpenCV-2.4.2/modules/core/src/gpumat.cpp:1276: error: (-217) out of memory in function mallocPitch
Abort trap
#### Code #########################
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <numeric>
#include <opencv2/core/core.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/imgproc/imgproc.hpp>
using namespace std;
using namespace cv;
int main(int, char**)
{
    // Print full information about the current CUDA device
    gpu::printCudaDeviceInfo(cv::gpu::getDevice());

    Mat src2, dst;
    Mat src = imread("file.png", CV_LOAD_IMAGE_GRAYSCALE);
    gpu::GpuMat edges;
    namedWindow("Window", WINDOW_AUTOSIZE);

    for(;;)
    {
        src.copyTo(src2);
        gpu::GpuMat frame_gpu(src2);                                // upload frame to the GPU
        gpu::GaussianBlur(frame_gpu, edges, Size(7,7), 1.5, 1.5);   // blur on the GPU
        edges.download(dst);                                        // download result back to host
        imshow("Window", dst);
        if(waitKey(30) >= 0) break;
    }
    return 0;
}
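For what it's worth, a minimal check along these lines (just a sketch; I'm assuming the 2.4 gpu module's DeviceInfo API here) should show whether this OpenCV build considers the 9400M compatible and how much device memory is actually free before anything is uploaded:

#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/gpu/gpu.hpp>

int main()
{
    // Number of CUDA devices the gpu module can see
    int count = cv::gpu::getCudaEnabledDeviceCount();
    std::cout << "CUDA devices visible to OpenCV: " << count << std::endl;
    if (count == 0) return 1;

    cv::gpu::DeviceInfo info(cv::gpu::getDevice());
    std::cout << "Device: " << info.name() << std::endl;
    std::cout << "Compute capability: " << info.majorVersion() << "."
              << info.minorVersion() << std::endl;
    // isCompatible() reports whether this device matches the GPU architectures
    // the OpenCV gpu module was actually built for (CUDA_ARCH_BIN / CUDA_ARCH_PTX)
    std::cout << "Compatible with this OpenCV build: "
              << (info.isCompatible() ? "yes" : "no") << std::endl;
    std::cout << "Free / total memory: "
              << info.freeMemory() / (1024 * 1024) << " MB / "
              << info.totalMemory() / (1024 * 1024) << " MB" << std::endl;
    return 0;
}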
#### CUDA Device Query (Runtime API) #######################
Device count: 1
Device 0: "GeForce 9400M"
CUDA Driver Version / Runtime Version 4.10 / 4.10
CUDA Capability Major/Minor version number: 1.1
Total amount of global memory: 254 MBytes (265945088 bytes)
( 2) Multiprocessors x ( 8) CUDA Cores/MP: 16 CUDA Cores
GPU Clock Speed: 1.10 GHz
Memory Clock rate: 1062.50 Mhz
Memory Bus Width: 128-bit
Max Texture Dimension Size (x,y,z) 1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(8192) x 512, 2D=(8192,8192) x 512
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Concurrent copy and execution: No with 0 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Concurrent kernel execution: No
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): No
Device PCI Bus ID / PCI location ID: 2 / 0
Compute Mode:
Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
#### opencv config ###########
-- General configuration for OpenCV 2.4.9 =====================================
-- Version control: commit:b7b32e7
--
-- Platform:
-- Host: Darwin 10.8.0 ...
Kirill: On which iteration of the for loop are you getting the error? Is it possible that you have a memory leak?

Kirill, thanks for your reply. Sorry for my late answer; I could not work on this project for a while.
It fails on the first iteration, when it tries to upload the Mat to the GPU.
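In case it is allocation churn, here is the loop restructured so the GpuMat buffers are created once and reused (just a sketch against the same 2.4 gpu API; it drops into the same main() and includes as the code above, with the behavior otherwise unchanged):

    Mat src = imread("file.png", CV_LOAD_IMAGE_GRAYSCALE);
    CV_Assert(!src.empty());            // make sure the image actually loaded

    gpu::GpuMat frame_gpu, edges;
    Mat dst;
    namedWindow("Window", WINDOW_AUTOSIZE);

    for(;;)
    {
        // upload() only reallocates when size or type changes,
        // so the same device buffer is reused on every iteration
        frame_gpu.upload(src);
        gpu::GaussianBlur(frame_gpu, edges, Size(7,7), 1.5, 1.5);
        edges.download(dst);
        imshow("Window", dst);
        if(waitKey(30) >= 0) break;
    }

That said, since it dies on the very first upload, I doubt it is a leak.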
Update: CUDA 5.0.36 and CUDA driver 5.0.45 are now working on my Late 2009 MacBook Pro after I installed Mountain Lion 10.8.3. All CUDA 1.1 NVIDIA SDK demos are working.