I've successfully compiled OpenCV 3.0 with CUDA 7 using Visual Studio 2013, generating the solution with CMake for the x64 architecture. I've noticed that my *300.dll builds perform much slower than the precompiled *249.dll for the x86 architecture (downloaded from opencv.org). For example, the same Sobel test program runs in 30 ms with 2.4.9 but takes 280 ms with 3.0. How is that possible? Am I missing some optimization or build options?
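For anyone who wants to reproduce the comparison, here is a rough sketch of how the timing could be taken so that one-time start-up costs (DLL loading, lazy initialisation and the like) are not counted in the measurement. Everything in it is an assumption for illustration: `median_ms` is a hypothetical helper, and `sobel_x` is a plain C++ stand-in so the snippet compiles without OpenCV. It is not the actual test program.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical helper: calls `work` once untimed as a warm-up, then times
// `runs` further calls and returns the median duration in milliseconds.
double median_ms(const std::function<void()>& work, int runs) {
    assert(runs > 0);
    work();  // warm-up call, deliberately not measured
    std::vector<double> samples;
    samples.reserve(runs);
    for (int i = 0; i < runs; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        work();
        auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}

// Stand-in workload (assumption): a naive 3x3 horizontal Sobel on a
// row-major float image. Border pixels are left at zero.
std::vector<float> sobel_x(const std::vector<float>& src, int width, int height) {
    std::vector<float> dst(src.size(), 0.0f);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            auto at = [&](int dx, int dy) {
                return src[(y + dy) * width + (x + dx)];
            };
            dst[y * width + x] =
                (at(1, -1) + 2.0f * at(1, 0) + at(1, 1)) -
                (at(-1, -1) + 2.0f * at(-1, 0) + at(-1, 1));
        }
    }
    return dst;
}
```

In the real test the lambda passed to `median_ms` would wrap the `cv::Sobel` call on the same input image, built once against the 2.4.9 DLLs and once against the 3.0 DLLs. Taking the median over several runs makes a single slow first call less likely to dominate the reported number.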
These are my flags for Release:
//Flags used by the compiler during release builds.
CMAKE_C_FLAGS_RELEASE:STRING=/MD /O2 /Ob2 /D NDEBUG
Here's my CMake output:
General configuration for OpenCV 3.0.0 =====================================
Version control: unknown
Platform:
Host: Windows 6.2 AMD64
CMake: 3.3.0-rc4
CMake generator: Visual Studio 12 2013 Win64
CMake build tool: C:/Program Files (x86)/MSBuild/12.0/bin/MSBuild.exe
MSVC: 1800
C/C++:
Built as dynamic libs?: YES
C++ Compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/x86_amd64/cl.exe (ver 18.0.31101.0)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /EHa /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /wd4251 /wd4324 /MP8 /MD /O2 /Ob2 /D NDEBUG /Zi
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /EHa /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /wd4251 /wd4324 /MP8 /D_DEBUG /MDd /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/x86_amd64/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /MP8 /MD /O2 /Ob2 /D NDEBUG /Zi
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /MP8 /D_DEBUG /MDd /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:x64 /INCREMENTAL:NO /debug
Linker flags (Debug): /machine:x64 /debug /INCREMENTAL
Precompiled headers: YES
Extra dependencies: comctl32 gdi32 ole32 setupapi ws2_32 vfw32 cudart nppc nppi npps cublas cufft
3rdparty dependencies: zlib libjpeg libwebp libpng libtiff libjasper IlmImf ippicv
OpenCV modules:
To be built: hal cudev core cudaarithm flann imgproc ml video cudabgsegm cudafilters cudaimgproc cudawarping imgcodecs photo shape videoio cudacodec highgui objdetect ts features2d calib3d cudafeatures2d cudalegacy cudaobjdetect cudaoptflow cudastereo stitching superres videostab
Disabled: world
Disabled by dependency: -
Unavailable: java python2 python3 viz
Windows RT support: NO
GUI:
QT: NO
Win32 UI: YES
OpenGL support: NO
VTK support: NO
Media I/O:
ZLib: build (ver 1.2.8)
JPEG: build (ver 90)
WEBP: build (ver 0.3.1)
PNG: build (ver 1.5.12)
TIFF: build (ver 42 - 4.0.2)
JPEG 2000: build (ver 1.900.1)
OpenEXR: build (ver 1.7.1)
GDAL: NO
Video I/O:
Video for Windows: YES
DC1394 1.x: NO
DC1394 2.x: NO
FFMPEG: YES (prebuilt binaries)
codec: YES (ver 55.18.102)
format: YES (ver 55.12.100)
util: YES (ver 52.38.100)
swscale: YES (ver 2.3.100)
resample: NO
gentoo-style: YES
OpenNI: NO
OpenNI PrimeSensor Modules: NO
OpenNI2: NO
PvAPI: NO
GigEVisionSDK: NO
DirectShow: YES
Media Foundation: NO
XIMEA: NO
Intel PerC: NO
Other third-party libraries:
Use IPP: 8.2.1 [8.2.1]
at: C:/Users/fabio/Desktop/opencv-3.0.0/3rdparty/ippicv/unpack/ippicv_win
Use IPP Async: NO
Use Eigen: NO
Use TBB: NO
Use OpenMP: NO
Use GCD NO
Use Concurrency YES
Use C=: NO
Use pthreads for parallel for:
NO
Use Cuda: YES (ver 7.0)
Use OpenCL: YES
NVIDIA CUDA
Use CUFFT: YES
Use CUBLAS: YES
USE NVCUVID: NO
NVIDIA GPU arch: 20 21 30 35
NVIDIA PTX archs: 30
Use fast math: NO
OpenCL:
Version: dynamic
Include path: C:/Users/fabio/Desktop/opencv-3.0.0/3rdparty/include/opencl/1.2
Use AMDFFT: NO
Use AMDBLAS: NO
Python 2:
Interpreter: NO
Python 3:
Interpreter: NO
Python (for build): NO
Java:
ant: NO
JNI: C:/Program Files/Java/jdk1.8.0_11/include C:/Program Files/Java/jdk1.8.0_11/include/win32 C:/Program Files/Java/jdk1.8.0_11/include
Java wrappers: NO
Java tests: NO
Matlab:
mex: NO
Tests and samples:
Tests: YES
Performance tests: YES
C/C++ Examples: NO
Install path: C:/Users/fabio/Desktop/buildCuda/install
cvconfig.h is in: C:/Users/fabio/Desktop/buildCuda
Configuring done