OpenCV 4.2 with CUDA 10.2
Is the new OpenCV version compatible with CUDA 10.2? I want to use OpenCV with CUDA for darknet (https://github.com/AlexeyAB/darknet). Can I compile it without any worries?
On Windows, yes. I just compiled a debug build with Ninja and Visual Studio 2019 and confirmed it is working by running
"%openCvBuild%\install\x64\vc16\bin\opencv_perf_cudaarithm.exe" --gtest_filter=Sz_Type_Flags_GEMM.GEMM/29
from this guide for building 4.2.0, which gave the following output:
[----------]
[ INFO ] Implementation variant: cuda.
[----------]
[----------]
[ GPU INFO ] Run test suite on GeForce RTX 2080 GPU.
[----------]
Time compensation is 0
[----------]
[ GPU INFO ] Run on OS Windows x64.
[----------]
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***
Device count: 1
Device 0: "GeForce RTX 2080"
CUDA Driver Version / Runtime Version 10.20 / 10.20
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 8192 MBytes (8589934592 bytes)
GPU Clock Speed: 1.59 GHz
...
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.20, CUDA Runtime Version = 10.20, NumDevs = 1
TEST: Skip tests with tags: 'mem_6gb', 'verylong', 'debug_verylong'
CTEST_FULL_OUTPUT
OpenCV version: 4.2.0-dev
OpenCV VCS version: 4.2.0-1-g89d3f95a8e
Build type: Debug
Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.23.28105/bin/Hostx64/x64/cl.exe (ver 19.23.28106.4)
Parallel framework: tbb
CPU features: SSE SSE2 SSE3 *SSE4.1 *SSE4.2 *FP16 *AVX *AVX2 *AVX512-SKX?
Intel(R) IPP version: ippIP AVX2 (l9) 2019.0.0 Gold (-) Jul 26 2018
....
Note: Google Test filter = Sz_Type_Flags_GEMM.GEMM/29
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Sz_Type_Flags_GEMM
[ RUN ] Sz_Type_Flags_GEMM.GEMM/29, where GetParam() = (1024x1024, 32FC2, 0|cv::GEMM_1_T)
[ PERFSTAT ] (samples=13 mean=2.03 median=2.03 min=1.95 stddev=0.04 (2.0%))
[ OK ] Sz_Type_Flags_GEMM.GEMM/29 (409 ms)
[----------] 1 test from Sz_Type_Flags_GEMM (411 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (415 ms total)
[ PASSED ] 1 test.
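If you want a quick check from Python in addition to the perf test above, a minimal sketch like the following may help. It reads the text returned by `cv2.getBuildInformation()` and asks `cv2.cuda.getCudaEnabledDeviceCount()` how many devices are visible. The helper names `parse_cuda_lines` and `cuda_report` are made up for this example, and the assumption that the build text contains lines starting with `NVIDIA CUDA:` and `cuDNN:` is based on typical 4.x builds, so treat the parsing as illustrative.

```python
import re


def parse_cuda_lines(build_info):
    """Collect the 'NVIDIA CUDA' and 'cuDNN' entries from the build text.

    Assumes (not guaranteed) that the entries look like
    '  NVIDIA CUDA:   YES (ver 10.2, CUFFT CUBLAS)'.
    """
    found = {}
    for line in build_info.splitlines():
        match = re.match(r"\s*(NVIDIA CUDA|cuDNN):\s*(.*)", line)
        if match:
            found[match.group(1)] = match.group(2).strip()
    return found


def cuda_report():
    """Return CUDA details for the installed OpenCV, or None without cv2."""
    try:
        import cv2  # imported lazily so the parser works without OpenCV
    except ImportError:
        return None
    report = parse_cuda_lines(cv2.getBuildInformation())
    cuda_module = getattr(cv2, "cuda", None)
    try:
        # Returns 0 when OpenCV was built without CUDA support.
        report["devices"] = (
            cuda_module.getCudaEnabledDeviceCount() if cuda_module else 0
        )
    except cv2.error:
        # For example, a driver/runtime mismatch.
        report["devices"] = 0
    return report


if __name__ == "__main__":
    print(cuda_report())
```

A device count of 0, or a missing `NVIDIA CUDA` entry, would suggest that the Python bindings were not built against CUDA, even if the C++ perf test passes.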
Asked: 2019-12-23 13:13:19 -0600
Last updated: Jan 06 '20
OpenCV has a CUDA backend for the DNN module. Please have a look at this benchmark. If you tell me which model you plan to use darknet for, I can tell you whether it is supported by the OpenCV CUDA DNN backend (and also provide benchmarks comparing the two).
I will most likely use YOLOv3. I've already set up CUDA 10.1 and cuDNN 7.6.5 with OpenCV 4.1.0. Is it worth upgrading? What is the best configuration for training YOLO on a 1080 Ti and then using it from Python on a Jetson TX2?
OpenCV cannot train models, but it can perform inference slightly faster than darknet. The heavier the model, the larger the margin by which OpenCV's CUDA backend outperforms darknet. There are a few open and planned PRs which could make OpenCV's CUDA backend around 1.5x faster than darknet for FP32 inference. It is already more than 2x faster for half-precision inference.
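For reference, selecting the CUDA backend from Python could look roughly like the sketch below. It assumes OpenCV 4.2 or newer built with CUDA and cuDNN; `load_yolo_cuda` and `pick_target_name` are hypothetical helper names, and `yolov3.cfg` / `yolov3.weights` are placeholder paths for files trained with darknet. The calls `cv2.dnn.readNetFromDarknet`, `setPreferableBackend`, and `setPreferableTarget` are the regular DNN module API.

```python
def pick_target_name(half_precision):
    """Name of the cv2.dnn target constant for FP32 or FP16 inference."""
    return "DNN_TARGET_CUDA_FP16" if half_precision else "DNN_TARGET_CUDA"


def load_yolo_cuda(cfg_path, weights_path, half_precision=False):
    """Load a darknet model and request the CUDA backend for inference."""
    import cv2  # imported lazily; needs a CUDA-enabled build at run time

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(getattr(cv2.dnn, pick_target_name(half_precision)))
    return net


if __name__ == "__main__":
    # Placeholder paths: replace with your own darknet config and weights.
    net = load_yolo_cuda("yolov3.cfg", "yolov3.weights", half_precision=True)
    print(net.getUnconnectedOutLayersNames())
```

The FP16 target is the one the half-precision figures refer to. Whether it actually helps depends on the GPU, so it is worth timing both targets on the 1080 Ti and on the TX2.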