First, there is no advantage that I know of to processing multiple frames at the same time over efficiently processing one frame after the other. To process several at once, you would either have to process very small images on a large GPU, which is generally not an option, or alter the block and grid sizes of each CUDA algorithm that you use. At best you may see a marginal speedup from this; however, you would have to tweak your code, recalculating the ideal block and grid size every time you change the image size. It is much better to use streams and let the hardware schedule the operations as efficiently as it can.
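To give a feel for the per-size bookkeeping mentioned above, here is a minimal sketch of how a grid size is typically derived from the image size once a block size has been picked. It is plain C++ so it runs without a GPU; `Dim2`, `divUp` and `gridFor` are hypothetical names used only for illustration (`Dim2` stands in for CUDA's `dim3`), and the 32x8 block in the note below is an assumed example value, not a recommendation.

```cpp
#include <cassert>

// Hypothetical 2-D launch dimensions, standing in for CUDA's dim3.
struct Dim2 {
    unsigned x;
    unsigned y;
};

// Round-up integer division: how many blocks of size b are needed to cover n elements.
unsigned divUp(unsigned n, unsigned b) {
    assert(b > 0);
    return (n + b - 1) / b;
}

// Grid size needed so that every pixel of a width x height image
// is covered by one thread, given a fixed block size.
Dim2 gridFor(unsigned width, unsigned height, Dim2 block) {
    return Dim2{divUp(width, block.x), divUp(height, block.y)};
}
```

With an assumed 32x8 block, a 1920x1080 frame needs a 60x135 grid, while a 641x481 frame needs 21x61 with the last row and column of blocks mostly idle. Retuning the block size for each resolution is the kind of manual work that streams let you avoid.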
Given that, I would say you have two options: