You will first need an algorithm to detect your object. Without images or information on your end, this can be tons of different solutions, so give us something here.
Then you need to extract the bounding box and perform some kind of object segmentation. CNN's seem to be doing quite well there.
Then you need to adapt your template (new object) to the size of your bounding box
Perform image merging with opacity to paste the new object into the image