the more, the better. also, with a multi-class SVM (or similar), there are no explicit "negative" images: for class 1, the samples from classes 2...N act as the negatives. if you train a classifier from scratch, you will need a ton of images, while re-using a pretrained one (transfer learning) can get by with a few dozen images per class.
30 x 30 x 3 = 2700 features per image. that's ok. again, the larger the feature vectors get, the more training data you will need to separate them.
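a quick sketch of that flattening step (the random array here just stands in for a `cv2.imread()` result):

```python
import numpy as np

# stand-in for a 30 x 30 bgr image loaded with cv2.imread(...)
img = np.random.randint(0, 256, (30, 30, 3), dtype=np.uint8)

# opencv's SVM expects one float32 row per sample:
feature = img.astype(np.float32).reshape(1, -1)
print(feature.shape)  # (1, 2700)
```

stack one such row per training image to get your samples matrix.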
maybe. however, start with a simple SVM (opencv's ml classes can easily be swapped later). once you have that running, you could wire in a pretrained cnn (like squeezenet) for transfer learning. that simply means: cut off the last few "classification" layers from the cnn, and use the rest as a "fixed" preprocessing pipeline -- image in, feature vector out. then you train the SVM on those features instead of on the raw images.
remember, there are other dl frameworks too, like tf, caffe, torch, darknet, etc.