How to marshal OpenCV objects
Hello,
I am working on a project where we need to marshal Python objects at runtime, without knowing in advance the types of the objects that need to be saved to disk (and then loaded back in a different Python context). We use callbacks based on the object type whenever we need specialised functions (e.g. the HDF5 format for TensorFlow models), and otherwise fall back to `dill`.
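For readers unfamiliar with this setup, the type-based callback scheme with a generic fallback could look roughly like the sketch below. This is only an illustration, not our actual code: the `HANDLERS` registry, `register`, `save_object`, `load_object` and the toy `str` handler are all hypothetical names, and plain `pickle` stands in when `dill` is not installed.

```python
import pickle

try:
    import dill as fallback_serializer  # the fallback named in the post
except ImportError:
    fallback_serializer = pickle  # assumption: pickle stands in if dill is missing

# Hypothetical registry: tag -> (type, save callback, load callback).
HANDLERS = {}


def register(tag, type_, save, load):
    """Register specialised save/load callbacks for one type."""
    HANDLERS[tag] = (type_, save, load)


def save_object(obj, path):
    """Save obj to path; return the tag of the handler used, or None for the fallback."""
    for tag, (type_, save, _load) in HANDLERS.items():
        if isinstance(obj, type_):
            save(obj, path)
            return tag
    with open(path, "wb") as f:
        fallback_serializer.dump(obj, f)
    return None


def load_object(path, tag=None):
    """Load an object back, using the tag that save_object returned."""
    if tag is not None:
        _type, _save, load = HANDLERS[tag]
        return load(path)
    with open(path, "rb") as f:
        return fallback_serializer.load(f)


# Toy handler for illustration only: store str objects as UTF-8 text files.
def _save_text(obj, path):
    with open(path, "w", encoding="utf-8") as f:
        f.write(obj)


def _load_text(path):
    with open(path, encoding="utf-8") as f:
        return f.read()


register("text", str, _save_text, _load_text)
```

The tag returned by `save_object` has to be stored next to the file so that the loading context knows which callback to use.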
I am having a hard time finding a general way to save OpenCV objects. In my specific case, I am trying to save a `CascadeClassifier` object to disk, but I notice that it implements just `load` and `read`, and not `write` or `save`. What can I do? I noticed that some other cv objects implement the `save` or `write` function. Why doesn't `CascadeClassifier` implement it as well?
And more generally, what is the best approach to save OpenCV objects to disk and then load them back? Thanks!!
why would you save something immutable (like a cascade) ?
(and no, this is c++ code, basically. you don't "marshal" or pickle objects. you create them, and (re)load the data)
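One way to read "create them, and (re)load the data" in Python terms is to marshal the data file the object was built from, and rebuild the object on the other side. The sketch below is an assumption-laden illustration of that idea, not an OpenCV facility: `dump_file_backed`, `load_file_backed` and `_make_cascade` are hypothetical helper names, and the only OpenCV call used is the `cv2.CascadeClassifier(path)` constructor.

```python
import os
import pickle
import tempfile


def _make_cascade(xml_path):
    """Default factory; cv2 is imported lazily so the sketch loads without it."""
    import cv2
    return cv2.CascadeClassifier(xml_path)


def dump_file_backed(source_path, out_path):
    """Marshal the data file an object was loaded from, not the object itself."""
    with open(source_path, "rb") as f:
        payload = {"name": os.path.basename(source_path), "data": f.read()}
    with open(out_path, "wb") as f:
        pickle.dump(payload, f)


def load_file_backed(saved_path, factory=_make_cascade):
    """Restore the data file into a temporary directory and rebuild the object."""
    with open(saved_path, "rb") as f:
        payload = pickle.load(f)
    tmp_dir = tempfile.mkdtemp()  # not cleaned up here; left to the caller
    restored = os.path.join(tmp_dir, payload["name"])
    with open(restored, "wb") as f:
        f.write(payload["data"])
    return factory(restored)
```

This only works for objects that are fully determined by a file, which is the case for a cascade loaded from its xml; any state changed after loading would be lost.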
@berak I made the `CascadeClassifier` example because that is what we bumped into. We need to be able to potentially marshal any Python object, from one Python context to another. So the question in general is, what is the best we can do with OpenCV objects? What are the interfaces that are available to save things?
you can't. it's not a "pure python" lib, but c++ code with python wrappers
(and you can't marshal the underlying c++ objects)
@berak Ok I understand your argument and I agree completely. There is no way of saving a general C++ object, without a specific save/load interface. Since this is critical to our efforts in building a data science platform, I would like to ask further a couple of questions. I would really appreciate if you could help me in getting a better understanding of this.
I see that many OpenCV objects support the `save`/`load` interface, to save the state of an object to xml/yaml and then load it back (see for example `FaceRecognizer`: https://docs.opencv.org/2.4/modules/c...). Which objects support this interface? Why isn't it all of them? Why isn't this the case for `CascadeClassifier`?
Also, other libraries like TensorFlow provide a generic `save`/`load` interface for any tensor, model, or other objects (https://www.tensorflow.org/tutorials/...), that are actually C++ objects down the line. I understand that OpenCV does not support this generic paradigm, is there a specific reason?
Lastly, the `CascadeClassifier` example I am working on was taken from here (https://github.com/thomasgrusz/dog-br...). Specifically, they are creating a `CascadeClassifier` using the xml file `haarcascades/haarcascade_frontalface_alt.xml` (`cv2.CascadeClassifier(haarcascade_frontalface_alt.xml)`). I also see that `CascadeClassifier` implements the `load` method to load these xml files, but how are they produced in the first place? I would expect to be able to `write` as well at this point.
the cascades are trained from a suite of external programs (the CascadeClassifier is a read-only application)
similar problem with opencv's dnn -- you're expected to use some external framework, like tf, pytorch to train & save it, then load it into dnn::Net for inference
imho, the only classes that work like you expect are the cv::ml models, which produce a serialized class state (e.g. an instance `Ptr<SVM>` from the `load()` method)