Cannot use Tensorflow model with batch normalization [closed]
I have a simple convolutional network built with Keras and TensorFlow 1.14. The model is saved as a constant (frozen) graph in binary .pb format.
The model loads successfully, but the computed values are wrong after the first batch norm layer.
I am using OpenCV 3.4.
Has anyone encountered or heard of a similar problem?
If you really want to go for deep learning in OpenCV, I suggest using the latest master (4.x) branch. Fixes for these things land almost daily, so 3.4 is probably heavily outdated...
I'm getting similar problems with PyTorch -> ONNX -> dnn with 4.1.0.
A simple conv/bn/relu/pool network is highly inaccurate with a BN in it, and OK with the BN removed.
https://gist.github.com/berak/43ad415...
This solved my problem:
model.eval()
needs to be called before exporting to ONNX, to switch the model from "train" into "evaluation" mode, similar to "freezing" a TF network.
I also have a BN as the second layer, after a conv. The outputs in OpenCV are quite different from those in TensorFlow. There was something strange: the Keras BN layer is implemented by several nodes in TensorFlow, but in OpenCV I see only one layer, named fused_batchnorm.
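To illustrate why the train/evaluation switch mentioned above matters, here is a minimal pure-Python sketch of batch normalization for a single channel. It is not the PyTorch, Keras, or OpenCV implementation; the function names, the epsilon, and all numbers are made up for illustration. In training mode the layer normalizes with the statistics of the current batch, while in inference mode it uses the stored running statistics, so a model exported in the wrong mode can produce visibly different outputs.

```python
import math

EPS = 1e-5  # assumed epsilon, chosen only for this illustration


def batch_norm_training(x, gamma=1.0, beta=0.0):
    """Normalize with the mean/variance of the current batch (train-mode behaviour)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [gamma * (v - mean) / math.sqrt(var + EPS) + beta for v in x]


def batch_norm_inference(x, running_mean, running_var, gamma=1.0, beta=0.0):
    """Normalize with stored running statistics (eval-mode / frozen behaviour)."""
    return [gamma * (v - running_mean) / math.sqrt(running_var + EPS) + beta for v in x]


# Hypothetical activations of one channel and hypothetical running statistics.
batch = [1.0, 2.0, 3.0, 4.0]
running_mean, running_var = 0.5, 4.0

train_out = batch_norm_training(batch)
eval_out = batch_norm_inference(batch, running_mean, running_var)

# The two modes disagree whenever the batch statistics differ from the stored ones.
max_diff = max(abs(a - b) for a, b in zip(train_out, eval_out))
print("train mode:", train_out)
print("eval mode: ", eval_out)
print("max abs difference:", max_diff)
```

An inference engine that loads a frozen model can only apply the stored statistics, so the reference outputs it is compared against should also be computed in evaluation mode.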
Feel free to open an issue providing steps to reproduce it (attach the model). We have observed buggy Keras batch normalization several times: it does not switch between training and testing mode properly. So if the latest master or the latest 3.4 branch produces wrong results, let's investigate it together, without voodoo debugging but with reproducible reports. Thanks!
Will do. I will have to verify which version I am using and gather the relevant information.
I think there is already an issue on the topic here. Have you frozen the graph_def file as described in the comment?
Yes, I used tf.keras.backend.set_learning_phase(0) before loading the model from the Keras saved file, then used tf.graph_util.convert_variables_to_constants(...).
I added the first part of the model (up to the batchnorm) and some test data to the issue: https://github.com/opencv/opencv/issu...
I found the problem with the help of dkurt. It seems the BatchNorm was wrongly configured for the channels-first data format instead of channels-last.
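As a rough illustration of how such a data-format mismatch can go unnoticed, the sketch below applies folded batch-norm parameters (one scale/shift pair per channel) to a tiny channels-last tensor, once along the correct channel axis and once along the first axis, as a channels-first configuration would. It is plain Python, not Keras or OpenCV code; the helper name, the tensor, and the parameter values are hypothetical. When the sizes happen to match, the wrong axis raises no error and simply yields different numbers.

```python
def affine_along_axis(x, scale, shift, axis):
    """Apply y = v * scale[i] + shift[i] to a 3-D nested list,
    where i is the element's index along the given axis."""
    out = []
    for i, plane in enumerate(x):
        out_plane = []
        for j, row in enumerate(plane):
            out_row = []
            for k, v in enumerate(row):
                idx = (i, j, k)[axis]
                out_row.append(v * scale[idx] + shift[idx])
            out_plane.append(out_row)
        out.append(out_plane)
    return out


# Hypothetical 2x2 image with 2 channels, stored channels-last as (H, W, C).
image_hwc = [[[1.0, 10.0], [2.0, 20.0]],
             [[3.0, 30.0], [4.0, 40.0]]]

# Made-up folded batch-norm parameters, one pair per channel.
scale = [1.0, 0.1]
shift = [0.0, -1.0]

correct = affine_along_axis(image_hwc, scale, shift, axis=2)  # channels are the last axis
wrong = affine_along_axis(image_hwc, scale, shift, axis=0)    # treats the first axis as channels

print("channel axis = 2:", correct)
print("channel axis = 0:", wrong)
```

In Keras, the `axis` argument of `BatchNormalization` selects the feature axis (the default of -1 corresponds to channels-last), so it is worth checking that it agrees with the data format of the preceding convolution.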