opencv_dnn provides incorrect inferences after transform_graph

I am running into a problem that several other people have reported in Pull Request #9517.

I trained a MobileNet using TensorFlow's retrain.py, as described in the HackerNoon post "Creating insanely fast image classifiers with MobileNet in TensorFlow".

Once the imagery is set up, the network is trained as follows:

#!/bin/bash -xe

TF_ROOT=/home/ubuntu/src/tensorflow/tensorflow
DATA_ROOT=/home/ubuntu/data

python $TF_ROOT/examples/image_retraining/retrain.py \
    --image_dir $DATA_ROOT \
    --learning_rate=0.001 \
    --testing_percentage=20 \
    --validation_percentage=20 \
    --train_batch_size=32 \
    --validation_batch_size=-1 \
    --flip_left_right \
    --random_crop=30 \
    --random_scale=30 \
    --random_brightness=30 \
    --eval_step_interval=100 \
    --how_many_training_steps=2000 \
    --architecture mobilenet_1.0_224

The graph is then transformed using TensorFlow's transform_graph tool:

~/Development/tensorflow/bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
    --in_graph=mobilenet_1.0_224.pb \
    --out_graph=deploynet_1.0_224.pb \
    --inputs=input \
    --outputs=final_result \
    --transforms="fold_constants sort_by_execution_order remove_nodes(op=Squeeze, op=PlaceholderWithDefault)"

TensorFlow's summarize_graph tool shows the following output:

Found 1 possible inputs: (name=input, type=float(1), shape=[1,224,224,3]) 
No variables spotted.
Found 1 possible outputs: (name=final_result, op=Softmax) 
Found 4235007 (4.24M) const parameters, 0 (0) variable parameters, and 0 control_edges
Op types used: 86 Const, 28 Add, 27 Mul, 27 Relu6, 15 Conv2D, 13 DepthwiseConv2dNative, 1 AvgPool, 1 BiasAdd, 1 Identity, 1 MatMul, 1 Placeholder, 1 Reshape, 1 Softmax
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/home/wlucas/Temp/dnn/deploynet_1.0_224.pb --show_flops --input_layer=input --input_layer_type=float --input_layer_shape=1,224,224,3 --output_layer=final_result
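
As a sanity check, the transformed graph can also be run directly in TensorFlow to confirm the transform itself did not break inference. Below is a minimal sketch, assuming the TF 1.x GraphDef/Session API and the input/output names reported above (the random input is just a placeholder for a real preprocessed image):

import numpy as np
import tensorflow as tf

# Load the transformed graph produced by transform_graph
with tf.gfile.GFile('deploynet_1.0_224.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

# MobileNet expects a 1x224x224x3 float input scaled to [-1, 1];
# a random tensor stands in for a real preprocessed image here.
img = np.random.uniform(-1.0, 1.0, (1, 224, 224, 3)).astype(np.float32)

with tf.Session(graph=graph) as sess:
    probs = sess.run('final_result:0', feed_dict={'input:0': img})
    print(probs)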

Testing inference with OpenCV's DNN module after these adjustments almost always yields ~99% confidence for the second class, regardless of which class is actually presented (training imagery being the exception), whereas TensorFlow Mobile classifies the same images correctly.
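
For reference, my OpenCV test harness looks roughly like the sketch below (the image path is a placeholder, and the preprocessing constants are my assumptions, following the standard MobileNet convention of scaling pixels to [-1, 1]):

import cv2 as cv
import numpy as np

# Load the transformed TensorFlow graph into the OpenCV DNN module
net = cv.dnn.readNetFromTensorflow('deploynet_1.0_224.pb')

# Preprocess: resize to 224x224, swap BGR->RGB, and map pixels to
# [-1, 1] via (pixel - 127.5) / 127.5, matching MobileNet's convention
img = cv.imread('test.jpg')
blob = cv.dnn.blobFromImage(img, scalefactor=1.0 / 127.5, size=(224, 224),
                            mean=(127.5, 127.5, 127.5), swapRB=True, crop=False)

net.setInput(blob)
probs = net.forward()
print('predicted class:', np.argmax(probs), 'confidence:', probs.max())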

Any thoughts or help would be greatly appreciated!