Q: TFLite inference is much slower than Keras model inference

Stack Overflow user

Asked 2021-03-02 17:18:54

1 answer, 856 views, 0 followers, 1 vote

I converted my Keras model to TFLite. This is how I converted the model:

from keras import backend as K
from keras.models import load_model
from keras.engine.base_layer import Layer
import tensorflow as tf
# This line must be executed before loading Keras model.
K.set_learning_phase(0)

# custom layer
class Mish(Layer):
    '''
    Mish Activation Function.
    .. math::
        mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + e^{x}))
    Shape:
        - Input: Arbitrary. Use the keyword argument `input_shape`
        (tuple of integers, does not include the samples axis)
        when using this layer as the first layer in a model.
        - Output: Same shape as the input.
    Examples:
        >>> X_input = Input(input_shape)
        >>> X = Mish()(X_input)
    '''

    def __init__(self, **kwargs):
        super(Mish, self).__init__(**kwargs)
        self.supports_masking = True

    def call(self, inputs):
        # return inputs * K.tanh(K.softplus(inputs))
        # return inputs * tf.tanh(tf.log(1 + tf.exp(inputs)))
        return inputs * K.tanh(K.log(1 + K.exp(inputs)))

    def get_config(self):
        config = super(Mish, self).get_config()
        return config

    def compute_output_shape(self, input_shape):
        return input_shape

model = load_model('./keras_model/yolo4.h5', custom_objects={"Mish":Mish})


def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    from tensorflow.python.framework.graph_util import convert_variables_to_constants
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        # Graph -> GraphDef ProtoBuf
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                        output_names, freeze_var_names)
        return frozen_graph

frozen_graph = freeze_session(K.get_session(),
                              output_names=[out.op.name for out in model.outputs])


tf.train.write_graph(frozen_graph, "frozen", "tf_model_l0.pb", as_text=False)

converter = tf.lite.TFLiteConverter.from_frozen_graph('frozen/tf_model_l0.pb', 
            input_arrays=['input_1'], 
            output_arrays=["conv2d_110/BiasAdd","conv2d_102/BiasAdd","conv2d_94/BiasAdd"]  
        )

tfmodel = converter.convert() 
open ("model5.tflite" , "wb").write(tfmodel)

The above is the conversion script. At inference time I apply the same preprocessing that I use for Keras inference. This is the TFLite inference code:

import cv2
import numpy as np
import tensorflow as tf

model_path = "model5.tflite"  # file written by the conversion script above

# load tflite model
babyNet_lite = tf.lite.Interpreter(model_path=model_path)
# allocate tensors
babyNet_lite.allocate_tensors()

input_details = babyNet_lite.get_input_details()
output_details = babyNet_lite.get_output_details()

# image reading: BGR -> RGB, resize to 416x416, scale to [0, 1], add batch axis
img = cv2.imread("test.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (416, 416))
img = img.astype(np.float32) / 255.
img = np.expand_dims(img, axis=0)

babyNet_lite.set_tensor(input_details[0]['index'], img)
# run the inference
babyNet_lite.invoke()
# output data: one tensor per output head
outs = []
outs.append(babyNet_lite.get_tensor(output_details[0]['index']))
outs.append(babyNet_lite.get_tensor(output_details[1]['index']))
outs.append(babyNet_lite.get_tensor(output_details[2]['index']))

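I get accurate results with TFLite, but processing one frame takes a long time. With the Keras model, inference takes 1.0110 seconds per frame. With TFLite it now takes 7.560 seconds per frame.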
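Per-frame figures like these depend on how the timing is taken. The following is a minimal sketch, not the asker's measurement code: `time_inference` and `dummy_workload` are hypothetical names introduced here. It discards a few warm-up calls and reports the median and best wall-clock time per call, so the Keras and TFLite numbers can be compared under the same conditions.

```python
import statistics
import time


def time_inference(run_once, warmup=2, runs=10):
    """Return (median, minimum) wall-clock seconds per call of run_once()."""
    # Warm-up calls are discarded: the first invocations often include
    # one-off costs such as memory allocation or lazy initialization.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), min(samples)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs on its own; replace it with the
    # real call being measured.
    def dummy_workload():
        sum(i * i for i in range(100_000))

    median_s, best_s = time_inference(dummy_workload)
    print(f"median {median_s:.4f} s/frame, best {best_s:.4f} s/frame")
```

To time the code in the question, pass `babyNet_lite.invoke` for the TFLite interpreter and something like `lambda: model.predict(img)` for the Keras model, after the input tensor has been set.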

After that, I quantized the model to float16 with the code below.

# float16
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_fp16_model = converter.convert()
tflite_model_fp16_file = "model_quant_f16.tflite"
open (tflite_model_fp16_file , "wb").write(tflite_fp16_model)

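Then I checked the inference time again. It is now about 2.100 seconds per frame. The model size dropped from 256 MB to 128 MB, and the accuracy is unchanged. However, inference is still slower than with the Keras model. Where did I go wrong?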
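The halving in file size is what float16 weight storage would predict. As a rough worked example, assuming the 256 MB file is almost entirely float32 weights (the parameter count below is derived from that assumption, not measured, and `weights_size_mb` is a hypothetical helper):

```python
FLOAT32_BYTES = 4
FLOAT16_BYTES = 2


def weights_size_mb(num_params, bytes_per_param):
    """Size in MB (10**6 bytes) of num_params weights at the given width."""
    return num_params * bytes_per_param / 1e6


# Assumption: the 256 MB float32 model consists almost entirely of weights,
# which implies roughly 64 million parameters.
approx_params = 256e6 / FLOAT32_BYTES

fp32_mb = weights_size_mb(approx_params, FLOAT32_BYTES)
fp16_mb = weights_size_mb(approx_params, FLOAT16_BYTES)
print(f"float32: {fp32_mb:.0f} MB, float16: {fp16_mb:.0f} MB")
```

This only accounts for storage; it says nothing about how fast the model runs on a CPU.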

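I don't know what mistake I made. My Keras model runs inference at about 1 second per frame, while the same model converted to TFLite takes about 2 seconds per frame. I am using a CPU-only system. The TensorFlow version is 1.15.2 and the Keras version is 2.3.1. Converting to TFLite gave me no inference speedup.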

tensorflow

machine-learning

keras

deep-learning

tensorflow-lite

1 Answer

Stack Overflow user

Answered 2022-07-28 12:46:37

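I know this doesn't directly answer your question, but if you are looking for a faster way to run inference, I suggest trying OpenVINO. OpenVINO is optimized for Intel hardware, but it should work with any CPU. It improves inference performance by, for example, pruning the graph or fusing some operations together. Performance benchmarks for Keras/TensorFlow models are published.

A full tutorial on converting a Keras model is available. Some snippets are below.

Install OpenVINO

The easiest way is to use pip. Alternatively, you can use the installation selector tool to find the best option for your setup.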

pip install openvino-dev[tensorflow2]

Save your model as a SavedModel

OpenVINO cannot convert an HDF5 model, so you must first save it as a SavedModel.

import tensorflow as tf
from custom_layer import CustomLayer
model = tf.keras.models.load_model('model.h5', custom_objects={'CustomLayer': CustomLayer})
tf.saved_model.save(model, 'model')

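Convert the SavedModel with Model Optimizer

Model Optimizer is a command-line tool from the OpenVINO development package. It converts the TensorFlow model to IR, which is OpenVINO's default format. You can also try FP16 precision, which should give better performance, usually without a noticeable loss of accuracy (just change data_type). Run in the command line: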

mo --saved_model_dir "model" --input_shape "[1, 3, 224, 224]" --data_type FP32 --output_dir "model_ir"

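Run inference

The converted model can be loaded by the runtime and compiled for a specific device, e.g. CPU or GPU (the GPU integrated into the CPU, such as Intel HD Graphics). If you don't know which option is best for you, just use AUTO.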

from openvino.runtime import Core

# Load the network
ie = Core()
model_ir = ie.read_model(model="model_ir/model.xml")
compiled_model_ir = ie.compile_model(model=model_ir, device_name="CPU")

# Get output layer
output_layer_ir = compiled_model_ir.output(0)

# Run inference on the input image (input_image is the preprocessed array)
result = compiled_model_ir([input_image])[output_layer_ir]

Disclaimer: I work on OpenVINO.

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/66444101
