Quantization-aware training in TensorFlow lets me quantize individual layers with different quantization configurations using tensorflow_model_optimization.quantization.keras.quantize_annotate_layer. I would like to achieve a similar effect on a model that has already been trained.
TensorFlow's post-training quantization documentation gives the following example, which quantizes a model to float16:
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_quant_model = converter.convert()
However, I believe this quantizes the activations and weights of every layer in the model. After training and saving the model, is there a way to select only certain tensorflow.keras.Layer instances for quantization?
Posted on 2022-10-26 07:23:46
You can consult this TensorFlow page, which walks through the exact steps for quantizing specific layers:
https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide.md
But here is an example:
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Create a base model
base_model = setup_model()
base_model.load_weights(pretrained_weights) # optional but recommended for model accuracy
# Helper function uses `quantize_annotate_layer` to annotate that only the
# Dense layers should be quantized.
def apply_quantization_to_dense(layer):
    if isinstance(layer, tf.keras.layers.Dense):
        return tfmot.quantization.keras.quantize_annotate_layer(layer)
    return layer
# Use `tf.keras.models.clone_model` to apply `apply_quantization_to_dense`
# to the layers of the model.
annotated_model = tf.keras.models.clone_model(
    base_model,
    clone_function=apply_quantization_to_dense,
)
# Now that the Dense layers are annotated,
# `quantize_apply` actually makes the model quantization aware.
quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
quant_aware_model.summary()
A practical example:
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Use `quantize_annotate_layer` to annotate that the `Dense` layer
# should be quantized.
i = tf.keras.Input(shape=(20,))
x = tfmot.quantization.keras.quantize_annotate_layer(tf.keras.layers.Dense(10))(i)
o = tf.keras.layers.Flatten()(x)
annotated_model = tf.keras.Model(inputs=i, outputs=o)
# Use `quantize_apply` to actually make the model quantization aware.
quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
# For deployment purposes, the tool adds `QuantizeLayer` after `InputLayer` so that the
# quantized model can take in float inputs instead of only uint8.
quant_aware_model.summary()

Source: https://stackoverflow.com/questions/74055856
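Note that quantize_apply only makes the model quantization aware; to get an actually quantized deployable model, you still fine-tune it and then run it through the TFLite converter. A minimal sketch of the conversion step, using a plain stand-in Keras model in place of the quantize_apply output (the converter call is the same either way):

```python
import tensorflow as tf

# Stand-in for the quantization-aware model; in practice, pass the
# fine-tuned model returned by `tfmot.quantization.keras.quantize_apply`.
quant_aware_model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(10),
])

# Convert to TFLite; Optimize.DEFAULT enables the default optimizations,
# which for a quantization-aware model produces quantized weights.
converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()  # serialized flatbuffer as bytes
```

The resulting bytes can be written to a .tflite file and loaded with tf.lite.Interpreter.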