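I am fairly new to deep learning and image classification. I want to use VGG16 to extract features from images and feed them to a ViT model as input. Here is my code: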
import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow.keras.applications.vgg16 import VGG16
from vit_keras import vit

# IMAGE_SIZE is assumed to be defined earlier in the script.

# Frozen VGG16 backbone used as a feature extractor.
vgg_model = VGG16(include_top=False, weights='imagenet', input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
for layer in vgg_model.layers:
    layer.trainable = False

# Pretrained ViT-B/16 without its classification head.
vit_model = vit.vit_b16(
    image_size=IMAGE_SIZE,
    activation='sigmoid',
    pretrained=True,
    include_top=False,
    pretrained_top=False,
    classes=2)

# VGG16 output is passed straight into the ViT model.
model = tf.keras.Sequential([
    vgg_model,
    vit_model,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tfa.activations.gelu),
    tf.keras.layers.Dense(256, activation=tfa.activations.gelu),
    tf.keras.layers.Dense(64, activation=tfa.activations.gelu),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1, 'sigmoid')
],
name='vision_transformer')
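model.summary()

However, I get the following error:

ValueError: Input 0 of layer embedding is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape (None, 8, 8, 512)

I assume the error comes from chaining VGG16 and ViT. How can I fix it?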
Posted on 2022-03-02 15:57:39
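You cannot feed the output of the VGG16 model into vit_model, because both models expect an input of shape (224, 224, 3), or whatever shape you defined, whereas the VGG16 model produces an output of shape (8, 8, 512). You could try to resample, reshape, or resize that output to fit the expected shape, but I would not recommend it. Instead, feed the same input to both models and concatenate their results. Here is a working example: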
import tensorflow as tf
import tensorflow_addons as tfa
from vit_keras import vit

IMAGE_SIZE = 224

# Frozen VGG16 backbone used as a feature extractor.
vgg_model = tf.keras.applications.vgg16.VGG16(include_top=False, weights='imagenet', input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
for layer in vgg_model.layers:
    layer.trainable = False

# Pretrained ViT-B/16 without its classification head.
vit_model = vit.vit_b16(
    image_size=IMAGE_SIZE,
    activation='sigmoid',
    pretrained=True,
    include_top=False,
    pretrained_top=False,
    classes=2)

# Both models receive the same image tensor.
inputs = tf.keras.layers.Input((IMAGE_SIZE, IMAGE_SIZE, 3))
vgg_output = tf.keras.layers.Flatten()(vgg_model(inputs))
vit_output = vit_model(inputs)

# Join the two feature vectors, then classify.
x = tf.keras.layers.Concatenate(axis=-1)([vgg_output, vit_output])
x = tf.keras.layers.Dense(512, activation=tfa.activations.gelu)(x)
x = tf.keras.layers.Dense(256, activation=tfa.activations.gelu)(x)
x = tf.keras.layers.Dense(64, activation=tfa.activations.gelu)(x)
x = tf.keras.layers.BatchNormalization()(x)
outputs = tf.keras.layers.Dense(1, 'sigmoid')(x)
model = tf.keras.Model(inputs, outputs)
model.summary()  # summary() prints on its own and returns None

https://stackoverflow.com/questions/71324609
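
As a rough sanity check on the shapes discussed in this thread, the arithmetic can be sketched without loading any model. This is only an illustration under stated assumptions: VGG16 without its top halves the spatial size five times (once per max-pool stage) and ends with 512 channels, and vit_b16 with include_top=False is assumed to return a 768-wide feature vector. The function names are hypothetical helpers, not part of Keras or vit_keras.

```python
# Shape arithmetic only; no TensorFlow is needed to run this.
VGG16_NUM_POOLS = 5        # VGG16 has five 2x2 max-pool stages
VGG16_OUT_CHANNELS = 512   # channels of the last convolutional block
VIT_B16_HIDDEN = 768       # assumed width of the vit_b16 output without its top


def vgg16_feature_shape(image_size):
    """Shape of the VGG16(include_top=False) output for a square RGB input."""
    side = image_size // (2 ** VGG16_NUM_POOLS)
    return (side, side, VGG16_OUT_CHANNELS)


def concatenated_width(image_size):
    """Width of the vector after flattening the VGG branch and concatenating."""
    height, width, channels = vgg16_feature_shape(image_size)
    return height * width * channels + VIT_B16_HIDDEN


print(vgg16_feature_shape(256))  # (8, 8, 512), the shape in the error message
print(vgg16_feature_shape(224))  # (7, 7, 512)
print(concatenated_width(224))   # 25856
```

Under these assumptions, the (8, 8, 512) in the error message suggests the question used IMAGE_SIZE = 256. The ViT embedding layer expects 3 channels in the last axis, so a 512-channel feature map cannot be passed to it directly.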