I have a saved model that I trained on a small corpus of text (message) data, and I am trying to use that same model to predict positive or negative sentiment (i.e. binary classification) on another corpus. I built the NLP model following the Google guide, which you can view here if it helps (I used Option A).
I keep getting an input-shape error, which I understand means I have to reshape the input to match the expected shape. However, the data I want to predict on simply is not that large. The error is:
ValueError: Error when checking input: expected dropout_8_input to have shape (519,) but got array with shape (184,)
The model expects shape (519,) because during training, the corpus fed into the first Dropout layer (in TfidfVectorized form) had print(x_train.shape) # (454, 519).
I am not new to ML, but I had assumed that once the model is tuned, any data I try to predict on must have the same shape as the data used to train the model. Has anyone run into a similar problem? Am I missing something about how to train the model so that it can predict on inputs of different sizes? Or am I misunderstanding how to use the model for class prediction?
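For context, the usual cause of this mismatch with TF-IDF features is fitting a fresh vectorizer on the prediction corpus instead of reusing the one fitted on the training corpus: a fitted vectorizer fixes the vocabulary, and therefore the feature width. A minimal toy sketch of the fit/transform distinction (`ToyVectorizer` is a hypothetical stand-in for the sklearn `TfidfVectorizer` the guide's `ngram_vectorize` wraps):

```python
# Toy count vectorizer illustrating why re-fitting changes the feature width.
class ToyVectorizer:
    def fit(self, texts):
        # Fitting fixes the vocabulary, and with it the feature width.
        self.vocab = sorted({w for t in texts for w in t.split()})
        return self

    def transform(self, texts):
        # Count occurrences over the *fitted* vocabulary only.
        return [[t.split().count(w) for w in self.vocab] for t in texts]

train = ["good movie", "bad movie", "great plot"]   # vocab of 5 words
new = ["awful acting but a good plot"]              # 6 distinct words

vec = ToyVectorizer().fit(train)
x_train = vec.transform(train)
x_new = vec.transform(new)        # same width as the training features
assert len(x_train[0]) == len(x_new[0]) == 5

wrong = ToyVectorizer().fit(new).transform(new)  # re-fitting: width 6, mismatch
print(len(x_new[0]), len(wrong[0]))              # 5 6
```

The same pattern applies to the real vectorizer: persist it alongside the model and call only its `transform` on new text.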
I trained the model using the following functions:
from tensorflow.python.keras import models
from tensorflow.python.keras.layers import Dense, Dropout

def mlp_model(layers, units, dropout_rate, input_shape, num_classes):
    """Creates an instance of a multi-layer perceptron model.

    # Arguments
        layers: int, number of `Dense` layers in the model.
        units: int, output dimension of the layers.
        dropout_rate: float, percentage of input to drop at Dropout layers.
        input_shape: tuple, shape of input to the model.
        num_classes: int, number of output classes.

    # Returns
        An MLP model instance.
    """
    op_units, op_activation = _get_last_layer_units_and_activation(num_classes)
    model = models.Sequential()
    model.add(Dropout(rate=dropout_rate, input_shape=input_shape))
    for _ in range(layers - 1):
        model.add(Dense(units=units, activation='relu'))
        model.add(Dropout(rate=dropout_rate))
    model.add(Dense(units=op_units, activation=op_activation))
    return model
import tensorflow as tf

def train_ngram_model(data,
                      learning_rate=1e-3,
                      epochs=1000,
                      batch_size=128,
                      layers=2,
                      units=64,
                      dropout_rate=0.2):
    """Trains n-gram model on the given dataset.

    # Arguments
        data: tuples of training and test texts and labels.
        learning_rate: float, learning rate for training model.
        epochs: int, number of epochs.
        batch_size: int, number of samples per batch.
        layers: int, number of `Dense` layers in the model.
        units: int, output dimension of Dense layers in the model.
        dropout_rate: float, percentage of input to drop at Dropout layers.

    # Raises
        ValueError: If validation data has label values which were not seen
            in the training data.

    # Reference
        For tuning hyperparameters, please visit the following page for
        further explanation of each argument:
        https://developers.google.com/machine-learning/guides/text-classification/step-5
    """
    # Get the data.
    (train_texts, train_labels), (val_texts, val_labels) = data

    # Verify that validation labels are in the same range as training labels.
    num_classes = get_num_classes(train_labels)
    unexpected_labels = [v for v in val_labels if v not in range(num_classes)]
    if len(unexpected_labels):
        raise ValueError('Unexpected label values found in the validation set:'
                         ' {unexpected_labels}. Please make sure that the '
                         'labels in the validation set are in the same range '
                         'as training labels.'.format(
                             unexpected_labels=unexpected_labels))

    # Vectorize texts.
    x_train, x_val = ngram_vectorize(
        train_texts, train_labels, val_texts)

    # Create model instance.
    model = mlp_model(layers=layers,
                      units=units,
                      dropout_rate=dropout_rate,
                      input_shape=x_train.shape[1:],
                      num_classes=num_classes)

    # num_classes determines which loss/activation to use.
    # Compile model with learning parameters.
    if num_classes == 2:
        loss = 'binary_crossentropy'
    else:
        loss = 'sparse_categorical_crossentropy'
    optimizer = tf.keras.optimizers.Adam(lr=learning_rate)
    model.compile(optimizer=optimizer, loss=loss, metrics=['acc'])

    # Create callback for early stopping on validation loss. If the loss does
    # not decrease in two consecutive tries, stop training.
    callbacks = [tf.keras.callbacks.EarlyStopping(
        monitor='val_loss', patience=2)]

    # Train and validate model.
    history = model.fit(
        x_train,
        train_labels,
        epochs=epochs,
        callbacks=callbacks,
        validation_data=(x_val, val_labels),
        verbose=2,  # Logs once per epoch.
        batch_size=batch_size)

    # Print results.
    history = history.history
    print('Validation accuracy: {acc}, loss: {loss}'.format(
        acc=history['val_acc'][-1], loss=history['val_loss'][-1]))

    # Save model.
    model.save('MCTR2.h5')
    return history['val_acc'][-1], history['val_loss'][-1]

From this, I get the following model architecture:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dropout (Dropout) (None, 519) 0
_________________________________________________________________
dense (Dense) (None, 64) 33280
_________________________________________________________________
dropout_1 (Dropout) (None, 64) 0
_________________________________________________________________
dense_1 (Dense) (None, 1) 65
=================================================================
Total params: 33,345
Trainable params: 33,345
Non-trainable params: 0
_________________________________________________________________

Posted on 2019-11-15 17:24:52
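As a sanity check, the parameter counts in the summary are consistent with the 519-wide TF-IDF input: a Dense layer holds inputs × units weights plus one bias per unit, which is exactly why the first layer's shape is baked in at training time.

```python
# Dense layer parameters = inputs * units + units (one bias per unit).
assert 519 * 64 + 64 == 33280   # dense (Dense): 519-dim input -> 64 units
assert 64 * 1 + 1 == 65         # dense_1 (Dense): 64-dim input -> 1 unit
assert 33280 + 65 == 33345      # total trainable parameters
print("parameter counts check out")
```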
To make a dimension variable in TensorFlow, you need to specify it as None.
The first dimension is the batch size, which is why it is usually always None; a batch of sequence data will typically have shape (batch_size, sequence_length, num_features). So a single sequence is usually 2D with a variable length, but a fixed number of features per "token".
It looks like you are feeding your model 1D vectors, and Dense layers have a fixed input shape. If you want to model variable-length sequences, you have to build the model from layers that can accommodate this (e.g. convolutions, LSTMs).
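A minimal sketch of that suggestion (the layer choices and sizes here are illustrative, not from the original post): declaring the sequence-length dimension as None lets the same convolution weights apply to any length, and a global pooling layer collapses the result to a fixed-size vector for the final Dense layer.

```python
import numpy as np
import tensorflow as tf

num_features = 16  # fixed per-token feature size; the time dimension is free

model = tf.keras.Sequential([
    # input_shape=(None, num_features): variable sequence length.
    tf.keras.layers.Conv1D(32, kernel_size=3, padding='same',
                           activation='relu',
                           input_shape=(None, num_features)),
    tf.keras.layers.GlobalMaxPooling1D(),   # any length -> one 32-dim vector
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Batches with different sequence lengths pass through the same model.
short = np.random.rand(4, 10, num_features).astype('float32')
longer = np.random.rand(4, 50, num_features).astype('float32')
print(model(short).shape, model(longer).shape)  # both (4, 1)
```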
https://stackoverflow.com/questions/58881719