
Incompatible shapes on the last batch with tf.data.Dataset.from_tensor_slices

Stack Overflow user
Asked on 2020-06-16 05:20:22
1 answer · 146 views · 0 followers · 1 vote

I have implemented a seq2seq translation model in TensorFlow 2.0, but during training I get the following error:

ValueError: Shapes (2056, 10, 10000) and (1776, 10, 10000) are incompatible

My dataset has 10000 records. The dimensions match from the first record through record 8224 (four full batches of 2056), but for the last 1776 records I get the error above, simply because my batch_size is larger than the number of remaining records. Below is my code:

max_seq_len_output = 10
n_words = 10000
batch_size = 2056

model = Model_translation(batch_size=batch_size, embed_size=embed_size,
                          total_words=n_words, dropout_rate=dropout_rate,
                          num_classes=n_words, embedding_matrix=embedding_matrix)
dataset_train = tf.data.Dataset.from_tensor_slices((encoder_input, decoder_input, decoder_output))
dataset_train = dataset_train.shuffle(buffer_size=1024).batch(batch_size)


loss_object = tf.keras.losses.CategoricalCrossentropy()#used in backprop
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

train_loss = tf.keras.metrics.Mean(name='train_loss')#mean of the losses per observation
train_accuracy =tf.keras.metrics.CategoricalAccuracy(name='train_accuracy')


##### no @tf.function here 
def training(X_1,X_2,y):
    #build the one-hot encoding per batch; doing it once outside the loop would exhaust RAM
    y_numpy = y.numpy()
    Y = np.zeros((batch_size,max_seq_len_output,n_words),dtype='float32')
    for i, d in enumerate(y_numpy):
        for t, word in enumerate(d):
            if word != 0:
                Y[i, t, word] = 1

    Y = tf.convert_to_tensor(Y)
    #predictions
    with tf.GradientTape() as tape:#Trainable variables (created by tf.Variable or tf.compat.v1.get_variable, where trainable=True is default in both cases) are automatically watched. 
        predictions =  model(X_1,X_2)
        loss = loss_object(Y,predictions)

    gradients = tape.gradient(loss,model.trainable_variables)
    optimizer.apply_gradients(zip(gradients,model.trainable_variables))
    train_loss(loss) 
    train_accuracy(Y,predictions)
    del Y
    del y_numpy


EPOCHS = 70

for epoch in range(EPOCHS):
    for X_1, X_2, y in dataset_train:
        training(X_1, X_2, y)
    template = 'Epoch {}, Loss: {}, Accuracy: {}'
    print(template.format(epoch + 1, train_loss.result(), train_accuracy.result() * 100))
    # Reset the metrics for the next epoch
    train_loss.reset_states()
    train_accuracy.reset_states()
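The failing shapes can be predicted from the batch arithmetic alone; a minimal sketch (pure Python, no TensorFlow needed):

```python
n_records = 10000   # records in the dataset
batch_size = 2056

full_batches, remainder = divmod(n_records, batch_size)
print(full_batches, remainder)  # 4 full batches, then a final batch of 1776

# Y is always allocated as (batch_size, 10, 10000), but the model's output
# for the last batch is (1776, 10, 10000) -- exactly the two shapes
# reported in the ValueError.
```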

How can I solve this problem?


1 answer

Stack Overflow user

Answer accepted

Posted on 2020-06-16 06:06:59

One solution is to drop the remainder when batching:

dataset_train = dataset_train.shuffle(buffer_size = 1024).batch(batch_size, drop_remainder=True)
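Alternatively, if discarding the last 1776 records is undesirable, the one-hot target can be sized from the batch actually received rather than the fixed batch_size. A sketch of that idea (using a small demo batch instead of the full 1776 rows to keep memory modest; `one_hot_targets` is a hypothetical helper, not part of the original code):

```python
import numpy as np

max_seq_len_output = 10
n_words = 10000

def one_hot_targets(y_numpy):
    # size the first axis from the batch actually received,
    # so the final, smaller batch no longer mismatches
    Y = np.zeros((y_numpy.shape[0], max_seq_len_output, n_words), dtype='float32')
    for i, d in enumerate(y_numpy):
        for t, word in enumerate(d):
            if word != 0:
                Y[i, t, word] = 1
    return Y

# simulate a final batch of only 5 rows instead of 2056
Y = one_hot_targets(np.ones((5, max_seq_len_output), dtype='int64'))
print(Y.shape)  # (5, 10, 10000)
```

Note that the model itself must also avoid hard-coding batch_size for this approach to work end to end.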
1 vote
Original content provided by Stack Overflow; translation supported by Tencent Cloud.
Original link: https://stackoverflow.com/questions/62397236