Q: Adapting a numeric TensorFlow dataset for a TextVectorization layer

Stack Overflow user

Asked 2022-01-17 18:46:08

1 answer, 86 views, 0 followers, 1 vote

Consider the following code:

import numpy as np
import tensorflow as tf

simple_data_samples = np.array([
         [1, 1, 1, -1, -1],
         [2, 2, 2, -2, -2],
         [3, 3, 3, -3, -3],
         [4, 4, 4, -4, -4],
         [5, 5, 5, -5, -5],
         [6, 6, 6, -6, -6],
         [7, 7, 7, -7, -7],
         [8, 8, 8, -8, -8],
         [9, 9, 9, -9, -9],
         [10, 10, 10, -10, -10],
         [11, 11, 11, -11, -11],
         [12, 12, 12, -12, -12],
])

def timeseries_dataset_multistep_combined(features, label_slice, input_sequence_length, output_sequence_length, batch_size):
    feature_ds = tf.keras.preprocessing.timeseries_dataset_from_array(features, None, input_sequence_length + output_sequence_length, batch_size=batch_size)

    def split_feature_label(x):
        x=tf.strings.as_string(x)

        return x[:, :input_sequence_length, :], x[:, input_sequence_length:, label_slice]

    feature_ds = feature_ds.map(split_feature_label)

    return feature_ds

ds = timeseries_dataset_multistep_combined(
    simple_data_samples,
    slice(None, None, None),
    input_sequence_length=4,
    output_sequence_length=2,
    batch_size=1,
)

def print_dataset(ds):
    for inputs, targets in ds:
        print("---Batch---")
        print("Feature:", inputs.numpy())
        print("Label:", targets.numpy())
        print("")



print_dataset(ds)

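The TensorFlow dataset "ds" consists of inputs and targets. I would like to adapt TextVectorization layers to the inputs and to the targets. The hypothetical code below shows what I want to achieve: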

input_vectorization = layers.TextVectorization(
    max_tokens=20,
    output_mode="int",
    output_sequence_length=6,
)
target_vectorization = layers.TextVectorization(
    max_tokens=20,
    output_mode="int",
    output_sequence_length=6 + 1
)

input_vectorization.adapt(ds.input)
target_vectorization.adapt(ds.target)

Any idea how to code this using the example above?

Tags: python, tensorflow, tensorflow2.0, tensorflow-datasets

1 Answer

Stack Overflow user

Accepted answer

Answered 2022-01-17 20:00:30

If I understood you correctly, you can use your existing dataset with the TextVectorization layers as follows:

import tensorflow as tf

input_vectorization = tf.keras.layers.TextVectorization(
    max_tokens=20,
    output_mode="int",
    output_sequence_length=6,
)
target_vectorization = tf.keras.layers.TextVectorization(
    max_tokens=20,
    output_mode="int",
    output_sequence_length=6 + 1
)

# Keep only the inputs and flatten each batch into a 1-D tensor of strings
inputs = ds.map(lambda x, y: tf.reshape(x, [-1]))

# Keep only the targets and flatten each batch into a 1-D tensor of strings
targets = ds.map(lambda x, y: tf.reshape(y, [-1]))

input_vectorization.adapt(inputs)
target_vectorization.adapt(targets)
print(input_vectorization.get_vocabulary())
print(target_vectorization.get_vocabulary())

Output:

['', '[UNK]', '7', '6', '5', '4', '8', '3', '9', '2', '10', '1']
['', '[UNK]', '9', '8', '7', '6', '11', '10', '5', '12']

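Note that the adapt function only builds a vocabulary from the inputs, and each word in the vocabulary is mapped to a unique integer value. Also, because the TextVectorization layer defaults to standardize='lower_and_strip_punctuation', the minus signs are removed when adapt is called. If needed, you can avoid this behavior by setting, for example, standardize='lower'.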
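To make the effect of the standardize setting concrete, here is a minimal self-contained sketch. It does not use the dataset from the question: the `tokens` tensor is a hypothetical stand-in for one flattened element, and the names `default_layer` and `signed_layer` are made up for illustration. It compares the vocabulary built with the default standardization against one built with standardize='lower', then calls the adapted layer to show the integer mapping.

```python
import tensorflow as tf

# Hypothetical stand-in for one flattened dataset element:
# a 1-D tensor of number strings, some with a minus sign.
tokens = tf.constant(["1", "1", "-1", "2", "-2", "2"])

# Default standardization lowercases and strips punctuation,
# so "-1" is counted as "1" when the vocabulary is built.
default_layer = tf.keras.layers.TextVectorization(output_mode="int")
default_layer.adapt(tokens)
default_vocab = default_layer.get_vocabulary()

# standardize="lower" only lowercases, so the minus sign is kept.
signed_layer = tf.keras.layers.TextVectorization(
    standardize="lower",
    output_mode="int",
    output_sequence_length=3,
)
signed_layer.adapt(tokens)
signed_vocab = signed_layer.get_vocabulary()

print("default:", default_vocab)
print("lower only:", signed_vocab)

# Calling an adapted layer maps each string to its vocabulary index
# and pads every row with 0 up to output_sequence_length.
ids = signed_layer(tf.constant(["-1", "2"]))
print(ids.numpy())
```

Each input string here is a single token, so every output row holds one vocabulary index followed by padding zeros. If the sign of the values matters for the model, the second configuration is the one that preserves it.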

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/70746137
