ai-platform: No eval folder or export folder in outputs when running a TensorFlow 2.1 training job with an estimator

Stack Overflow user

Asked 2020-06-12 03:31:03

Answers: 1 | Views: 307 | Followers: 0 | Votes: 3

Question

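My code works locally, but after upgrading to TensorFlow 2.1 I cannot get any evaluation data or exports from my TensorFlow estimator when I submit an online training job. Here is the bulk of my code: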

import tensorflow as tf

def build_estimator(model_dir, config):
    # Linear binary classifier trained with the FTRL optimizer.
    return tf.estimator.LinearClassifier(
        feature_columns=feature_columns,
        n_classes=2,
        optimizer=tf.keras.optimizers.Ftrl(
            learning_rate=args.learning_rate,
            l1_regularization_strength=args.l1_strength
        ),
        model_dir=model_dir,
        config=config
    )

run_config = tf.estimator.RunConfig(save_checkpoints_steps=100,
                                    save_summary_steps=100)  
...

estimator = build_estimator(model_dir=args.job_dir, config=run_config)

...

def serving_input_fn():
    inputs = {
        'feature1': tf.compat.v1.placeholder(shape=None, dtype=tf.string),
        'feature2': tf.compat.v1.placeholder(shape=None, dtype=tf.string),
        'feature3': tf.compat.v1.placeholder(shape=None, dtype=tf.string),
        ...
    }

    split_features = {}

    for feature in inputs:
        split_features[feature] = tf.strings.split(inputs[feature], "||").to_sparse()

    return tf.estimator.export.ServingInputReceiver(features=split_features, receiver_tensors=inputs)

exporter_cls = tf.estimator.LatestExporter('predict', serving_input_fn)

eval_spec = tf.estimator.EvalSpec(
    input_fn=lambda: input_eval_fn(args.test_dir),
    exporters=[exporter_cls],
    start_delay_secs=10,
    throttle_secs=0)

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

If I run it with the local gcloud command, it works fine and I get my /eval and /export folders:

gcloud ai-platform local train \
--package-path trainer \
--module-name trainer.task \
-- \
--train-dir $TRAIN_DATA \
--test-dir $TEST_DATA \
--training-steps $TRAINING_STEPS \
--job-dir $OUTPUT
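To confirm what a local run actually produced, a small check along these lines can report which of the expected subfolders exist under the job directory. This is only a sketch: the helper name `find_output_dirs` is hypothetical, and it assumes the job directory is a local path (it does not handle gs:// URIs).

```python
import os
import tempfile


def find_output_dirs(job_dir, expected=("eval", "export")):
    """Map each expected subfolder name to whether it exists under job_dir."""
    return {name: os.path.isdir(os.path.join(job_dir, name)) for name in expected}


if __name__ == "__main__":
    # Demonstration on a throwaway directory that contains only "eval".
    with tempfile.TemporaryDirectory() as job_dir:
        os.mkdir(os.path.join(job_dir, "eval"))
        print(find_output_dirs(job_dir))
```

Running the same check against the local $OUTPUT directory after training shows at a glance whether evaluation and export both ran.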

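However, when I run it in the cloud, I do not get my /eval and /export folders. This only started happening after the upgrade to 2.1. Previously, with 1.14, everything worked fine.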

    gcloud ai-platform jobs submit training $JOB_NAME \
    --job-dir $OUTPUT_PATH \
    --staging-bucket gs://$STAGING_BUCKET_NAME \
    --runtime-version 2.1 \
    --python-version 3.7 \
    --package-path trainer/ \
    --module-name trainer.task \
    --region $REGION \
    --config config.yaml \
    -- \
    --train-dir $TRAIN_DATA \
    --test-dir $TEST_DATA \

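What I've tried

Instead of relying on EvalSpec to export my model, I tried calling export_saved_model on the estimator directly. That works both locally and online, but if possible I would like to keep using EvalSpec and train_and_evaluate, because I can pass in different exporters such as BestExporter, LatestExporter, and so on.

My main question

Am I exporting the model incorrectly for TensorFlow 2.1, or is this a bug in the new version of the platform?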

tensorflow

google-cloud-ml

1 Answer

Stack Overflow user

Accepted answer

Answered 2020-06-12 22:45:15

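Found the answer.

Based on the documentation for the TF_CONFIG environment variable:

"master is a deprecated task type in TensorFlow. master represented a task that performed a similar role as chief but also acted as an evaluator in some configurations. TensorFlow 2 does not support TF_CONFIG environment variables that contain a master task."

So previously we were on TF 1.X, which used a master worker. When training TF 2.X jobs, however, master is deprecated. The default is now chief, and by default chief does not act as an evaluator. To get evaluation data, we need to update the config yaml to explicitly allocate an evaluator.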

https://cloud.google.com/ai-platform/training/docs/distributed-training-details#tf-config-format
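One way to see which role each replica was given is to log a summary of TF_CONFIG at startup. The sketch below uses only the standard library; the helper name `describe_tf_config` and the sample JSON (host names, ports) are hypothetical, and it assumes TF_CONFIG follows the usual `cluster` / `task` layout described in the linked documentation.

```python
import json
import os


def describe_tf_config(raw=None):
    """Summarize the role this process was given in TF_CONFIG.

    raw: JSON string to inspect; defaults to the TF_CONFIG environment variable.
    """
    if raw is None:
        raw = os.environ.get("TF_CONFIG", "{}")
    config = json.loads(raw or "{}")
    cluster = config.get("cluster", {})
    task = config.get("task", {})
    task_type = task.get("type")
    return {
        "task_type": task_type,
        "cluster_roles": sorted(cluster),
        # True when the deprecated "master" role still appears anywhere.
        "uses_master": "master" in cluster or task_type == "master",
        # True only when this particular process is the evaluator.
        "is_evaluator": task_type == "evaluator",
    }


if __name__ == "__main__":
    # Hypothetical TF_CONFIG for a chief replica with one worker.
    sample = json.dumps({
        "cluster": {"chief": ["host0:2222"], "worker": ["host1:2222"]},
        "task": {"type": "chief", "index": 0},
    })
    print(describe_tf_config(sample))
```

If no replica in the job logs reports `is_evaluator` as true, nothing in the job is running evaluation, which matches the missing /eval and /export folders.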

We updated our config.yaml with evaluatorType and evaluatorCount:

trainingInput:
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerType: standard_gpu
  workerCount: 1
  evaluatorType: standard_gpu
  evaluatorCount: 1

It worked!
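A config like the one above can be sanity-checked before submitting a job. The following is a minimal sketch, not part of the platform tooling: the helper `evaluator_problems` is hypothetical, it takes the `trainingInput` section as a plain dict, and the rule that a CUSTOM scale tier needs an explicit evaluatorType is an assumption drawn from the config shown in this answer.

```python
def evaluator_problems(training_input):
    """Return a list of reasons why no evaluator would be allocated."""
    problems = []
    # Assumption: a missing evaluatorCount means no evaluator is requested.
    if int(training_input.get("evaluatorCount", 0)) < 1:
        problems.append("evaluatorCount is missing or less than 1")
    # Assumption: with scaleTier CUSTOM the machine type must be named.
    if training_input.get("scaleTier") == "CUSTOM" and not training_input.get("evaluatorType"):
        problems.append("evaluatorType is not set")
    return problems


if __name__ == "__main__":
    # Mirrors the trainingInput section of the config.yaml in this answer.
    training_input = {
        "scaleTier": "CUSTOM",
        "masterType": "standard_gpu",
        "workerType": "standard_gpu",
        "workerCount": 1,
        "evaluatorType": "standard_gpu",
        "evaluatorCount": 1,
    }
    print(evaluator_problems(training_input))
```

An empty list means the config explicitly requests an evaluator; a config without the two evaluator fields would be flagged on both counts.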

Votes: 2

Original content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/62337037
