
Pipeline does not find nodes in kedro

Asked by a Stack Overflow user on 2020-02-23 02:11:42
2 answers · viewed 1.4K times · 0 followers · score 4

I followed the pipelines tutorial, created all the required files, and launched kedro with kedro run --node=preprocessing_data, but got this error message:

ValueError: Pipeline does not contain nodes named ['preprocessing_data'].

If I run kedro without the node argument, I get

kedro.context.context.KedroContextError: Pipeline contains no nodes

Contents of the files:

src/project/pipelines/data_engineering/nodes.py
def preprocess_data(data: SparkDataSet) -> None:
    print(data)
    return
src/project/pipelines/data_engineering/pipeline.py
def create_pipeline(**kwargs):
    return Pipeline(
        [
            node(
                func=preprocess_data,
                inputs="data",
                outputs="preprocessed_data",
                name="preprocessing_data",
            ),
        ]
    )
src/project/pipeline.py
def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
    de_pipeline = de.create_pipeline()
    return {
        "de": de_pipeline,
        "__default__": Pipeline([])
    }

2 Answers

Stack Overflow user

Accepted answer

Posted on 2020-02-23 11:14:41

It looks like you need to put the pipeline into __default__. For example:

def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
    de_pipeline = de.create_pipeline()
    return {
        "de": de_pipeline,
        "__default__": de_pipeline
    }

kedro run --node=preprocessing_data then worked for me.
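To see why the original setup produced both error messages, here is a minimal pure-Python sketch (illustrative only, not kedro's actual implementation; the run function and the list-of-names pipeline model are assumptions) of how a run resolves --node against the __default__ pipeline:

```python
# Illustrative sketch: a "pipeline" here is just a list of node names,
# and `run` mimics how node filtering fails against an empty __default__.

def run(pipelines, node_names=None, pipeline_name="__default__"):
    nodes = pipelines[pipeline_name]
    if node_names:
        # --node filters the selected pipeline by node name.
        missing = [n for n in node_names if n not in nodes]
        if missing:
            raise ValueError(
                f"Pipeline does not contain nodes named {missing}."
            )
        nodes = [n for n in nodes if n in node_names]
    if not nodes:
        # Running an empty pipeline with no filter also fails.
        raise ValueError("Pipeline contains no nodes")
    return nodes

de_pipeline = ["preprocessing_data"]

# As in the question: __default__ is empty, so --node cannot find anything.
broken = {"de": de_pipeline, "__default__": []}
# run(broken, node_names=["preprocessing_data"])  # raises ValueError

# As in this answer: __default__ holds the de pipeline, so the run succeeds.
fixed = {"de": de_pipeline, "__default__": de_pipeline}
print(run(fixed, node_names=["preprocessing_data"]))  # ['preprocessing_data']
```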

Score: 6

Stack Overflow user

Posted on 2020-02-26 03:24:24

Mayurc is correct: your __default__ pipeline is empty, so it contains no nodes. Another option is to run just the de pipeline from the CLI:

kedro run --pipeline de

You can find this option and more in the help text for the run command.

$ kedro run --help

Usage: kedro run [OPTIONS]

  Run the pipeline.

Options:
  --from-inputs TEXT        A list of dataset names which should be used as a
                            starting point.
  --from-nodes TEXT         A list of node names which should be used as a
                            starting point.
  --to-nodes TEXT           A list of node names which should be used as an
                            end point.
  -n, --node TEXT           Run only nodes with specified names.
  -r, --runner TEXT         Specify a runner that you want to run the pipeline
                            with. This option cannot be used together with
                            --parallel.
  -p, --parallel            Run the pipeline using the `ParallelRunner`. If
                            not specified, use the `SequentialRunner`. This
                            flag cannot be used together with --runner.
  -e, --env TEXT            Run the pipeline in a configured environment. If
                            not specified, pipeline will run using environment
                            `local`.
  -t, --tag TEXT            Construct the pipeline using only nodes which have
                            this tag attached. Option can be used multiple
                            times, what results in a pipeline constructed from
                            nodes having any of those tags.
  -lv, --load-version TEXT  Specify a particular dataset version (timestamp)
                            for loading.
  --pipeline TEXT           Name of the modular pipeline to run. If not set,
                            the project pipeline is run by default.
  -c, --config FILE         Specify a YAML configuration file to load the run
                            command arguments from. If command line arguments
                            are provided, they will override the loaded ones.
  --params TEXT             Specify extra parameters that you want to pass to
                            the context initializer. Items must be separated
                            by comma, keys - by colon, example:
                            param1:value1,param2:value2. Each parameter is
                            split by the first comma, so parameter values are
                            allowed to contain colons, parameter keys are not.
  -h, --help                Show this message and exit.

Posted as a second answer because the full help output does not fit in a comment.
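As a rough sketch of why --pipeline de works even though __default__ is empty (illustrative pure Python; select_pipeline and the registry dict are assumptions, not kedro's real code): the named entry from the dict returned by create_pipelines is looked up directly, bypassing the __default__ fallback.

```python
# Illustrative sketch: selecting a named pipeline from the registry dict,
# as `kedro run --pipeline de` would, instead of falling back to __default__.

def select_pipeline(pipelines, name=None):
    """Return the named pipeline, or __default__ when no name is given."""
    key = name if name is not None else "__default__"
    if key not in pipelines:
        raise ValueError(f"Failed to find the pipeline named '{key}'.")
    return pipelines[key]

registry = {
    "de": ["preprocessing_data"],  # nodes modelled as a list of names
    "__default__": [],             # empty, as in the question
}

print(select_pipeline(registry, "de"))  # ['preprocessing_data']
```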

Score: 4
Original content provided by Stack Overflow; translation supported by Tencent Cloud.
Original link: https://stackoverflow.com/questions/60355240
