
Kedro templated configuration does not load/resolve variables

Stack Overflow user
Asked 2022-07-25 07:37:10

Following up on this question. I am using Kedro v0.18.2. I am trying to use the TemplatedConfigLoader, so I created a globals.yml under conf/base, as follows:

```yaml
paths:
    base_path: s3://my_project

datasets:
    pdf: base.PDFDataSet
    png: pillow.ImageDataSet
    csv: pandas.CSVDataSet
    excel: pandas.ExcelDataSet

data_folders:
    raw: 01_raw
    intermediate: 02_intermediate
    primary: 03_primary
    feature: 04_feature
    model_input: 05_model_input
    models: 06_models
    model_output: 07_model_output
    reporting: 08_reporting
```

I followed the documentation and uncommented the relevant part of settings.py:

```python
"""Project settings. There is no need to edit this file unless you want to change values
from the Kedro defaults. For further information, including these default values, see
https://kedro.readthedocs.io/en/stable/kedro_project_setup/settings.html."""

# Instantiated project hooks.
# from certifai.hooks import ProjectHooks
# HOOKS = (ProjectHooks(),)

# Installed plugins for which to disable hook auto-registration.
# DISABLE_HOOKS_FOR_PLUGINS = ("kedro-viz",)

# Class that manages storing KedroSession data.
# from kedro.framework.session.store import ShelveStore
# SESSION_STORE_CLASS = ShelveStore
# Keyword arguments to pass to the `SESSION_STORE_CLASS` constructor.
# SESSION_STORE_ARGS = {
#     "path": "./sessions"
# }

# Class that manages Kedro's library components.
# from kedro.framework.context import KedroContext
# CONTEXT_CLASS = KedroContext

# Directory that holds configuration.
# CONF_SOURCE = "conf"

# Class that manages how configuration is loaded.
from kedro.config import TemplatedConfigLoader
CONFIG_LOADER_CLASS = TemplatedConfigLoader
CONFIG_LOADER_ARGS = {
    "globals_pattern": "*globals.yml",
}

# Class that manages the Data Catalog.
# from kedro.io import DataCatalog
# DATA_CATALOG_CLASS = DataCatalog
```

catalog.yml looks like this:

```yaml
_label_images: &label_images
  type: PartitionedDataSet
  path: ${paths.base_path}/data/${data_folders.raw}/label_images
  dataset: ${datasets.png}

label_images_png:
  <<: *label_images
  filename_suffix: .png

label_images_jpg:
  <<: *label_images
  filename_suffix: .jpg

label_images_jpeg:
  <<: *label_images
  filename_suffix: .jpeg

label_images_pdf:
  <<: *label_images
  dataset: base.PDFDataSet
  filename_suffix: .pdf

my_project_label_extracts:
  type: PartitionedDataSet
  path: s3://my_project/data/01_raw/label_extracts
  dataset: pandas.ExcelDataSet
```

My test script looks like this:

```python
from kedro.config import ConfigLoader
from kedro.framework.project import settings
from pathlib import Path
from kedro.extras.datasets import pillow

project_path = Path(__file__).parent.parent.parent

conf_path = str(project_path / settings.CONF_SOURCE)
conf_loader = ConfigLoader(conf_source=conf_path, env="base")
conf_catalog = conf_loader.get("catalog*", "catalog*/**")

images_dataset = pillow.ImageDataSet.from_config("label_images_png", conf_catalog["label_images_png"])
images_loader = images_dataset.load()
images_loader["00337180800086"]().show()
```

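With hard-coded values in catalog.yml, the script runs and displays the image. With the templated configuration, however, it does not work. Am I missing something?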

Apologies if this question is a duplicate.


1 Answer

Stack Overflow user

Answered 2022-07-25 11:18:36

The first bug I noticed is in this catalog entry:

```yaml
_label_images: &label_images
  type: PartitionedDataSet
  path: ${paths.base_path}/data/${data_folders.raw}/label_images
  dataset: ${datasets.png}
```

You are missing the `type` key for the dataset. The correct entry should be:

```yaml
_label_images: &label_images
  type: PartitionedDataSet
  path: ${paths.base_path}/data/${data_folders.raw}/label_images
  dataset:
    type: ${datasets.png}
```
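To make it concrete what the loader is expected to produce from that entry, here is a rough sketch of the substitution step. It is a simplified imitation written for illustration, not Kedro's actual implementation: the helper names `lookup` and `resolve` are made up, and the `GLOBALS` and `ENTRY` dictionaries are hand-copied from the globals.yml and catalog.yml in the question (with the YAML merge key already expanded).

```python
import re
from typing import Any

# Matches placeholders such as ${paths.base_path}
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(globals_dict: dict, dotted_key: str) -> Any:
    """Walk a nested dict using a dotted key, e.g. 'paths.base_path'."""
    value: Any = globals_dict
    for part in dotted_key.split("."):
        value = value[part]
    return value


def resolve(node: Any, globals_dict: dict) -> Any:
    """Recursively replace ${...} placeholders in every string of a config tree."""
    if isinstance(node, dict):
        return {key: resolve(value, globals_dict) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, globals_dict) for item in node]
    if isinstance(node, str):
        return PLACEHOLDER.sub(
            lambda match: str(lookup(globals_dict, match.group(1))), node
        )
    return node


# Hand-copied subset of globals.yml from the question
GLOBALS = {
    "paths": {"base_path": "s3://my_project"},
    "datasets": {"png": "pillow.ImageDataSet"},
    "data_folders": {"raw": "01_raw"},
}

# label_images_png after the YAML merge key has been expanded
ENTRY = {
    "type": "PartitionedDataSet",
    "path": "${paths.base_path}/data/${data_folders.raw}/label_images",
    "dataset": {"type": "${datasets.png}"},
    "filename_suffix": ".png",
}

resolved = resolve(ENTRY, GLOBALS)
print(resolved["path"])     # s3://my_project/data/01_raw/label_images
print(resolved["dataset"])  # {'type': 'pillow.ImageDataSet'}
```

A loader that does not perform this step hands the `${...}` strings to the dataset unchanged, which matches the hard-coded entry working while the templated one fails.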

If you now run your script with TemplatedConfigLoader, you should hopefully no longer get the error you mentioned:

```python
from kedro.config import TemplatedConfigLoader
from kedro.framework.project import settings
from pathlib import Path
from kedro.extras.datasets import pillow

project_path = Path(__file__).parent.parent.parent

conf_path = str(project_path / settings.CONF_SOURCE)
# Use TemplatedConfigLoader (not ConfigLoader) so that the ${...} placeholders
# in catalog.yml are filled in from the files matching globals_pattern.
conf_loader = TemplatedConfigLoader(conf_source=conf_path, env="base", globals_pattern="*globals.yml")
conf_catalog = conf_loader.get("catalog*", "catalog*/**")

images_dataset = pillow.ImageDataSet.from_config("label_images_png", conf_catalog["label_images_png"])
images_loader = images_dataset.load()
images_loader["00337180800086"]().show()
```
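If it still fails, one quick sanity check is to look for placeholders that survived loading. The sketch below is only an illustration: `find_unresolved` is a made-up helper, and `raw_catalog` is a hand-written stand-in for what `conf_catalog` would contain if the `${...}` strings were never substituted (which is what a plain `ConfigLoader` would return, since it does no templating).

```python
import re
from typing import Any, Iterator, Tuple

# Any ${...} still present after loading means templating did not happen
PLACEHOLDER = re.compile(r"\$\{[^}]+\}")


def find_unresolved(node: Any, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (location, value) for every string that still contains ${...}."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield from find_unresolved(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_unresolved(item, f"{path}[{index}]")
    elif isinstance(node, str) and PLACEHOLDER.search(node):
        yield path, node


# Stand-in for conf_catalog when no substitution has taken place (assumption)
raw_catalog = {
    "label_images_png": {
        "type": "PartitionedDataSet",
        "path": "${paths.base_path}/data/${data_folders.raw}/label_images",
        "dataset": {"type": "${datasets.png}"},
        "filename_suffix": ".png",
    },
    "my_project_label_extracts": {
        "type": "PartitionedDataSet",
        "path": "s3://my_project/data/01_raw/label_extracts",
        "dataset": "pandas.ExcelDataSet",
    },
}

leftovers = list(find_unresolved(raw_catalog))
for location, value in leftovers:
    print(f"unresolved placeholder at {location}: {value}")
```

In your script you would pass `conf_catalog` instead of `raw_catalog`; an empty result suggests the globals were picked up, while any hit points at the entry that was loaded without templating.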

For easier communication, you may want to join the Kedro Discord channel, where we can respond to you in real time: https://discord.gg/akJDeVaxnB

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/73105524
