Q: Reading a Vertex AI dataset in a Jupyter notebook

Stack Overflow user

Asked 2021-08-31 11:43:47

2 answers · 847 views · 0 followers · 3 votes

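I am trying to create a Python utility that fetches a dataset from Vertex AI datasets and generates statistics for it. However, I am unable to view the dataset from a Jupyter notebook. Is there a way to do this?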

google-cloud-platform

google-cloud-vertex-ai

python

2 Answers

Stack Overflow user

Accepted answer

Posted 2021-09-14 15:54:33

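If I understand correctly, you want to use a Vertex AI dataset inside a Jupyter Notebook. I don't think that is currently possible. What you can do is export Vertex AI datasets to Google Cloud Storage in JSONL format: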

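Your dataset will be exported as a list of text items in JSONL format. Each row contains a Cloud Storage path, any label(s) assigned to that item, and a flag that indicates whether that item is in the training, validation, or test set.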
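Once the export has been copied somewhere the notebook can read, the statistics can be computed from the JSONL lines directly. The following is a minimal sketch, not an official method: the field names `classificationAnnotation`, `displayName`, `dataItemResourceLabels` and the `aiplatform.googleapis.com/ml_use` key are assumptions about the export schema, and the sample items and bucket name are made up. Check them against one line of your own export before relying on the counts.

```python
import json
from collections import Counter

# Assumed field names in the exported JSONL; verify against your own export.
ANNOTATION_FIELD = "classificationAnnotation"
RESOURCE_LABELS_FIELD = "dataItemResourceLabels"
ML_USE_KEY = "aiplatform.googleapis.com/ml_use"


def summarize_export(lines):
    """Count labels and training/validation/test membership in JSONL lines."""
    label_counts = Counter()
    split_counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        item = json.loads(line)
        annotation = item.get(ANNOTATION_FIELD) or {}
        label_counts[annotation.get("displayName", "<unlabeled>")] += 1
        resource_labels = item.get(RESOURCE_LABELS_FIELD) or {}
        split_counts[resource_labels.get(ML_USE_KEY, "<unassigned>")] += 1
    return {
        "items": sum(label_counts.values()),
        "labels": dict(label_counts),
        "splits": dict(split_counts),
    }


# Hypothetical sample standing in for the lines of an exported file.
sample_items = [
    {"textGcsUri": "gs://example-bucket/a.txt",
     ANNOTATION_FIELD: {"displayName": "spam"},
     RESOURCE_LABELS_FIELD: {ML_USE_KEY: "training"}},
    {"textGcsUri": "gs://example-bucket/b.txt",
     ANNOTATION_FIELD: {"displayName": "ham"},
     RESOURCE_LABELS_FIELD: {ML_USE_KEY: "training"}},
    {"textGcsUri": "gs://example-bucket/c.txt",
     ANNOTATION_FIELD: {"displayName": "spam"},
     RESOURCE_LABELS_FIELD: {ML_USE_KEY: "validation"}},
    {"textGcsUri": "gs://example-bucket/d.txt",
     RESOURCE_LABELS_FIELD: {ML_USE_KEY: "test"}},
]
sample_lines = [json.dumps(item) for item in sample_items]

stats = summarize_export(sample_lines)
print(stats)
```

For a real export, passing an open file handle (`with open(path) as f: summarize_export(f)`) works the same way, since the function only iterates over lines.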

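At this point, you can work with BigQuery data in a notebook using the %%bigquery magic, as described in "Visualizing BigQuery data in a Jupyter notebook", or load a CSV file from the machine's local directory or from GCS with pandas' read_csv(), as shown in the thread "How to read csv file in Google Cloud Platform jupyter notebook".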
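For tabular data that is already available as a CSV file in the notebook's working directory, per-column statistics can be sketched with the standard library alone. This is only an illustration: the helper name `column_statistics` and the sample columns are hypothetical, and in a notebook the more usual route would be pandas' `read_csv()` followed by `DataFrame.describe()`.

```python
import csv
import io
import statistics


def column_statistics(csv_file):
    """Return count, mean, min and max for every fully numeric column."""
    reader = csv.DictReader(csv_file)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            # Skip empty cells and any surplus fields not named in the header.
            if name in columns and value not in (None, ""):
                columns[name].append(value)

    stats = {}
    for name, values in columns.items():
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            continue  # column contains non-numeric text
        if numbers:
            stats[name] = {
                "count": len(numbers),
                "mean": statistics.mean(numbers),
                "min": min(numbers),
                "max": max(numbers),
            }
    return stats


# Hypothetical data standing in for a downloaded CSV file.
sample_csv = io.StringIO(
    "age,city,income\n"
    "30,Paris,1000\n"
    "40,Rome,\n"
    "50,Oslo,3000\n"
)
csv_stats = column_statistics(sample_csv)
print(csv_stats)
```

With a file on disk, `with open("data.csv", newline="") as f: column_statistics(f)` gives the same result; the file name here is a placeholder.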

However, you can file a Feature Request in the Google Issue Tracker asking for the ability to use Vertex AI datasets directly in a Jupyter Notebook; it will be considered by the Google Vertex AI team.

Votes: 0

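Stack Overflow user

Posted 2022-03-22 14:24:36

Correct me if I'm wrong, but are you trying to access a Vertex AI dataset from your GCP project inside a Jupyter notebook? If so, try the code below and see whether you can access the dataset.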

def list_datasets(project_id, compute_region, filter=None):
    """List all datasets."""
    result = []
    # [START automl_tables_list_datasets]
    # TODO(developer): Uncomment and set the following variables
    # project_id = 'PROJECT_ID_HERE'
    # compute_region = 'COMPUTE_REGION_HERE'
    # filter = 'filter expression here'

    from google.cloud import automl_v1beta1 as automl

    client = automl.TablesClient(project=project_id, region=compute_region)
    print('client:', client)
    # List all the datasets available in the region by applying filter.
    response = client.list_datasets(filter=filter)

    print("List of datasets:")
    for dataset in response:
        # Display the dataset information.
        print("Dataset name: {}".format(dataset.name))
        print("Dataset id: {}".format(dataset.name.split("/")[-1]))
        print("Dataset display name: {}".format(dataset.display_name))
        metadata = dataset.tables_dataset_metadata
        print(
            "Dataset primary table spec id: {}".format(
                metadata.primary_table_spec_id
            )
        )
        print(
            "Dataset target column spec id: {}".format(
                metadata.target_column_spec_id
            )
        )
        print(
            "Dataset weight column spec id: {}".format(
                metadata.weight_column_spec_id
            )
        )
        print(
            "Dataset ml use column spec id: {}".format(
                metadata.ml_use_column_spec_id
            )
        )
        print("Dataset example count: {}".format(dataset.example_count))
        print("Dataset create time: {}".format(dataset.create_time))
        print("\n")

        # [END automl_tables_list_datasets]
        # Collect each dataset so the caller can inspect it further.
        result.append(dataset)

    return result

When calling this function, you need to pass project_id and compute_region.

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/68998065
