I'm trying to download a csv file from an S3 bucket using the s3fs library. I've noticed that writing a new csv with pandas somehow alters the data, so I want to download the file directly, in its original state.
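As a side note on why the pandas round trip changes things: read_csv followed by to_csv need not be byte-identical even when the parsed values are equal, because formatting details (trailing zeros in floats, quoting, index columns) are not preserved. A minimal illustration with made-up sample data:

```python
import io

import pandas as pd

# A csv whose float column has a trailing zero.
raw = "a,b\n1,2.10\n"

# Round-trip through pandas: parse, then write back out.
df = pd.read_csv(io.StringIO(raw))
out = df.to_csv(index=False)

# "2.10" was parsed as the float 2.1 and is written back as "2.1",
# so the bytes differ even though the values are equal.
print(repr(raw))  # 'a,b\n1,2.10\n'
print(repr(out))  # 'a,b\n1,2.1\n'
```

This is why downloading the raw object, rather than re-serializing through pandas, preserves the file exactly.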
The documentation has a download function, but I don't understand how to use it:

download(self, rpath, lpath[, recursive]): Alias of FilesystemSpec.get.
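For reference, here is a minimal sketch of how download is meant to be called: both rpath and lpath are path strings, not open file objects. Since s3fs inherits download from fsspec's AbstractFileSystem, the same call can be tried on the local filesystem without S3 credentials (the temp-directory paths below are stand-ins, not real bucket paths):

```python
import os
import tempfile

import fsspec

# The local filesystem exposes the same download/get API that s3fs inherits.
fs = fsspec.filesystem("file")

# Stand-in for a remote object: a small csv in a temp directory.
src = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(src, "w") as f:
    f.write("a,b\n1,2\n")

# rpath and lpath are both plain path strings.
dst = os.path.join(tempfile.mkdtemp(), "copy.csv")
fs.download(src, dst)

with open(dst) as f:
    print(f.read())
```

With s3fs the only difference is that rpath would be an object key such as those returned by fs.ls.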
Here is what I tried:
import pandas as pd
import datetime
import os
import s3fs
import numpy as np
#Creds for s3
fs = s3fs.S3FileSystem(key=mykey, secret=mysecretkey)
bucket = "s3://mys3bucket/mys3bucket"
files = fs.ls(bucket)[-3:]

#download files:
for file in files:
    with fs.open(file) as f:
        fs.download(f, "test.csv")
This raises:

AttributeError: 'S3File' object has no attribute 'rstrip'

Posted on 2020-07-21 23:58:32
fs.download expects path strings; the AttributeError comes from passing the open S3File object instead of the path, so drop the fs.open call:

for file in files:
    fs.download(file, 'test.csv')

Modified to download all of the files in the directory:
import pandas as pd
import datetime
import os
import s3fs
import numpy as np
#Creds for s3
fs = s3fs.S3FileSystem(key=mykey, secret=mysecretkey)
bucket = "s3://mys3bucket/mys3bucket"

#files references the entire bucket.
files = fs.ls(bucket)

for file in files:
    # Save each object under its own name; a fixed name like 'test.csv'
    # would be overwritten on every iteration.
    fs.download(file, os.path.basename(file))

Posted on 2020-07-22 02:28:59
I'll copy my answer here as well, since I used this in a more general situation:
# Access Pando
import os
import s3fs

#Blocked out url as "enter url here" for security reasons
fs = s3fs.S3FileSystem(anon=True, client_kwargs={'endpoint_url': "enter url here"})

# List objects in a path and import to array
# [-3:] limits output for testing purposes to prevent memory overload
files = fs.ls('hrrr/sfc/20190101')[-3:]

#Make a staging directory that can hold data as a medium
os.mkdir("Staging")

#Copy files into that directory (the specific directory structure requires splitting strings)
for file in files:
    item = str(file)
    lst = item.split("/")
    name = lst[3]
    # os.path.join keeps the path portable across operating systems
    path = os.path.join("Staging", name)
    print(path)
    fs.download(file, path)

Note that the documentation for this particular python package is rather sparse. I found some documentation here (https://readthedocs.org/projects/s3fs/downloads/pdf/latest/) on what arguments s3fs takes. The full argument list is near the end, although the meanings of the arguments are not spelled out. Here is a general guide to s3fs.download:
-arg1 (rpath) is the source to get files from. As in the two answers above, the best way to obtain it is to run fs.ls on the s3 bucket and save the result to a variable.
-arg2 (lpath) is the destination directory and filename. Note that without a valid output file this returns the AttributeError the OP got. I defined it as the path variable.
-arg3 is an optional argument for choosing whether to perform the download recursively.
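To illustrate arg3, here is a hedged sketch of a recursive download using the same inherited fsspec API on the local filesystem (the directory names are invented; with s3fs the rpath would be a bucket prefix instead of a temp directory):

```python
import os
import tempfile

import fsspec

fs = fsspec.filesystem("file")

# Stand-in for a bucket prefix: a temp directory with two small files.
src_dir = tempfile.mkdtemp()
for name in ("a.csv", "b.csv"):
    with open(os.path.join(src_dir, name), "w") as f:
        f.write("x\n")

# recursive=True copies everything under rpath into lpath.
dst_dir = tempfile.mkdtemp()
fs.download(src_dir, dst_dir, recursive=True)

# Collect the downloaded filenames, wherever fsspec nested them.
found = sorted(
    name
    for _, _, filenames in os.walk(dst_dir)
    for name in filenames
)
print(found)  # ['a.csv', 'b.csv']
```

With recursive=True, lpath is treated as a directory, which sidesteps the per-file string splitting done in the answer above.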
https://stackoverflow.com/questions/63017653