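I am trying to list the objects in an S3 bucket like this:
    for key in s3_client.list_objects(Bucket='bucketname')['Contents']:
        logger.debug(key['Key'])
I only want to print the folder names or file names that exist at the first level.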
For example, if my bucket contains the following:
    bucketname
        folder1
        folder2
            text1.txt
            text2.txt
        catalog.json
I only want to print folder1, folder2 and catalog.json. I do not want to include text1.txt and so on.
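However, my current solution also prints the names of the files inside the folders.
How can I change this? I see there is a 'Prefix' parameter, but I am not sure how to use it.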
Answered 2022-04-01 14:57:28
You can split each key on "/" and keep only the first level:
    level1 = set()  # using a set removes duplicates automatically
    for key in s3_client.list_objects(Bucket='bucketname')['Contents']:
        level1.add(key["Key"].split("/")[0])  # keep only the first level of the key
    # then print your level1 set
    logger.debug(level1)
/!\ Warning
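The list_objects method has been revised, and the AWS S3 documentation recommends using list_objects_v2 instead. Each call returns at most 1000 keys, so to cover the whole bucket, page through the results with continuation_token:
    level1 = set()
    continuation_token = ""
    while continuation_token is not None:
        # only pass ContinuationToken once a previous response has supplied one
        extra_params = {"ContinuationToken": continuation_token} if continuation_token else {}
        response = s3_client.list_objects_v2(Bucket="bucketname", Prefix="", **extra_params)
        # NextContinuationToken is absent on the last page, which ends the loop
        continuation_token = response.get("NextContinuationToken")
        for obj in response.get("Contents", []):
            level1.add(obj.get("Key").split("/")[0])
    logger.debug(level1)
Answered 2022-04-01 15:15:08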
You can use the Delimiter option, for example:
    import boto3
    s3 = boto3.client("s3")
    BUCKET = "bucketname"
    rsp = s3.list_objects_v2(Bucket=BUCKET, Delimiter="/")
    # "Contents" and "CommonPrefixes" are omitted from the response when empty
    objects = [obj["Key"] for obj in rsp.get("Contents", [])]
    folders = [fld["Prefix"] for fld in rsp.get("CommonPrefixes", [])]
    for obj in objects:
        print("Object:", obj)
    for folder in folders:
        print("Folder:", folder)
Result:
    Object: catalog.json
    Folder: folder1/
    Folder: folder2/
Note that if there is a large number of keys at the top level (more than 1000), you will need to paginate the request.
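One way to handle that pagination is boto3's built-in paginator for list_objects_v2, which follows the continuation tokens for you. The following is only a minimal sketch: the helper name list_top_level is hypothetical (it is not part of boto3), and it assumes the client passed in is a regular boto3 S3 client.

```python
def list_top_level(s3_client, bucket):
    """Return (objects, folders) found at the first level of `bucket`.

    `list_top_level` is a hypothetical helper name, not a boto3 API.
    """
    objects, folders = [], []
    paginator = s3_client.get_paginator("list_objects_v2")
    # Delimiter="/" groups deeper keys under CommonPrefixes instead of listing them
    for page in paginator.paginate(Bucket=bucket, Delimiter="/"):
        # Either key may be missing from a page, so default to an empty list
        objects.extend(obj["Key"] for obj in page.get("Contents", []))
        folders.extend(p["Prefix"] for p in page.get("CommonPrefixes", []))
    return objects, folders

# Usage (requires boto3 and valid AWS credentials):
#   import boto3
#   objects, folders = list_top_level(boto3.client("s3"), "bucketname")
```

Taking the client as an argument keeps the helper independent of how credentials and regions are configured.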
https://stackoverflow.com/questions/71708707