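I am running into this problem when trying to unzip a zip file.
-- zipfile.is_zipfile(my_file) always returns False, even though the UNIX unzip command handles the file just fine. I also get the same error when I try zipfile.ZipFile(path/file_handle_to_path).
-- The file command returns "Zip archive data, at least v2.0 to extract", and running less on the file shows: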
    PKZIP for iSeries by PKWARE
    Length      Method  Size       Cmpr  Date        Time   CRC-32    Name
    2113482674  Defl:S  204502989  90%   2010-11-01  08:39  2cee662e  myfile.txt
    2113482674          204502989  90%                                1 file
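Any ideas how I can get around this? It would be nice if I could make Python's zipfile work, since I already have some unit tests that I would have to drop if I switched to running subprocess.call("unzip").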
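Posted on 2011-09-18 04:49:51
I ran into the same issue with my files and was able to solve it. I'm not sure how they were generated, which is also the case in the example above. They all had trailing data at the end, which both Windows and 7z ignored but which made Python's zipfile fail.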
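To check whether a given archive has this kind of trailing data, a small measurement can help. The following is only a sketch under stated assumptions: the helper names trailing_bytes_after_eocd and make_zip_bytes are made up for illustration, the archive is built in memory, and the last occurrence of the end of central directory signature is assumed to be the real record.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end of central directory record signature
EOCD_FIXED_SIZE = 22            # record size without the optional comment


def trailing_bytes_after_eocd(data: bytes) -> int:
    """Return how many bytes follow the end of central directory record."""
    # Assumption: the last signature match is the genuine record.
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_FIXED_SIZE > len(data):
        raise ValueError("no complete end of central directory record found")
    # The last two bytes of the fixed part hold the archive comment length.
    (comment_len,) = struct.unpack("<H", data[pos + 20:pos + 22])
    return len(data) - (pos + EOCD_FIXED_SIZE + comment_len)


def make_zip_bytes() -> bytes:
    """Build a tiny, valid archive in memory (illustrative data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("myfile.txt", "hello")
    return buf.getvalue()


clean = make_zip_bytes()
padded = clean + b"\x00" * 10  # simulate trailing data
print(trailing_bytes_after_eocd(clean))   # 0
print(trailing_bytes_after_eocd(padded))  # 10
```

A positive result suggests that the file carries extra bytes after the record. Unlike a fixed 22-byte cut, this sketch also accounts for an archive comment stored inside the record.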
Here is the code that fixes the problem:
    def fixBadZipfile(zipFile):
        with open(zipFile, 'r+b') as f:
            data = f.read()
            # End of central directory signature (a bytes literal, so this also works on Python 3)
            pos = data.find(b'\x50\x4b\x05\x06')
            if pos > 0:
                # The original called self._log here, which does not exist in a plain function
                print("Truncating file at location " + str(pos + 22) + ".")
                f.seek(pos + 22)  # size of 'ZIP end of central directory record'
                f.truncate()
            else:
                # raise error, file is truncated
                raise ValueError("File is truncated")

Posted on 2011-04-05 18:37:12
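You say that using less on the file shows this. Do you mean this?

    less my_file

If so, I would guess that these are comments the zip program put into the file. Looking at a user guide for iSeries PKZIP that I found online, this appears to be the default behavior.
The zipfile documentation says: "This module does not currently handle ZIP files which have appended comments." Maybe that is the problem? (Of course, if less shows them, that would seem to imply they are prepended.)
It looks like you (or whoever creates the zipfile on the iSeries machine) can turn this off with ARCHTEXT(*NONE), or remove it from an existing zipfile with ARCHTEXT(*CLEAR).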
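Whether the quoted limitation applies depends on the Python version in use. Current Python 3 releases expose the archive comment through the ZipFile.comment attribute, so it is worth checking what your interpreter actually does. The sketch below is an illustration only: the helper name read_archive_comment is made up, and the comment text is borrowed from the listing in the question on the assumption that it is an archive comment.

```python
import io
import zipfile


def read_archive_comment(source) -> bytes:
    """Return the archive-level comment of a zip file (path or file object)."""
    with zipfile.ZipFile(source) as zf:
        return zf.comment


# Build a small archive with an archive comment, entirely in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("myfile.txt", "hello")
    zf.comment = b"PKZIP for iSeries by PKWARE"  # assumed comment text

print(zipfile.is_zipfile(buf))    # True
print(read_archive_comment(buf))  # b'PKZIP for iSeries by PKWARE'
```

If the same two calls fail on the real file, the comment alone is probably not the cause, and trailing data after the end of central directory record becomes the more likely suspect.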
Posted on 2018-12-10 23:34:35
    # Use the mmap module to avoid a potential DoS exploit (e.g. by reading the
    # whole zip file into memory). A bad zip file example can be found here:
    # https://bugs.python.org/issue24621
    import mmap
    from io import UnsupportedOperation
    from zipfile import BadZipfile

    # The end of central directory signature
    CENTRAL_DIRECTORY_SIGNATURE = b'\x50\x4b\x05\x06'


    def repair_central_directory(zipFile):
        if hasattr(zipFile, 'read'):
            # This is a file-like object
            f = zipFile
            try:
                fileno = f.fileno()
            except UnsupportedOperation:
                # This is an io.BytesIO instance which lacks a backing file.
                fileno = None
        else:
            # Otherwise, open the file in binary read/write mode
            f = open(zipFile, 'rb+')
            fileno = f.fileno()
        if fileno is None:
            # Without a fileno, we can only read and search the whole string
            # for the end of central directory signature.
            f.seek(0)
            pos = f.read().find(CENTRAL_DIRECTORY_SIGNATURE)
        else:
            # Instead of reading the entire file into memory, memory-map the
            # file, then search it for the end of central directory signature.
            # Reference: https://stackoverflow.com/a/21844624/2293304
            mm = mmap.mmap(fileno, 0)
            pos = mm.find(CENTRAL_DIRECTORY_SIGNATURE)
            mm.close()
        if pos > -1:
            # size of 'ZIP end of central directory record'
            f.truncate(pos + 22)
            f.seek(0)
            return f
        else:
            # Raise an error to make it fail fast
            raise BadZipfile('File is not a zip file')

https://stackoverflow.com/questions/4923142
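As a non-destructive companion to repair_central_directory above, the cut position can be computed first and the truncation left to the caller. This is a sketch under stated assumptions, not part of the original answer: the name find_zip_end_offset is hypothetical, the last signature match is taken to be the real record, and the mapping is opened read-only so the file on disk is never changed.

```python
import io
import mmap
import os
import struct
import tempfile
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end of central directory record signature
EOCD_FIXED_SIZE = 22            # record size without the optional comment


def find_zip_end_offset(path: str) -> int:
    """Return the offset just past the end of central directory record."""
    with open(path, "rb") as f:
        # Read-only mapping: nothing is written back to the file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.rfind(EOCD_SIGNATURE)
            if pos < 0 or pos + EOCD_FIXED_SIZE > len(mm):
                raise ValueError("no complete end of central directory record found")
            # The last two bytes of the fixed part hold the comment length.
            (comment_len,) = struct.unpack("<H", mm[pos + 20:pos + 22])
            return min(pos + EOCD_FIXED_SIZE + comment_len, len(mm))


# Demonstration with a temporary file (illustrative data only).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("myfile.txt", "hello")
good = buf.getvalue()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "padded.zip")
    with open(path, "wb") as f:
        f.write(good + b"trailing junk")
    print(find_zip_end_offset(path) == len(good))  # True
```

The returned offset can then be passed to truncate() on a copy of the file, which keeps the original input intact for unit tests.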