Q: Python 2.7 pylzma works, but the Python 3.4 lzma module does not

Stack Overflow user

Asked 2015-09-26 01:39:37

Answers: 1, Views: 3.3K, Followers: 0, Votes: 4

import sys
import os
import zlib

try:
    import pylzma as lzma
except ImportError:
    import lzma

from io import StringIO
import struct

#-----------------------------------------------------------------------------------------------------------------------

def read_ui8(c):
    return struct.unpack('<B', c)[0]
def read_ui16(c):
    return struct.unpack('<H', c)[0]
def read_ui32(c):
    return struct.unpack('<I', c)[0]

def parse(input):
    """Parses the header information from an SWF file."""
    if hasattr(input, 'read'):
        input.seek(0)
    else:
        input = open(input, 'rb')

    header = { }

    # Read the 3-byte signature field
    header['signature'] = signature = b''.join(struct.unpack('<3c', input.read(3))).decode()

    # Version
    header['version'] = read_ui8(input.read(1))

    # File size (stored as a 32-bit integer)
    header['size'] = read_ui32(input.read(4))

    # Payload

    if header['signature'] == 'FWS':
        print("The opened file doesn't appear to be compressed")
        buffer = input.read(header['size'])
    elif header['signature'] == 'CWS':
        print("The opened file appears to be compressed with Zlib")
        buffer = zlib.decompress(input.read(header['size']))
    elif header['signature'] == 'ZWS':
        print("The opened file appears to be compressed with Lzma")
        # ZWS(LZMA)
        # | 4 bytes       | 4 bytes    | 4 bytes       | 5 bytes    | n bytes    | 6 bytes         |
        # | 'ZWS'+version | scriptLen  | compressedLen | LZMA props | LZMA data  | LZMA end marker |
        size = read_ui32(input.read(4))
        buffer = lzma.decompress(input.read())

    # Containing rectangle (struct RECT)

    # The number of bits used to store each of the RECT values is
    # stored in the first five bits of the first byte.

    nbits = read_ui8(buffer[0]) >> 3

    current_byte, buffer = read_ui8(buffer[0]), buffer[1:]
    bit_cursor = 5

    for item in 'xmin', 'xmax', 'ymin', 'ymax':
        value = 0
        for value_bit in range(nbits-1, -1, -1): # == reversed(range(nbits))
            if (current_byte << bit_cursor) & 0x80:
                value |= 1 << value_bit
            # Advance the bit cursor to the next bit
            bit_cursor += 1

            if bit_cursor > 7:
                # We've exhausted the current byte, consume the next one
                # from the buffer.
                current_byte, buffer = read_ui8(buffer[0]), buffer[1:]
                bit_cursor = 0

        # Convert value from TWIPS to a pixel value
        header[item] = value / 20

    header['width'] = header['xmax'] - header['xmin']
    header['height'] = header['ymax'] - header['ymin']

    header['frames'] = read_ui16(buffer[0:2])
    header['fps'] = read_ui16(buffer[2:4])

    input.close()
    return header

header = parse(sys.argv[1]);

print('SWF header')
print('----------')
print('Version:      %s' % header['version'])
print('Signature:    %s' % header['signature'])
print('Dimensions:   %s x %s' % (header['width'], header['height']))
print('Bounding box: (%s, %s, %s, %s)' % (header['xmin'], header['xmax'], header['ymin'], header['ymax']))
print('Frames:       %s' % header['frames'])
print('FPS:          %s' % header['fps'])

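I was under the impression that the built-in lzma module in Python 3.4 works the same way as the pylzma module for Python 2.7. The code above runs on both 2.7 and 3.4, but on 3.4 (which has no pylzma, so it falls back to the built-in lzma) I get the following error: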

_lzma.LZMAError: Input format not supported by decoder

Why does pylzma work while Python 3.4's lzma does not?

Tags: python, compression, lzma

1 Answer

Stack Overflow user

Answered 2016-09-30 02:22:44

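Although I don't know why the two modules behave differently, I do have a solution.

I could not get the non-streaming lzma.decompress to work, since I don't know the LZMA/XZ/SWF specifications well enough, but I did get lzma.LZMADecompressor to work. For completeness, I believe SWF LZMA uses this header format (not 100% confirmed):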

Bytes  Length  Type  Endianness  Description
 0- 2  3       UI8   -           SWF Signature: ZWS
 3     1       UI8   -           SWF Version
 4- 7  4       UI32  LE          SWF FileLength aka File Size

 8-11  4       UI32  LE          SWF? Compressed Size (File Size - 17)

12     1       -     -           LZMA Decoder Properties
13-16  4       UI32  LE          LZMA Dictionary Size
17-    -       -     -           LZMA Compressed Data (including rest of SWF header)

However, the LZMA file format spec says it should be:

Bytes  Length  Type  Endianness  Description
 0     1       -     -           LZMA Decoder Properties
 1- 4  4       UI32  LE          LZMA Dictionary Size
 5-12  8       UI64  LE          LZMA Uncompressed Size
13-    -       -     -           LZMA Compressed Data

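I never really worked out what Uncompressed Size should be (or whether it can even be defined for this format). pylzma does not seem to care about it, whereas the Python 3.3 lzma does. However, an explicitly unknown size seems to be accepted, and it can be given as a UI64 with all bits set (value 2^64 - 1), e.g. 8*b'\xff' or 8*'\xff'. So, by adjusting the header slightly, instead of using: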

buffer = lzma.decompress(input.read())

Try:

d = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
buffer = d.decompress(input.read(5) + 8*b'\xff' + input.read())
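
To make the header adjustment concrete, here is a minimal sketch of the same idea applied to a whole file held in memory. It assumes the ZWS layout from the first table above (which is not 100% confirmed). The names decompress_zws and make_zws are hypothetical; make_zws exists only to produce input for trying the function out, and it relies on my understanding that lzma.compress with FORMAT_ALONE writes a 13-byte header (5 bytes of properties and dictionary size, then an 8-byte size field).

```python
import lzma
import struct


def decompress_zws(data):
    """Decompress the body of an LZMA-compressed (ZWS) SWF held in `data`.

    Assumes the layout: 'ZWS', version, UI32 file length, UI32 compressed
    length, 5 bytes of LZMA properties, then the LZMA stream.
    """
    if data[:3] != b'ZWS':
        raise ValueError('not a ZWS file')
    # 1 byte of decoder properties + 4 bytes of dictionary size
    props_and_dict = data[12:17]
    # A UI64 with all bits set tells the decoder the size is unknown
    unknown_size = 8 * b'\xff'
    d = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
    return d.decompress(props_and_dict + unknown_size + data[17:])


def make_zws(body, version=13):
    """Hypothetical helper: wrap `body` in a ZWS container for testing."""
    alone = lzma.compress(body, format=lzma.FORMAT_ALONE)
    payload = alone[13:]  # drop properties, dictionary size and size field
    header = b'ZWS' + bytes([version])
    header += struct.pack('<II', len(body) + 8, len(payload))
    return header + alone[:5] + payload
```

With a real file, something like decompress_zws(open(path, 'rb').read()) should return the bytes that start with the RECT structure.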

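Note: I had no local python3 interpreter available, so I only tested this online with a slightly modified reading procedure; it may not work out of the box.

Edit: Confirmed to work in python3, but a few things need to change, such as the unpacking Marcus mentioned (easily fixed by using buffer[0:1] instead of buffer[0]). There is also no need to read the whole file; a small chunk, say 256 bytes, is enough to read the entire SWF header. The frames field is a bit odd too, though I believe all you need is some bit shifting, e.g.: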

header['frames'] = read_ui16(buffer[0:2]) >> 8
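
As a rough sketch of the Python 3 changes, the RECT and the two fields after it could be parsed as below. Indexing a bytes object in Python 3 returns an int, so struct.unpack('<B', buffer[0]) raises TypeError, while a slice such as buffer[0:1] stays bytes. The function name parse_rect_and_frames is hypothetical, and RECT values are treated as unsigned, as in the question's code. The field order (frame rate as an 8.8 fixed-point value, then frame count) is my reading of the SWF spec, and it would explain why the shift by 8 is needed.

```python
import struct


def parse_rect_and_frames(buf):
    """Parse RECT, frame rate and frame count from decompressed SWF bytes."""
    # In Python 3, buf[0] is already an int; no struct.unpack needed here.
    nbits = buf[0] >> 3
    total_bits = 5 + 4 * nbits
    nbytes = (total_bits + 7) // 8
    # Treat the RECT bytes as one big integer and cut the fields out of it.
    bits = int.from_bytes(buf[:nbytes], 'big')
    pad = nbytes * 8 - total_bits
    mask = (1 << nbits) - 1

    header = {}
    for i, item in enumerate(('xmin', 'xmax', 'ymin', 'ymax')):
        shift = pad + (3 - i) * nbits
        # Convert from TWIPS to pixels (values assumed unsigned)
        header[item] = ((bits >> shift) & mask) / 20

    header['width'] = header['xmax'] - header['xmin']
    header['height'] = header['ymax'] - header['ymin']

    # Assumed layout: UI16 frame rate (8.8 fixed point), then UI16 frame count
    rate, count = struct.unpack('<HH', buf[nbytes:nbytes + 4])
    header['fps'] = rate >> 8
    header['frames'] = count
    return header
```

Because only the first few bytes are needed, this works just as well on a partially decompressed buffer.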

SWF file format spec

LZMA file format spec

Votes: 5

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/32787778
