Q: Python mmap.mmap() to bytes-like object?

Stack Overflow user

Asked 2021-05-29 11:55:11

Answers: 1, Views: 943, Followers: 0, Votes: 2

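The documentation for mmap says: "Memory-mapped file objects behave like both bytearray and like file objects."

However, that does not seem to extend to a standard for loop: at least with Python 3.8.5 on Linux, which I am currently using, each element produced by iterating over an mmap.mmap() is a single-byte bytes, whereas for a bytearray and for normal file access each element is an int. Update, correction: for normal file access each element is a variable-sized bytes; see below.

Why is that? More importantly, how can I efficiently get a bytes-like object from an mmap, that is, an object where not only indexing but also for gives me an int? (By "efficiently" I mean that I would like to avoid additional copying, casting, etc.)

Here is code that demonstrates the behavior: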

#!/usr/bin/env python3.8

def print_types(desc, x):
    for el in setmm: break   ### UPDATE: bug here, `setmm` should be `x`, see comments
    # `el` is now the first element of `x`
    print('%-30s: type is %-30s, first element is %s' % (desc,type(x),type(el)))
    try: print('%72s(first element size is %d)' % (' ', len(el)))
    except: pass # ignore failure if `el` doesn't support `len()`

setmm = bytearray(b'hoi!')
print_types('bytearray', setmm)

with open('set.mm', 'rb') as f:
    print_types('file object', f)

with open('set.mm', 'rb') as f:
    setmm = f.read()
    print_types('file open().read() result', setmm)

import mmap
with open('set.mm', 'rb') as f:
    setmm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
    print_types('file mmap.mmap() result', setmm)

This results in:

bytearray                     : type is <class 'bytearray'>           , first element type is <class 'int'>
file object                   : type is <class '_io.BufferedReader'>  , first element type is <class 'int'>
file open().read() result     : type is <class 'bytes'>               , first element type is <class 'int'>
file mmap.mmap() result       : type is <class 'mmap.mmap'>           , first element type is <class 'bytes'>
                                                                        (first element size is 1)

Update. After fixing the bug that furas kindly pointed out in the comments, the result becomes:

bytearray                     : type is <class 'bytearray'>           , first element is <class 'int'>
file object                   : type is <class '_io.BufferedReader'>  , first element is <class 'bytes'>
                                                                        (first element size is 38)
file open().read() result     : type is <class 'bytes'>               , first element is <class 'int'>
file mmap.mmap() result       : type is <class 'mmap.mmap'>           , first element is <class 'bytes'>
                                                                        (first element size is 1)

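That answers what is happening: for some reason, iterating over an mmap is like iterating over a file in that it returns a bytes each time, but instead of a full line, as a file would return, it returns single-byte chunks.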
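The asymmetry between indexing and iterating can be reproduced in a few lines. This is only a minimal sketch: it assumes an anonymous 4-byte mapping instead of the set.mm file so that it is self-contained, and the iteration result is the one observed above on Python 3.8.5, which may differ on other versions.

```python
import mmap

# Anonymous 4-byte mapping (no backing file), used only for illustration.
with mmap.mmap(-1, 4) as mm:
    mm[:] = b'hoi!'
    indexed = mm[0]            # integer index: an int
    sliced = mm[0:1]           # slice: a bytes object
    iterated = next(iter(mm))  # first element a for loop would see
                               # (a 1-byte bytes on the 3.8.5 setup above)

print(type(indexed), type(sliced), type(iterated))
```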

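However, my main question remains unchanged: how can I efficiently make an mmap behave like a bytes-like object (i.e., both indexing and for give an int)?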

iterator

mmap

python

python-3.x

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-05-30 09:51:16

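"How can I efficiently make an mmap behave like a bytes-like object (i.e., both indexing and for give an int)?"

A bytes is an object that holds its data in memory. The whole point of mmap, however, is to avoid loading all the data into memory.

If you want a bytes object containing the entire contents of the file, open() the file as usual and read() the whole thing. Using mmap() for that works against you.

Perhaps you want to use memoryview, which can be constructed from either a bytes or an mmap() and will give you a uniform API.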
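A minimal sketch of the memoryview suggestion follows. It assumes a throwaway temporary file in place of set.mm, and element_types() is a hypothetical helper introduced only for this illustration. The view is built on the mapping through the buffer protocol, so the file contents are not copied.

```python
import mmap
import os
import tempfile

def element_types(obj):
    """Hypothetical helper: (type of obj[0], type of first iterated element)."""
    return type(obj[0]), type(next(iter(obj)))

# Temporary sample file standing in for the question's 'set.mm' (assumption).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hoi!\n')
    path = tmp.name

try:
    with open(path, 'rb') as f:
        # access=ACCESS_READ is the portable spelling of a read-only mapping.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            mmap_types = element_types(mm)
            # Wrap the mapping without copying; release the view before
            # the mapping itself is closed.
            with memoryview(mm) as view:
                view_types = element_types(view)
                first_bytes = list(view[:4])
finally:
    os.remove(path)

print('mmap      :', mmap_types)
print('memoryview:', view_types)
```

Note that the view has to be released (here by its with block) before the mapping is closed; closing an mmap that still has an exported buffer raises BufferError.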

Votes: 1

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/67751074
