TL;DR: ThreadPoolExecutor is the cause. Memory usage with concurrent.futures.ThreadPoolExecutor in Python 3
Here is a Python script (heavily simplified) that runs an all-to-all routing algorithm and consumes all available memory in the process.
I understand that the problem is that the main function never returns, so the objects created inside it are never reclaimed by the garbage collector.
My main question is: can I write a consumer for the returned generator so that the data gets cleaned up? Or should I call the garbage collector directly?
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# thread pool executor like in the Python documentation example
def table_process(callable, total):
    # `threads` is defined in main() below; shown here as in the original post
    with ThreadPoolExecutor(max_workers=threads) as e:
        future_map = {
            e.submit(callable, i): i
            for i in range(total)
        }
        for future in as_completed(future_map):
            if future.exception() is None:
                yield future.result()
            else:
                raise future.exception()
```
```python
@argh.dispatch_command
def main():
    threads = 10
    data = pd.DataFrame(...)  # about 12K rows

    # this function routes only one slice of sources/destinations
    def _process_chunk(x: int) -> gpd.GeoDataFrame:
        # slicing is more complex, but simplified here for presentation
        # do a cross-product and an HTTP request to process the result
        result_df = _do_process(grid[x], grid)
        return result_df

    # writing to a geopackage
    with fiona.open('/tmp/some_file.gpkg', 'w', driver='GPKG', schema=...) as f:
        for results_df in table_process(_process_chunk, len(data)):
            aggregated_df = results_df.groupby('...').aggregate({...})
            f.writerecords(aggregated_df)
```

Posted on 2018-12-28 18:57:03
It turned out that ThreadPoolExecutor keeps its workers alive and does not release their memory.
The solution is described here: Memory usage with concurrent.futures.ThreadPoolExecutor in Python3
https://stackoverflow.com/questions/53951005
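For reference, one way to keep memory bounded in a generator like `table_process` is to submit work in batches rather than all at once, so that completed futures and their results can be garbage-collected between batches. This chunked-submission sketch is an assumption on my part, not the exact code from the linked answer; `table_process_chunked` and its `batch` parameter are names introduced here for illustration:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def table_process_chunked(func, total, threads=10, batch=100):
    """Yield results batch by batch so finished futures can be freed."""
    with ThreadPoolExecutor(max_workers=threads) as e:
        for start in range(0, total, batch):
            # only `batch` futures (and their results) are alive at a time
            futures = [e.submit(func, i)
                       for i in range(start, min(start + batch, total))]
            for future in as_completed(futures):
                yield future.result()
            # drop the references so the results can be collected
            futures.clear()
```

The consumer loop in `main()` stays unchanged; only the producer holds fewer live futures at any moment.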