How does jemalloc work? What are the benefits?

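Stack Overflow user

Asked 2009-10-26 21:17:21

4 answers, 47.8K views, 0 followers, 80 votes

Firefox 3 came with a new allocator: jemalloc.

I have heard in several places that this new allocator is better. The top Google results don't give any further information, though, and I am interested in how exactly it works.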

firefox

malloc

4 Answers

Stack Overflow user

Accepted answer

Answered 2009-10-26 21:21:07

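jemalloc first appeared for FreeBSD, the brainchild of one "Jason Evans", hence the "je". I'd ridicule him for being egotistical had I not once written an operating system called paxos :-)

See this PDF for full details. It's a white paper describing in detail how the algorithms work.

The main benefit is scalability on multi-processor and multi-threaded systems, achieved in part by using multiple arenas (the chunks of raw memory from which allocations are made).

In single-threaded situations there is no real benefit to multiple arenas, so a single arena is used.

In multi-threaded situations, however, many arenas are created (four times as many arenas as there are processors), and threads are assigned to these arenas in round-robin fashion.

This reduces lock contention: although multiple threads may call malloc or free concurrently, they only contend if they share the same arena. Two threads with different arenas do not affect each other.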
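To make the round-robin assignment concrete, here is a minimal sketch. It is not jemalloc's actual code: `Arena`, `ArenaPool`, and every method name are hypothetical, and the per-arena "state" is just a counter standing in for real allocator bookkeeping. It only illustrates the two points above: four arenas per processor in the multi-threaded case, and a lock that is per arena rather than global.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical arena: a lock plus the state it protects (a counter here).
struct Arena {
    std::mutex lock;
    std::size_t allocations = 0;  // stand-in for real allocator state
};

// Hypothetical pool that hands arenas to new threads in round-robin order.
class ArenaPool {
public:
    // Four arenas per processor when multi-threaded, otherwise one arena.
    ArenaPool(std::size_t ncpus, bool multithreaded)
        : arenas_(multithreaded ? 4 * ncpus : 1) {
        assert(ncpus > 0);
    }

    std::size_t arena_count() const { return arenas_.size(); }

    // Called once per new thread: returns the index of the arena it will use.
    std::size_t assign_next() {
        return next_.fetch_add(1, std::memory_order_relaxed) % arenas_.size();
    }

    // Only threads that were assigned the same arena contend on this lock.
    void record_allocation(std::size_t arena_index) {
        Arena& arena = arenas_.at(arena_index);
        std::lock_guard<std::mutex> guard(arena.lock);
        ++arena.allocations;
    }

    std::size_t allocations(std::size_t arena_index) {
        Arena& arena = arenas_.at(arena_index);
        std::lock_guard<std::mutex> guard(arena.lock);
        return arena.allocations;
    }

private:
    std::vector<Arena> arenas_;
    std::atomic<std::size_t> next_{0};
};
```

With `ArenaPool pool(2, true)`, the pool holds eight arenas and successive calls to `assign_next()` cycle through indices 0 to 7 before wrapping around.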

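In addition, jemalloc tries to optimise for cache locality, since fetching data from RAM is much slower than using data already in the CPU caches (conceptually no different from the gap between fast fetches from RAM and slow fetches from disk). To that end, it first tries to minimise overall memory usage, since that makes it more likely that the application's entire working set fits in cache.

Where that can't be achieved, it tries to ensure that allocations are contiguous, since memory allocated together tends to be used together.

According to the white paper, these strategies seem to give performance similar to the current best algorithms for single-threaded use while offering improvements for multi-threaded use.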

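136 votes

Stack Overflow user

Answered 2009-10-26 21:20:40

There is one interesting source, the C source itself: https://dxr.mozilla.org/mozilla-central/source/memory/build/mozjemalloc.cpp (old)

At the beginning of the file, a short summary roughly describes how it works.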

// This allocator implementation is designed to provide scalable performance
// for multi-threaded programs on multi-processor systems.  The following
// features are included for this purpose:
//
//   + Multiple arenas are used if there are multiple CPUs, which reduces lock
//     contention and cache sloshing.
//
//   + Cache line sharing between arenas is avoided for internal data
//     structures.
//
//   + Memory is managed in chunks and runs (chunks can be split into runs),
//     rather than as individual pages.  This provides a constant-time
//     mechanism for associating allocations with particular arenas.
//
// Allocation requests are rounded up to the nearest size class, and no record
// of the original request size is maintained.  Allocations are broken into
// categories according to size class.  Assuming runtime defaults, 4 kB pages
// and a 16 byte quantum on a 32-bit system, the size classes in each category
// are as follows:
//
//   |=====================================|
//   | Category | Subcategory    |    Size |
//   |=====================================|
//   | Small    | Tiny           |       4 |
//   |          |                |       8 |
//   |          |----------------+---------|
//   |          | Quantum-spaced |      16 |
//   |          |                |      32 |
//   |          |                |      48 |
//   |          |                |     ... |
//   |          |                |     480 |
//   |          |                |     496 |
//   |          |                |     512 |
//   |          |----------------+---------|
//   |          | Sub-page       |    1 kB |
//   |          |                |    2 kB |
//   |=====================================|
//   | Large                     |    4 kB |
//   |                           |    8 kB |
//   |                           |   12 kB |
//   |                           |     ... |
//   |                           | 1012 kB |
//   |                           | 1016 kB |
//   |                           | 1020 kB |
//   |=====================================|
//   | Huge                      |    1 MB |
//   |                           |    2 MB |
//   |                           |    3 MB |
//   |                           |     ... |
//   |=====================================|
//
// NOTE: Due to Mozilla bug 691003, we cannot reserve less than one word for an
// allocation on Linux or Mac.  So on 32-bit *nix, the smallest bucket size is
// 4 bytes, and on 64-bit, the smallest bucket size is 8 bytes.
//
// A different mechanism is used for each category:
//
//   Small : Each size class is segregated into its own set of runs.  Each run
//           maintains a bitmap of which regions are free/allocated.
//
//   Large : Each allocation is backed by a dedicated run.  Metadata are stored
//           in the associated arena chunk header maps.
//
//   Huge : Each allocation is backed by a dedicated contiguous set of chunks.
//          Metadata are stored in a separate red-black tree.
//
// *****************************************************************************
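The size-class table in that comment can be turned into a small rounding function. The sketch below is only an illustration under the comment's stated defaults (4 kB pages, 16-byte quantum, 32-bit system); `size_class`, `round_up`, and the constants are hypothetical names, not identifiers from mozjemalloc.cpp, and treating a zero-byte request as the smallest class is my own assumption.

```cpp
#include <cassert>
#include <cstddef>

// Defaults taken from the comment: 16-byte quantum, 4 kB pages, 1 MB huge step.
const std::size_t kQuantum = 16;
const std::size_t kPage = 4 * 1024;
const std::size_t kHugeStep = 1024 * 1024;
const std::size_t kQuantumMax = 512;        // largest quantum-spaced class
const std::size_t kLargeMax = 1020 * 1024;  // largest "Large" class

// Round n up to the next multiple of step.
inline std::size_t round_up(std::size_t n, std::size_t step) {
    assert(step != 0);
    return (n + step - 1) / step * step;
}

// Hypothetical helper: map a request size to the size class from the table.
inline std::size_t size_class(std::size_t request) {
    if (request <= 4) return 4;                                      // Tiny
    if (request <= 8) return 8;                                      // Tiny
    if (request <= kQuantumMax) return round_up(request, kQuantum);  // Quantum-spaced
    if (request <= 1024) return 1024;                                // Sub-page
    if (request <= 2048) return 2048;                                // Sub-page
    if (request <= kLargeMax) return round_up(request, kPage);       // Large
    return round_up(request, kHugeStep);                             // Huge
}
```

For example, under these assumptions a 17-byte request lands in the 32-byte class and a 2049-byte request is served as a 4 kB large allocation. Since no record of the original request size is kept, the difference is internal fragmentation.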

However, a more in-depth analysis of the algorithm is missing.
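As a small step in that direction, the "Small" mechanism from the comment (each run keeps a bitmap of which regions are free or allocated) might look roughly like the sketch below. `Run`, `kNoRegion`, and the method names are hypothetical, and picking the lowest-numbered free region is an assumption made for simplicity; the real implementation is more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returned by Run::allocate() when every region is in use.
const std::size_t kNoRegion = static_cast<std::size_t>(-1);

// Hypothetical run: a fixed number of equal-size regions tracked by a bitmap,
// one bit per region (1 = allocated, 0 = free).
class Run {
public:
    explicit Run(std::size_t region_count)
        : region_count_(region_count), used_((region_count + 63) / 64, 0) {}

    // Mark the lowest-numbered free region as allocated and return its index,
    // or kNoRegion if the run is full.
    std::size_t allocate() {
        for (std::size_t i = 0; i < region_count_; ++i) {
            const std::uint64_t mask = std::uint64_t{1} << (i % 64);
            if ((used_[i / 64] & mask) == 0) {
                used_[i / 64] |= mask;
                return i;
            }
        }
        return kNoRegion;
    }

    // Mark a previously allocated region as free again.
    void release(std::size_t index) {
        assert(index < region_count_);
        const std::uint64_t mask = std::uint64_t{1} << (index % 64);
        assert((used_[index / 64] & mask) != 0);  // guards against double free
        used_[index / 64] &= ~mask;
    }

private:
    std::size_t region_count_;
    std::vector<std::uint64_t> used_;
};
```

Because every region in a run has the same size, a region index is enough to locate the memory: its address would be the run's base plus the index times the size class.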

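14 votes

Stack Overflow user

Answered 2009-10-26 21:40:03

As to what benefits jemalloc brought to Mozilla, per http://blog.pavlov.net/2008/03/11/firefox-3-memory-usage/ (also the first Google result for mozilla+jemalloc):

... concluded that jemalloc gave us the smallest amount of fragmentation after running for a long period of time. ... Our automated tests on Windows Vista showed a 22% drop in memory usage when we turned jemalloc on.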

5 votes

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/1624726
