当在16 my上本地运行我的程序时,我得到以下情况:
15/04/10 20:07:50 INFO BlockManagerMaster: Updated info of block rdd_12_3
15/04/10 20:07:50 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
15/04/10 20:07:50 INFO BlockManagerInfo: Added rdd_12_6 in memory on 192.168.1.4:60005 (size: 854.0 KB, free: 682.9 MB)
15/04/10 20:07:50 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 8 non-empty blocks out of 8 blocks
15/04/10 20:07:50 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 0 ms
15/04/10 20:07:50 INFO BlockManagerMaster: Updated info of block rdd_12_6
15/04/10 20:07:50 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
15/04/10 20:07:50 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 8 non-empty blocks out of 8 blocks
15/04/10 20:07:50 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 0 ms
15/04/10 20:07:50 INFO ExternalAppendOnlyMap: Thread 67 spilling in-memory batch of 7.9 MB to disk (1 times so far)
15/04/10 20:07:50 INFO ExternalAppendOnlyMap: Thread 95 spilling in-memory batch of 5.0 MB to disk (1 times so far)
15/04/10 20:07:50 INFO ExternalAppendOnlyMap: Thread 66 spilling in-memory batch of 8.0 MB to disk (1 times so far)
15/04/10 20:07:50 INFO ExternalAppendOnlyMap: Thread 95 spilling in-memory batch of 5.0 MB to disk (2 timess so far)
15/04/10 20:07:50 INFO ExternalAppendOnlyMap: Thread 65 spilling in-memory batch of 5.8 MB to disk (1 times so far)
15/04/10 20:07:51 INFO ExternalAppendOnlyMap: Thread 67 spilling in-memory batch of 5.2 MB to disk (2 timess so far)
15/04/10 20:07:51 INFO ExternalAppendOnlyMap: Thread 66 spilling in-memory batch of 5.6 MB to disk (2 timess so far)
15/04/10 20:07:51 INFO ExternalAppendOnlyMap: Thread 95 spilling in-memory batch of 5.0 MB to disk (3 timess so far)
15/04/10 20:07:51 INFO ExternalAppendOnlyMap: Thread 65 spilling in-memory batch of 5.0 MB to disk (2 timess so far)
15/04/10 20:07:51 INFO ExternalAppendOnlyMap: Thread 61 spilling in-memory batch of 24.3 MB to disk (1 times so far)
15/04/10 20:07:52 INFO ExternalAppendOnlyMap: Thread 67 spilling in-memory batch of 5.0 MB to disk (3 timess so far)
15/04/10 20:07:52 INFO ExternalAppendOnlyMap: Thread 66 spilling in-memory batch of 5.0 MB to disk (3 timess so far)
15/04/10 20:07:52 INFO ExternalAppendOnlyMap: Thread 95 spilling in-memory batch of 5.0 MB to disk (4 timess so far)
15/04/10 20:07:52 INFO ExternalAppendOnlyMap: Thread 65 spilling in-memory batch of 5.3 MB to disk (3 timess so far)
15/04/10 20:07:52 INFO ExternalAppendOnlyMap: Thread 66 spilling in-memory batch of 5.0 MB to disk (4 timess so far)
15/04/10 20:07:52 INFO ExternalAppendOnlyMap: Thread 95 spilling in-memory batch of 5.2 MB to disk (5 timess so far)
15/04/10 20:07:52 INFO ExternalAppendOnlyMap: Thread 67 spilling in-memory batch of 5.8 MB to disk (4 timess so far)
15/04/10 20:07:53 INFO ExternalAppendOnlyMap: Thread 63 spilling in-memory batch of 35.6 MB to disk (1 times so far)
15/04/10 20:07:53 INFO ExternalAppendOnlyMap: Thread 65 spilling in-memory batch of 5.0 MB to disk (4 timess so far)
15/04/10 20:07:53 INFO ExternalAppendOnlyMap: Thread 66 spilling in-memory batch of 5.0 MB to disk (5 timess so far)
15/04/10 20:07:53 INFO ExternalAppendOnlyMap: Thread 95 spilling in-memory batch of 5.0 MB to disk (6 timess so far)
15/04/10 20:07:53 INFO MemoryStore: ensureFreeSpace(872616) called with curMem=1345765155, maxMem=2061647216
15/04/10 20:07:53 INFO MemoryStore: Block rdd_12_2 stored as values in memory (estimated size 852.2 KB, free 681.9 MB)
15/04/10 20:07:53 INFO BlockManagerInfo: Added rdd_12_2 in memory on 192.168.1.4:60005 (size: 852.2 KB, free: 682.0 MB)
15/04/10 20:07:53 INFO BlockManagerMaster: Updated info of block rdd_12_2我的理解是,它是否有空闲的内存,实际上大部分内存都是空闲的;由:
15/04/10 20:07:50 INFO BlockManagerInfo: Added rdd_12_6 in memory on 192.168.1.4:60005 (size: 854.0 KB, free: 682.9 MB)但是它正在溢出到磁盘上吗?我使用的是~265Mb的数据集,所以它真的不需要溢出到磁盘吗?
值得一提的是:
15/04/10 20:06:50 INFO MemoryStore: MemoryStore started with capacity 1966.1 MB在所有这些溢出到磁盘的情况下,运行一次我的程序需要大约5分钟。
为什么会发生这种情况?
发布于 2017-01-27 02:36:29
我发现我的一个专栏自始至终都有nulls,这导致了一个偏差,这导致了不断的溢出。
发布于 2015-04-11 05:57:49
有不同的记忆领域在发挥作用。对于缓存,Spark使用堆的spark.storage.memoryFraction (缺省为60%)。这就是大多数“可用内存”消息的内容。它使用堆的spark.shuffle.memoryFraction (缺省为20%)进行混洗。我想这就是泄漏信息的意义所在。通过将spark.shuffle.spill设置为false (默认为true),可以完全禁用随机溢出。
我不知道这是否能解释你所看到的一切。有关所有此类参数的说明,请参阅http://spark.apache.org/docs/latest/configuration.html。
https://stackoverflow.com/questions/29569239
复制相似问题