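I am trying to use ZGC in a production environment, so I upgraded the JDK from JDK 8 to OpenJDK 15, Tomcat 8 to Tomcat 8.5, and changed the GC-related options. A few hours after the JVM starts, however, CPU usage climbs to 1000+% (normal usage is 100%-300%). Sometimes, while the CPU is high, the log file contains many ICBufferFull safepoints: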
[2020-11-12T19:00:43.669+0800] Safepoint "ICBufferFull", Time since last: 41374119 ns, Reaching safepoint: 2026134 ns, At safepoint: 85993 ns, Total: 2112127 ns
[2020-11-12T19:00:43.672+0800] Safepoint "ICBufferFull", Time since last: 2521598 ns, Reaching safepoint: 1109894 ns, At safepoint: 57235 ns, Total: 1167129 ns
[2020-11-12T19:00:43.676+0800] Safepoint "ICBufferFull", Time since last: 2892867 ns, Reaching safepoint: 1240834 ns, At safepoint: 59633 ns, Total: 1300467 ns
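[2020-11-12T19:00:43.681+0800] Safepoint "ICBufferFull", Time since last: 2870110 ns, Reaching safepoint: 1425837 ns, At safepoint: 54052 ns, Total: 1479889 ns
If I take the node offline for about 30, CPU usage drops; after I bring it back online, it works normally for hours until the CPU climbs again. I checked the source code: ICBufferFull means "inline cache buffer is full", but I cannot find an option to increase its size. Can anyone help? Thanks!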
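The GC options are as follows:
export JAVA_OPTS='-Xms10g -Xmx10g -XX:+UseLargePages -XX:ZAllocationSpikeTolerance=5 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=4 -Xss2m -XX:+UseZGC -Xlog:gc,gc+phases,safepoint:file=/logs/gc.log:t:filecount=10,filesize=10m -XX:+HeapDumpOnOutOfMemoryError'
edit1:
As a reference, I have another host that runs fine on JDK 8 with CMS; the two hosts receive almost the same requests.
I profiled the process with async-profiler. The hottest method is java/lang/ThreadLocal$ThreadLocalMap.getEntryAfterMiss, which accounts for 50+% of samples; the hottest native method is ZMark::try_mark_object(ZMarkCache*, unsigned long, bool), which accounts for only 0.41%. I compared the ThreadLocal-related source code in JDK 8 and OpenJDK 15, and it does not appear to have changed.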
edit2:
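I found a similar bug in JIRA, and my application is also Lucene-related. Judging from the GC log, CPU usage is high when the weak reference count reaches 1M+. So the question becomes: how can weak references be collected more aggressively with ZGC?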
edit3:
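From the log below, taken with System.gc() called every 3 s, my application appears to produce too many weak references. Strangely, though, the number produced keeps increasing after startup, even though the request rate is almost constant from 11:00 to 17:00. Note that after GC(9821) the CPU dropped from 600% to 400% on its own, with about 250K enqueued. At GC(10265) the node was taken offline, with about 770K enqueued. Why does the enqueued count stay so small for such a long time? Does it mean these objects are not being fully reclaimed?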
[2020-11-19T11:00:00.245+0800] GC(992) Weak: 155658 encountered, 72334 discovered, 0 enqueued
[2020-11-19T12:00:00.397+0800] GC(2194) Weak: 220462 encountered, 122216 discovered, 1380 enqueued
[2020-11-19T12:00:03.411+0800] GC(2195) Weak: 220598 encountered, 107228 discovered, 677 enqueued
[2020-11-19T13:00:00.497+0800] GC(3395) Weak: 222536 encountered, 82199 discovered, 1713 enqueued
[2020-11-19T14:00:00.647+0800] GC(4613) Weak: 443946 encountered, 291651 discovered, 292 enqueued
[2020-11-19T15:00:01.173+0800] GC(5819) Weak: 338065 encountered, 124351 discovered, 815 enqueued
[2020-11-19T16:00:01.283+0800] GC(7022) Weak: 459070 encountered, 298932 discovered, 353 enqueued
[2020-11-19T17:00:01.426+0800] GC(8222) Weak: 688162 encountered, 519369 discovered, 4648 enqueued
[2020-11-19T18:00:01.556+0800] GC(9430) Weak: 1078757 encountered, 928748 discovered, 1691 enqueued
[2020-11-19T18:18:43.595+0800] GC(9821) Weak: 1022080 encountered, 841168 discovered, 247352 enqueued
[2020-11-19T18:18:46.592+0800] GC(9822) Weak: 774253 encountered, 568564 discovered, 3938 enqueued
[2020-11-19T18:40:49.616+0800] GC(10265) Weak: 842081 encountered, 788825 discovered, 767288 enqueued
[2020-11-19T18:40:52.593+0800] GC(10266) Weak: 74876 encountered, 18186 discovered, 1 enqueued
Answered 2020-11-16 18:02:16
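Modern GCs delay the collection of weakly reachable objects.
A collection triggered by System.gc() always processes weakly reachable objects and is concurrent by default, so you can implement a periodic task that calls that method.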
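A minimal sketch of such a periodic task, using only the standard java.util.concurrent API. The class name PeriodicGc, the thread name, and the period are illustrative assumptions, not something from the question; note also that System.gc() does nothing if the JVM runs with -XX:+DisableExplicitGC.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (name is an assumption): requests a GC cycle at a fixed interval.
final class PeriodicGc {
    private PeriodicGc() {}

    /** Starts a daemon thread that calls System.gc() every periodSeconds seconds. */
    static ScheduledExecutorService start(long periodSeconds) {
        // Daemon thread, so the task never keeps the JVM alive on shutdown.
        ThreadFactory daemonFactory = runnable -> {
            Thread t = new Thread(runnable, "periodic-gc");
            t.setDaemon(true);
            return t;
        };
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(daemonFactory);
        // Fixed delay rather than fixed rate: a slow call never causes a burst of catch-up calls.
        scheduler.scheduleWithFixedDelay(System::gc, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Calling PeriodicGc.start(3) once at application startup would request a cycle roughly every 3 seconds; the returned scheduler can be stopped with shutdownNow(). The right period depends on the workload and has to be found by measurement.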
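Answered 2020-11-27 11:43:17
The problem finally turned out to be a JDK issue. It should be fixed in JDK 16, and it can be worked around by creating a thread pool that terminates old threads and spawns new ones periodically. For unrelated reasons I moved the application to Jetty; the modified Jetty thread pool is here. It has now been running perfectly for several days, so if anyone hits the same problem, feel free to use this approach.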
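The idea of a pool that retires old threads and spawns new ones can be sketched with the standard java.util.concurrent classes. This is not the modified Jetty pool mentioned above; the class name RecyclingExecutor, its constructor parameters, and the swap-the-whole-pool strategy are assumptions made purely for illustration.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: periodically replaces the worker pool so that old threads
// (and whatever ThreadLocal state they have accumulated) are discarded.
final class RecyclingExecutor implements Executor, AutoCloseable {
    private final int threads;
    private final ScheduledExecutorService recycler;
    private ExecutorService current; // guarded by "this"
    private boolean closed;          // guarded by "this"

    RecyclingExecutor(int threads, long recyclePeriodSeconds) {
        this.threads = threads;
        this.current = Executors.newFixedThreadPool(threads);
        this.recycler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "pool-recycler");
            t.setDaemon(true);
            return t;
        });
        recycler.scheduleWithFixedDelay(this::recycle,
                recyclePeriodSeconds, recyclePeriodSeconds, TimeUnit.SECONDS);
    }

    @Override
    public synchronized void execute(Runnable task) {
        // Throws RejectedExecutionException once close() has been called.
        current.execute(task);
    }

    /** Swaps in a fresh pool; the old pool finishes its queued tasks, then its threads exit. */
    synchronized void recycle() {
        if (closed) {
            return;
        }
        ExecutorService old = current;
        current = Executors.newFixedThreadPool(threads);
        old.shutdown(); // graceful: already submitted tasks still run
    }

    @Override
    public synchronized void close() {
        closed = true;
        recycler.shutdownNow();
        current.shutdown();
    }
}
```

Because execute and recycle share one lock, a task is never handed to a pool that has already been shut down. A server's own pool (Tomcat's or Jetty's) has its own lifecycle and extension points, so a real workaround would have to be adapted to that container rather than copied from this sketch.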
https://stackoverflow.com/questions/64815418