我们有一个客户端应用程序(经过10+多年的开发)。它的JDK最近从OpenJDK 11升级到OpenJDK 14。与Java 11相比,在单CPU (超级线程禁用) Windows 10设置(以及在只有一个可用CPU的VirtualBox机器中)上,应用程序的启动速度非常慢。此外,它大部分时间使用100%的CPU。我们还可以将处理器关联设置为一个CPU (c:\windows\system32\cmd.exe /C start /affinity 1 ...)。
在我的VirtualBox机器中启动应用程序并执行最少手动交互的查询,可以进行一些度量:
-XX:-UseBiasedLocking:46秒-XX:-ThreadLocalHandshakes:40秒-XX:-UseBiasedLocking:3-3,5分钟只更改了使用过的JDK (和上述选项)。(-XX:-ThreadLocalHandshakes在Java14中不可用。)
我们已经尝试记录JDK 14使用-Xlog:all=debug:file=app.txt:uptime,tid,level,tags:filecount=50所做的事情。
用OpenJDK 11.0.2计算每秒的日志行似乎相当顺利:
$ cat jdk11-log/app* | grep "^\[" | cut -d. -f 1 | cut -d[ -f 2 | sort | uniq -c | sort -k 2 -n
30710 0
44012 1
55461 2
55974 3
27182 4
41292 5
43796 6
51889 7
54170 8
58850 9
51422 10
44378 11
41405 12
53589 13
41696 14
29526 15
2350 16
50228 17
62623 18
42684 19
45045 20另一方面,OpenJDK 14似乎有有趣的安静期:
$ cat jdk14-log/app* | grep "^\[" | cut -d. -f 1 | cut -d[ -f 2 | sort | uniq -c | sort -k 2 -n
7726 0
1715 5
10744 6
4341 11
42792 12
45979 13
38783 14
17253 21
34747 22
1025 28
2079 33
2398 39
3016 44那么,在1-4,7-10和14-20之间发生了什么?
...
[0.350s][7248][debug][class,resolve ] jdk.internal.ref.CleanerFactory$1 java.lang.Thread CleanerFactory.java:45
[0.350s][7248][debug][class,resolve ] jdk.internal.ref.CleanerImpl java.lang.Thread CleanerImpl.java:117
[0.350s][7248][info ][biasedlocking ] Aligned thread 0x000000001727e010 to 0x000000001727e800
[0.350s][7248][info ][os,thread ] Thread started (tid: 2944, attributes: stacksize: default, flags: CREATE_SUSPENDED STACK_SIZE_PARAM_IS)
[0.350s][6884][info ][os,thread ] Thread is alive (tid: 6884).
[0.350s][6884][debug][os,thread ] Thread 6884 stack dimensions: 0x00000000175b0000-0x00000000176b0000 (1024k).
[0.350s][6884][debug][os,thread ] Thread 6884 stack guard pages activated: 0x00000000175b0000-0x00000000175b4000.
[0.350s][7248][debug][thread,smr ] tid=7248: Threads::add: new ThreadsList=0x0000000017254500
[0.350s][7248][debug][thread,smr ] tid=7248: ThreadsSMRSupport::free_list: threads=0x0000000017253d50 is freed.
[0.350s][2944][info ][os,thread ] Thread is alive (tid: 2944).
[0.350s][2944][debug][os,thread ] Thread 2944 stack dimensions: 0x00000000177b0000-0x00000000178b0000 (1024k).
[0.350s][2944][debug][os,thread ] Thread 2944 stack guard pages activated: 0x00000000177b0000-0x00000000177b4000.
[0.351s][2944][debug][class,resolve ] java.lang.Thread java.lang.Runnable Thread.java:832
[0.351s][2944][debug][class,resolve ] jdk.internal.ref.CleanerImpl jdk.internal.misc.InnocuousThread CleanerImpl.java:135
[0.351s][2944][debug][class,resolve ] jdk.internal.ref.CleanerImpl jdk.internal.ref.PhantomCleanable CleanerImpl.java:138
[0.351s][2944][info ][biasedlocking,handshake] JavaThread 0x000000001727e800 handshaking JavaThread 0x000000000286d800 to revoke object 0x00000000c0087f78
[0.351s][2944][debug][vmthread ] Adding VM operation: HandshakeOneThread
[0.351s][6708][debug][vmthread ] Evaluating non-safepoint VM operation: HandshakeOneThread
[0.351s][6708][debug][vmoperation ] begin VM_Operation (0x00000000178af250): HandshakeOneThread, mode: no safepoint, requested by thread 0x000000001727e800
# no log until 5.723s
[5.723s][7248][info ][biasedlocking ] Revoked bias of currently-unlocked object
[5.723s][7248][debug][handshake,task ] Operation: RevokeOneBias for thread 0x000000000286d800, is_vm_thread: false, completed in 94800 ns
[5.723s][7248][debug][class,resolve ] java.util.zip.ZipFile$CleanableResource java.lang.ref.Cleaner ZipFile.java:715
[5.723s][7248][debug][class,resolve ] java.lang.ref.Cleaner jdk.internal.ref.CleanerImpl$PhantomCleanableRef Cleaner.java:220
[5.723s][7248][debug][class,resolve ] java.util.zip.ZipFile$CleanableResource java.util.WeakHashMap ZipFile.java:716
...第二次停顿稍晚一点:
...
[6.246s][7248][info ][class,load ] java.awt.Graphics source: jrt:/java.desktop
[6.246s][7248][debug][class,load ] klass: 0x0000000100081a00 super: 0x0000000100001080 loader: [loader data: 0x0000000002882bd0 of 'bootstrap'] bytes: 5625 checksum: 0025818f
[6.246s][7248][debug][class,resolve ] java.awt.Graphics java.lang.Object (super)
[6.246s][7248][info ][class,loader,constraints] updating constraint for name java/awt/Graphics, loader 'bootstrap', by setting class object
[6.246s][7248][debug][jit,compilation ] 19 4 java.lang.Object::<init> (1 bytes) made not entrant
[6.246s][7248][debug][vmthread ] Adding VM operation: HandshakeAllThreads
[6.246s][6708][debug][vmthread ] Evaluating non-safepoint VM operation: HandshakeAllThreads
[6.246s][6708][debug][vmoperation ] begin VM_Operation (0x000000000203ddf8): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[6.246s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026b0800, is_vm_thread: true, completed in 1400 ns
[6.246s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026bb800, is_vm_thread: true, completed in 700 ns
[6.246s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026ef800, is_vm_thread: true, completed in 100 ns
[6.246s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026f0800, is_vm_thread: true, completed in 100 ns
[6.246s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026f1800, is_vm_thread: true, completed in 100 ns
[6.246s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026f4800, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x0000000002768800, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x000000000276e000, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x0000000017268800, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x000000001727e800, is_vm_thread: true, completed in 800 ns
# no log until 11.783s
[11.783s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x000000000286d800, is_vm_thread: true, completed in 6300 ns
[11.783s][6708][info ][handshake ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 5536442500 ns
[11.783s][6708][debug][vmoperation ] end VM_Operation (0x000000000203ddf8): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[11.783s][7248][debug][protectiondomain ] Checking package access
[11.783s][7248][debug][protectiondomain ] class loader: a 'jdk/internal/loader/ClassLoaders$AppClassLoader'{0x00000000c0058628} protection domain: a 'java/security/ProtectionDomain'{0x00000000c058b948} loading: 'java/awt/Graphics'
[11.783s][7248][debug][protectiondomain ] granted
[11.783s][7248][debug][class,resolve ] sun.launcher.LauncherHelper java.awt.Graphics LauncherHelper.java:816 (reflection)
[11.783s][7248][debug][class,resolve ] jdk.internal.reflect.Reflection [Ljava.lang.reflect.Method; Reflection.java:300
[11.783s][7248][debug][class,preorder ] java.lang.PublicMethods$MethodList source: C:\Users\example\AppData\Local\example\stable\jdk\lib\modules
...然后是第三个:
...
[14.578s][7248][debug][class,preorder ] java.lang.InheritableThreadLocal source: C:\Users\example\AppData\Local\example\stable\jdk\lib\modules
[14.578s][7248][info ][class,load ] java.lang.InheritableThreadLocal source: jrt:/java.base
[14.578s][7248][debug][class,load ] klass: 0x0000000100124740 super: 0x0000000100021a18 loader: [loader data: 0x0000000002882bd0 of 'bootstrap'] bytes: 1338 checksum: 8013ed55
[14.578s][7248][debug][class,resolve ] java.lang.InheritableThreadLocal java.lang.ThreadLocal (super)
[14.578s][7248][debug][jit,compilation ] 699 3 java.lang.ThreadLocal::get (38 bytes) made not entrant
[14.578s][7248][debug][vmthread ] Adding VM operation: HandshakeAllThreads
[14.578s][6708][debug][vmthread ] Evaluating non-safepoint VM operation: HandshakeAllThreads
[14.578s][6708][debug][vmoperation ] begin VM_Operation (0x000000000203d228): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026b0800, is_vm_thread: true, completed in 1600 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026bb800, is_vm_thread: true, completed in 900 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026ef800, is_vm_thread: true, completed in 100 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026f0800, is_vm_thread: true, completed in 100 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026f1800, is_vm_thread: true, completed in 100 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x00000000026f4800, is_vm_thread: true, completed in 0 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x0000000002768800, is_vm_thread: true, completed in 0 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x000000000276e000, is_vm_thread: true, completed in 0 ns
[14.578s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x0000000017268800, is_vm_thread: true, completed in 0 ns
[14.579s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x000000001727e800, is_vm_thread: true, completed in 900 ns
# no log until 21.455s
[21.455s][6708][debug][handshake,task ] Operation: Deoptimize for thread 0x000000000286d800, is_vm_thread: true, completed in 12100 ns
[21.455s][6708][info ][handshake ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 6876829000 ns
[21.455s][6708][debug][vmoperation ] end VM_Operation (0x000000000203d228): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[21.455s][7248][debug][class,resolve ] sun.security.jca.Providers java.lang.InheritableThreadLocal Providers.java:39
[21.455s][7248][info ][class,init ] 1251 Initializing 'java/lang/InheritableThreadLocal'(no method) (0x0000000100124740)
[21.455s][7248][debug][class,resolve ] java.lang.InheritableThreadLocal java.lang.ThreadLocal InheritableThreadLocal.java:57
[21.456s][7248][debug][class,preorder ] sun.security.jca.ProviderList source: C:\Users\example\AppData\Local\example\stable\jdk\lib\modules
[21.456s][7248][info ][class,load ] sun.security.jca.ProviderList source: jrt:/java.base
[21.456s][7248][debug][class,load ] klass: 0x00000001001249a8 super: 0x0000000100001080 loader: [loader data: 0x0000000002882bd0 of 'bootstrap'] bytes: 11522 checksum: bdc239d2
[21.456s][7248][debug][class,resolve ] sun.security.jca.ProviderList java.lang.Object (super)
...以下两行似乎很有趣:
[11.783s][6708][info ][handshake ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 5536442500 ns
[21.455s][6708][info ][handshake ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 6876829000 ns这些握手用了5.5秒和6.8秒是正常的吗?
在使用以下命令运行的update4j演示应用程序 (与我们的应用程序完全无关)中,我经历了同样的减速(以及类似的日志):
Z:\swing>\jdk-14\bin\java -Xlog:all=debug:file=app.txt:uptime,tid,level,tags:filecount=50 \
-jar update4j-1.4.5.jar --remote http://docs.update4j.org/demo/setup.xml在单CPU Windows 10安装程序上,我应该寻找什么使我们的应用再次更快?我可以通过更改应用程序中的某些内容或添加JVM参数来解决这个问题吗?
这是JDK的错误吗,我应该报告一下吗?
更新2020-04-25:
据我所见,日志文件还包含GC日志。这些是第一个GC日志:
$ cat app.txt.00 | grep "\[gc"
[0.016s][7248][debug][gc,heap ] Minimum heap 8388608 Initial heap 60817408 Maximum heap 1073741824
[0.017s][7248][info ][gc,heap,coops ] Heap address: 0x00000000c0000000, size: 1024 MB, Compressed Oops mode: 32-bit
[0.018s][7248][info ][gc ] Using Serial
[22.863s][6708][info ][gc,start ] GC(0) Pause Young (Allocation Failure)
[22.863s][6708][debug][gc,heap ] GC(0) Heap before GC invocations=0 (full 0): def new generation total 17856K, used 15936K [0x00000000c0000000, 0x00000000c1350000, 0x00000000d5550000)
...不幸的是,它似乎不相关,因为它开始后,第三次暂停。
更新2020-04-26:
使用OpenJDK 14,应用程序在我的(单CPU) VirtualBox计算机(运行在i7-6600UCPU上)中使用100%的CPU。虚拟机有3,5 GB内存。根据任务管理器,40%+是免费的,磁盘活动是0% (我猜这意味着没有交换)。将另一个CPU添加到虚拟机(并为物理机器启用超线程)再次使应用程序足够快。我只是想知道,在JDK开发过程中,是否有意在(罕见的)单CPU机器上丢失性能,从而使JVM在多核/超线程CPU上更快?
发布于 2020-05-02 23:06:31
TL;DR:这是一个OpenJDK回归字段为JDK-8244340,已经在JDK15Build24 (2020/5/20)中得到了修正。
我没有例外,但我可以用一个简单的hello世界来再现这个问题:
public class Main {
public static void main(String[] args) {
System.out.println("Hello world");
}
}我使用了这两个批处理文件:
main-1cpu.bat,它将java进程限制为一个CPU:
c:\windows\system32\cmd.exe /C start /affinity 1 \
\jdk-14\bin\java \
-Xlog:all=trace:file=app-1cpu.txt:uptime,tid,level,tags:filecount=50 \
Mainmain-full.bat,java进程可以同时使用以下两个CPU:
c:\windows\system32\cmd.exe /C start /affinity FF \
\jdk-14\bin\java \
-Xlog:all=trace:file=app-full.txt:uptime,tid,level,tags:filecount=50 \
Main(区别是日志文件的affinity值和名称。我包装它是为了更容易阅读,但是用\包装可能无法在Windows上工作。)
在Windows 10 x64上使用VirtualBox (有两个CPU)进行一些测量:
PS Z:\main> Measure-Command { .\main-1cpu.bat }
...
TotalSeconds : 7.0203455
...
PS Z:\main> Measure-Command { .\main-full.bat }
...
TotalSeconds : 1.5751352
...
PS Z:\main> Measure-Command { .\main-full.bat }
...
TotalSeconds : 1.5585384
...
PS Z:\main> Measure-Command { .\main-1cpu.bat }
...
TotalSeconds : 23.6482685
...生成的示踪包含类似的停顿,您可以在问题中看到。
运行不带示踪剂的Main更快,但单CPU和双CPU版本之间的区别仍然存在:~4-7秒和~400 ms。
我已经派了这一发现指向hotspot@openjdk邮件列表。和devs去那里证实了这是JDK能够更好地处理的事情。。您也可以在线程中找到假定的修正。关于hotspot上回归的另一个线程-运行时-dev@。修复的JIRA问题:JDK-8244340
发布于 2020-04-26 06:45:16
根据我的经验,JDK的性能问题主要与以下之一有关:
如果您只使用自OpenJDK11以来的默认JVM配置,则可能应该将一些比较突出的选项设置为固定值,如GC、堆大小等。
也许一些图形分析工具可以帮助跟踪您的问题。比如回溯,AppDynamics或FlightRecorder等等。这些文件提供了关于给定时间堆、GC周期、RAM、线程、CPU负载等总体状态的更多概述,而日志文件无法提供这些信息。
我是否正确地理解您的应用程序在运行的第一秒钟内(在OpenJDK11下)就将大约30710行写入日志?为什么“只”在第一秒钟就在OpenJDK14下写了7k行?对我来说,对于刚刚在不同JVM上启动的应用程序来说,这似乎是一个巨大的不同。例如,是否确定日志中没有大量的异常堆栈跟踪?
其他的数字有时甚至更高,所以可能减速与异常日志记录有关吗?甚至交换,如果内存变低?
实际上,我认为,如果应用程序没有向日志中写入任何内容,这是一个平稳运行的迹象,没有问题(除非此时完全冻结)。从12秒钟到22秒发生的事情(在这里的OpenJDK14案例中)是什么会让我更担心.记录在案的线路穿过屋顶..。为什么?
后来测井值降到了1-2k线的最低点.原因是什么??(好吧,也许是GC在第22位就开始了,然后用一张表格解决了一些事情.)
另一件事可能是你关于“单CPU”机器的陈述。这是否意味着“单一核心”(Idk,也许您的软件是针对遗留硬件或其他什么的)?
而“单CPU”VM正在这些机器上运行吗?但我想,这些假设我错了,因为现在几乎所有的CPU都是多核的.但我可能会研究多线程问题(死锁)。
发布于 2020-04-26 20:24:46
因为它使用的是100%的CPU“大部分时间”,它需要花费10倍的时间(!)使用Java 14,意味着您在Java 14中浪费了90%的CPU。
耗尽堆空间可以做到这一点,因为您在GC上花费了大量的时间,但您似乎已经排除了这一点。
我注意到您正在调整偏置锁定选项,这会产生很大的不同。这告诉我,也许您的程序在多个线程中做了很多并发工作。您的程序可能有一个并发错误,出现在Java 14中,而不是Java 10中,这也解释了为什么添加另一个CPU会使它的速度提高一倍以上。
并发错误通常只在您运气不好时才会出现,并且触发的可能是任何东西,比如对hashmap组织的更改等等。
首先,如果可行的话,检查任何可能繁忙的循环--等待而不是睡觉。
然后,在抽样模式下运行一个分析器(jvisualvm将这样做),并寻找所需时间占总时间的比例远大于应有时间的方法。由于你的表现下降了10倍,任何问题都应该跳出来。
https://stackoverflow.com/questions/61375565
复制相似问题