
Infinispan 9.4.16, JBoss EAP 7.3 lock contention with a replicated cache: threads on 2 nodes in TIMED_WAITING (parking)

Asked by a Stack Overflow user on 2021-04-21 02:16:20
1 answer, 207 views, 0 followers, 0 votes

I have an application that currently relies on an Infinispan replicated cache to share a work queue across all nodes. The queue is fairly standard, with head, tail, and size pointers persisted in an Infinispan map.
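
To make the setup concrete, here is a minimal sketch of such a queue. The question does not show the InfinispanQueue source, so the key names, value types, and initialize() body below are illustrative assumptions (the QUEUE_TAIL-style key name is suggested by the error log further down):

import org.infinispan.Cache;

public class SharedQueueSketch {

    // Pointer keys stored in the replicated cache; the names are assumed
    private static final String HEAD_KEY = "QUEUE_HEAD";
    private static final String TAIL_KEY = "QUEUE_TAIL";
    private static final String SIZE_KEY = "QUEUE_SIZE";

    private final Cache<String, Long> cache;

    SharedQueueSketch(Cache<String, Long> cache) {
        this.cache = cache;
    }

    // Seeds the pointers once for the whole cluster. On a pessimistic,
    // transactional cache, lock() acquires cluster-wide locks on the pointer
    // keys; this is the step that parks in the stack traces below.
    void initialize() {
        cache.getAdvancedCache().lock(HEAD_KEY, TAIL_KEY, SIZE_KEY);
        cache.putIfAbsent(HEAD_KEY, 0L);
        cache.putIfAbsent(TAIL_KEY, 0L);
        cache.putIfAbsent(SIZE_KEY, 0L);
    }
}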

We have upgraded from Infinispan 7.2.5 to 9.4.16 and noticed that lock performance is much worse than before. I managed to get thread dumps from both nodes while they were simultaneously trying to initialize the queue. On Infinispan 7.2.5, locking and synchronization performed very well with no issues; now we are seeing lock timeouts and far more failures.

Partial stack trace from the thread dump on Node #1, 2021-04-20 13:45:13:

"default task-2" #600 prio=5 os_prio=0 tid=0x000000000c559000 nid=0x1f8a waiting on condition [0x00007f4df3f72000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006e1f4fec0> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:105)
    at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:38)
    at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1077)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1057)
    at org.infinispan.cache.impl.AbstractDelegatingAdvancedCache.lock(AbstractDelegatingAdvancedCache.java:286)
    at org.infinispan.cache.impl.EncoderCache.lock(EncoderCache.java:318)
    at com.siperian.mrm.match.InfinispanQueue.initialize(InfinispanQueue.java:88)

Partial stack trace from the thread dump on Node #2, 2021-04-20 13:45:04:

"default task-2" #684 prio=5 os_prio=0 tid=0x0000000011f26000 nid=0x3c60 waiting on condition [0x00007f55107e4000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000746bd36d8> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:105)
    at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:38)
    at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1077)
    at org.infinispan.cache.impl.CacheImpl.lock(CacheImpl.java:1057)
    at org.infinispan.cache.impl.AbstractDelegatingAdvancedCache.lock(AbstractDelegatingAdvancedCache.java:286)
    at org.infinispan.cache.impl.EncoderCache.lock(EncoderCache.java:318)
    at com.siperian.mrm.match.InfinispanQueue.initialize(InfinispanQueue.java:88)

Client error that popped up on the console of the machine running Node #1:

2021-04-20 13:45:49,069 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (jgroups-15,infinispan-cleanse-cluster_192.168.0.24_cmx_system105,N1618938080334-63633(machine-id=M1618938080334)) ISPN000136: Error executing command LockControlCommand on Cache 'orclmdm-MDM_SAMPLE105/FUZZY_MATCH', writing keys []: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 60 seconds for key QUEUE_TAIL_C_PARTY and requestor GlobalTx:N1618938080334-63633(machine-id=M1618938080334):429. Lock is held by GlobalTx:N1618938062946-60114(machine-id=M1618938062946):420
    at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.get(DefaultLockManager.java:288)
    at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.lock(DefaultLockManager.java:261)
    at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$CompositeLockPromise.lock(DefaultLockManager.java:348)
    at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.localLockCommandWork(PessimisticLockingInterceptor.java:208)
    at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.lambda$new$0(PessimisticLockingInterceptor.java:46)
    at org.infinispan.interceptors.InvocationSuccessFunction.apply(InvocationSuccessFunction.java:25)
    at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:118)
    at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:81)
    at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:30)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
    at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
    at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
    at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1369)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1272)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:126)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1417)
    at org.jgroups.JChannel.up(JChannel.java:816)
    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:900)
    at org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:128)
    at org.jgroups.protocols.RSVP.up(RSVP.java:163)
    at org.jgroups.protocols.FRAG2.up(FRAG2.java:177)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
    at org.jgroups.protocols.pbcast.GMS.up(GMS.java:872)
    at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
    at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1008)
    at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:734)
    at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:389)
    at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:590)
    at org.jgroups.protocols.BARRIER.up(BARRIER.java:171)
    at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:131)
    at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:203)
    at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:253)
    at org.jgroups.protocols.MERGE3.up(MERGE3.java:280)
    at org.jgroups.protocols.Discovery.up(Discovery.java:295)
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1250)
    at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

The Infinispan configuration:

<?xml version="1.0" encoding="UTF-8"?>
<infinispan
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:infinispan:config:9.4 http://www.infinispan.org/schemas/infinispan-config-9.4.xsd"
        xmlns="urn:infinispan:config:9.4">    

    <jgroups>
        <stack-file name="mdmudp" path="$cmx.home$/jgroups-udp.xml" />
        <stack-file name="mdmtcp" path="$cmx.home$/jgroups-tcp.xml" />
    </jgroups>

    <cache-container name="MDMCacheManager" statistics="true"
        shutdown-hook="DEFAULT">
        <transport stack="mdmudp" cluster="infinispan-cluster"
            node-name="$node$" machine="$machine$" />

        <jmx domain="org.infinispan.mdm.hub"/>  

        <replicated-cache name="FUZZY_MATCH" statistics="true" unreliable-return-values="false">
            <locking isolation="READ_COMMITTED" acquire-timeout="60000"
                concurrency-level="5000" striping="false" />
            <transaction
                transaction-manager-lookup="org.infinispan.transaction.lookup.GenericTransactionManagerLookup"
                stop-timeout="30000" auto-commit="true" locking="PESSIMISTIC"
                mode="NON_XA" notifications="true" />
        </replicated-cache>

    </cache-container>
</infinispan>

We use UDP multicast by default; here is the UDP configuration:

<!--
  Default stack using IP multicasting. It is similar to the "udp"
  stack in stacks.xml, but doesn't use streaming state transfer and flushing
  author: Bela Ban
-->

<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
    <UDP
         mcast_port="${jgroups.udp.mcast_port:46688}"
         ip_ttl="4"
         tos="8"
         ucast_recv_buf_size="5M"
         ucast_send_buf_size="5M"
         mcast_recv_buf_size="5M"
         mcast_send_buf_size="5M"
         max_bundle_size="64K"
         enable_diagnostics="true"
         thread_naming_pattern="cl"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"/>

    <PING />
    <MERGE3 max_interval="30000"
            min_interval="10000"/>
    <FD_SOCK/>
    <FD_ALL/>
    <VERIFY_SUSPECT timeout="1500"  />
    <BARRIER />
    <pbcast.NAKACK2 xmit_interval="500"
                    xmit_table_num_rows="100"
                    xmit_table_msgs_per_row="2000"
                    xmit_table_max_compaction_time="30000"
                    use_mcast_xmit="false"
                    discard_delivered_msgs="true"/>
    <UNICAST3 xmit_interval="500"
              xmit_table_num_rows="100"
              xmit_table_msgs_per_row="2000"
              xmit_table_max_compaction_time="60000"
              conn_expiry_timeout="0"/>
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="4M"/>
    <pbcast.GMS print_local_addr="true" join_timeout="2000"
                view_bundling="true"/>
    <UFC max_credits="2M"
         min_threshold="0.4"/>
    <MFC max_credits="2M"
         min_threshold="0.4"/>
    <FRAG2 frag_size="60K"  />
    <RSVP resend_interval="2000" timeout="10000"/>
    <pbcast.STATE_TRANSFER />
    <!-- pbcast.FLUSH  /-->
</config>

Any ideas about the configuration would be greatly appreciated. What happens is that both nodes time out and the queue is never initialized correctly (null keys). Thanks in advance. Incidentally, there are 24 threads on each node (48 in total) that can access the shared queue.


1 Answer

Stack Overflow user

Answered on 2021-04-24 11:30:51

I did some research and found that locking against a replicated cache is performed against the remote nodes first, before attempting to lock the key locally. I believe that if node1 tries to lock node2 at the same time node2 tries to lock node1, a deadlock can occur. So I changed all caches to use Flag.FAIL_SILENTLY and Flag.ZERO_LOCK_ACQUISITION_TIMEOUT, and added extra retry logic on the client side when adding or removing elements from the queue. Based on initial testing, things look much better now.
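
As a rough illustration of that workaround (the answer does not include the actual client code, so the helper name, key, and retry/backoff parameters below are assumptions), the flags can be applied through the AdvancedCache API:

import org.infinispan.AdvancedCache;
import org.infinispan.Cache;
import org.infinispan.context.Flag;

public class QueueLockHelper {

    private static final int MAX_ATTEMPTS = 5;    // illustrative retry budget
    private static final long BACKOFF_MS = 200L;  // illustrative backoff step

    // Tries to lock a queue-pointer key without blocking. With
    // ZERO_LOCK_ACQUISITION_TIMEOUT the lock attempt does not wait, and with
    // FAIL_SILENTLY a failed attempt returns false instead of throwing a
    // TimeoutException, so the caller can back off and retry instead of
    // parking for the full 60-second acquire-timeout.
    static boolean tryLockWithRetry(Cache<String, Object> cache, String key)
            throws InterruptedException {
        AdvancedCache<String, Object> advanced = cache.getAdvancedCache()
                .withFlags(Flag.ZERO_LOCK_ACQUISITION_TIMEOUT, Flag.FAIL_SILENTLY);
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            // lock() requires an active transaction on a pessimistic cache
            if (advanced.lock(key)) {
                return true;
            }
            Thread.sleep(BACKOFF_MS * attempt); // simple linear backoff
        }
        return false;
    }
}

Because neither node ever blocks waiting for the other's lock, the circular wait behind the deadlock cannot form; the losing node simply backs off and retries.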

I am curious what changed between Infinispan 7 and later releases that makes pessimistic locking behave so much worse in newer versions. The old client code (with no flags or retry logic) worked fine under the same test conditions. I suspect changes related to the use of futures and the ForkJoinPool, since I have run into problems with them in other projects and had to fall back to the older approach of using standard executors.

0 votes
Original page content provided by Stack Overflow; translation supported by Tencent Cloud's translation engine.
Original link: https://stackoverflow.com/questions/67184274
