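I'm seeing an issue where multiple threads deadlock on the same line of code. I can't reproduce it locally or in any test, but thread dumps from production show the problem quite clearly.
I can't see why the threads would be blocked on the synchronized line below, since there is no other synchronization on that object in the call stack or in any other thread. Does anyone have an idea what is going on, or how I could reproduce it? (I'm currently trying with 15 threads in a loop, processing 2000 tasks through the queue concurrently, but I can't reproduce it.)
In the thread dump below, I think the multiple threads showing "locked" status may be a manifestation of Java bug id=8047816, where jstack reports threads in the wrong state. (I'm using JDK version 1.7.0_51.)
Cheers!
Here is a view of the threads from the thread dump.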
"xxx>Job Read-3" daemon prio=10 tid=0x00002aca001a6800 nid=0x6a3b waiting for monitor entry [0x0000000052ec4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - locked <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
   Locked ownable synchronizers:
    - <0x00002aaf5f9c2680> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"xxx>Job Read-2" daemon prio=10 tid=0x00002aca001a5000 nid=0x6a3a waiting for monitor entry [0x0000000052d83000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - locked <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
   Locked ownable synchronizers:
    - <0x00002aaf5f9ed518> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"xxx>Job Read-1" daemon prio=10 tid=0x00002aca00183000 nid=0x6a39 waiting for monitor entry [0x0000000052c42000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
   Locked ownable synchronizers:
    - <0x00002aaf5f9ecde8> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"xxx>Job Read-0" daemon prio=10 tid=0x0000000006a83000 nid=0x6a36 waiting for monitor entry [0x000000005287f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
    at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
    at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
    at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Here is the extracted Java code, which shows where the error occurs.
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class Deadlock {
    final Deque<Object> delegate = new ArrayDeque<>();
    final long maxSize = Long.MAX_VALUE;
    private final AtomicLong totalExec = new AtomicLong();
    private final Map<Object, AtomicLong> totals = new HashMap<>();
    private final Map<Object, Deque<Long>> execTimes = new HashMap<>();

    public void trim() {
        //Possible optimization is evicting in chunks, segmenting by arrival time
        while (this.totalExec.longValue() > this.maxSize) {
            final Object t = this.delegate.peek();
            final Deque<Long> execTime = this.execTimes.get(t);
            final Long exec = execTime.peek();
            if (exec != null && this.totalExec.longValue() - exec > this.maxSize) {
                //If Job Started Inside of Window, remove and re-loop
                remove();
            }
            else {
                //Otherwise exit the loop
                break;
            }
        }
    }

    public Object remove() {
        Object removed;
        synchronized (this.delegate) { //4 Threads deadlocking on this line !
            removed = this.delegate.pollFirst();
        }
        if (removed != null) {
            itemRemoved(removed);
        }
        return removed;
    }

    public void itemRemoved(final Object t) {
        //Decrement Total & Queue
        final AtomicLong catTotal = this.totals.get(t);
        if (catTotal != null) {
            if (!this.execTimes.get(t).isEmpty()) {
                final Long exec = this.execTimes.get(t).pollFirst();
                if (exec != null) {
                    catTotal.addAndGet(-exec);
                    this.totalExec.addAndGet(-exec);
                }
            }
        }
    }
}
Posted 2015-02-12 14:43:51
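Thanks to the replies here, it is clear that the problem was the non-thread-safe use of several collections.
To fix it, I made the trim method synchronized and replaced the HashMap with a ConcurrentHashMap and the ArrayDeque with a LinkedBlockingDeque. (Concurrent collections FTW!)
Another planned enhancement is to replace the two separate maps with a single Map holding a custom object, so that the operations in itemRemoved can be kept atomic.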
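For readers who want to see the shape of that fix, here is a minimal sketch. It is an illustration only, not the production class: the names TrackedQueue, Stats, add, size and totalExec() are hypothetical, and it assumes Java 8+ for computeIfAbsent (on JDK 7 a putIfAbsent-based equivalent would be needed). It keeps the trimming condition from the question and folds the two maps into one map of per-item holders.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical rework of the queue from the question; not the production class.
class TrackedQueue {
    // One holder per item replaces the separate totals and execTimes maps,
    // so the per-item total and its timings always change together.
    static final class Stats {
        private long total;
        private final Deque<Long> execTimes = new ArrayDeque<>();

        synchronized void record(long exec) {
            execTimes.addLast(exec);
            total += exec;
        }

        synchronized Long oldest() {
            return execTimes.peekFirst();
        }

        // Drops the oldest timing and returns it, or 0 if nothing is recorded.
        synchronized long removeOldest() {
            Long exec = execTimes.pollFirst();
            if (exec == null) {
                return 0L;
            }
            total -= exec;
            return exec;
        }

        synchronized long total() {
            return total;
        }
    }

    private final LinkedBlockingDeque<Object> delegate = new LinkedBlockingDeque<>();
    private final Map<Object, Stats> stats = new ConcurrentHashMap<>();
    private final AtomicLong totalExec = new AtomicLong();
    private final long maxSize;

    TrackedQueue(long maxSize) {
        this.maxSize = maxSize;
    }

    void add(Object item, long exec) {
        // Record the stats before publishing the item, so trim() never
        // sees a queued item that has no stats yet.
        stats.computeIfAbsent(item, k -> new Stats()).record(exec);
        totalExec.addAndGet(exec);
        delegate.addLast(item);
    }

    // Only one thread trims at a time.
    synchronized void trim() {
        while (totalExec.get() > maxSize) {
            Object head = delegate.peekFirst();
            Stats s = (head == null) ? null : stats.get(head);
            Long exec = (s == null) ? null : s.oldest();
            if (exec != null && totalExec.get() - exec > maxSize) {
                remove();
            } else {
                break; // empty queue, missing stats, or window satisfied
            }
        }
    }

    Object remove() {
        Object removed = delegate.pollFirst(); // thread-safe without an external lock
        if (removed != null) {
            Stats s = stats.get(removed);
            if (s != null) {
                totalExec.addAndGet(-s.removeOldest());
            }
        }
        return removed;
    }

    long totalExec() {
        return totalExec.get();
    }

    int size() {
        return delegate.size();
    }
}
```

Note that remove() can still be called directly while another thread is inside trim(); in this sketch that only means trim() may evict the next item rather than the one it peeked at, it cannot corrupt the collections.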
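Posted 2015-02-09 15:04:26
From the HashMap Javadoc:
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.
(emphasis theirs)
You are reading from and writing to the Map in an unsynchronized way.
I see no reason to assume your code is thread safe.
I suspect you have an infinite loop in trim, caused by the lack of thread safety.
Entering a synchronized block is relatively slow, so a thread dump will most likely always show at least a few threads waiting to acquire the lock.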
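As a rough illustration of what "synchronized externally" means here, the sketch below guards both maps and the running total with one lock, so every compound read-modify-write happens atomically. The class name GuardedTotals and its methods are hypothetical and only loosely modelled on the fields in the question; this is one possible arrangement, not the only one.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: plain HashMaps made safe by a single shared lock.
class GuardedTotals {
    private final Object lock = new Object();
    private final Map<Object, Long> totals = new HashMap<>();
    private final Map<Object, Deque<Long>> execTimes = new HashMap<>();
    private long totalExec;

    void record(Object key, long exec) {
        synchronized (lock) {
            Deque<Long> times = execTimes.get(key);
            if (times == null) {
                times = new ArrayDeque<>();
                execTimes.put(key, times);
            }
            times.addLast(exec);
            Long old = totals.get(key);
            totals.put(key, (old == null ? 0L : old) + exec);
            totalExec += exec;
        }
    }

    // Removes the oldest timing for key; returns false if there is none.
    boolean removeOldest(Object key) {
        synchronized (lock) {
            Deque<Long> times = execTimes.get(key);
            Long exec = (times == null) ? null : times.pollFirst();
            if (exec == null) {
                return false;
            }
            // totals always has an entry once a timing was recorded for key
            totals.put(key, totals.get(key) - exec);
            totalExec -= exec;
            return true;
        }
    }

    long totalExec() {
        synchronized (lock) {
            return totalExec;
        }
    }
}
```

The point of the single lock is that the two maps and the total can never be observed half-updated, which is exactly the guarantee the unsynchronized code in the question lacks.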
Posted 2015-02-09 15:04:43
Your first thread is holding the lock while waiting on pollFirst.
"xxx>Job Read-3" daemon prio=10 tid=0x00002aca001a6800 nid=0x6a3b waiting for monitor entry [0x0000000052ec4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
    - locked <0x00002aae6465a650> (a java.util.ArrayDeque)
    at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
The other threads are waiting to acquire the lock. You would need to provide the whole thread dump to determine which thread holds the lock on 0x0000000052ec4000, which is what is preventing the pollFirst call from returning.
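If a full jstack dump is hard to get, the JVM can report lock owners from inside the process through the standard java.lang.management API. The following is a small sketch of that idea; the class name LockOwnerReport and its two helper methods are hypothetical, while ThreadMXBean, ThreadInfo and the calls made on them are part of the JDK. How and when to trigger it in production (a JMX operation, a signal handler, a watchdog thread) is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists blocked threads and the owner of the monitor they want.
class LockOwnerReport {
    // One line per BLOCKED thread: who waits, for which monitor, and who holds it.
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        List<String> lines = new ArrayList<>();
        for (ThreadInfo info : infos) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                lines.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return lines;
    }

    // True if the JVM itself detects a cycle of threads waiting on each other's monitors.
    static boolean hasDeadlock() {
        long[] ids = ManagementFactory.getThreadMXBean().findMonitorDeadlockedThreads();
        return ids != null && ids.length > 0;
    }

    public static void main(String[] args) {
        for (String line : blockedThreads()) {
            System.out.println(line);
        }
        System.out.println("monitor deadlock detected: " + hasDeadlock());
    }
}
```

If hasDeadlock() stays false while threads remain BLOCKED, that points away from a true lock cycle and toward one thread that holds the monitor and never releases it.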
https://stackoverflow.com/questions/28412556