Recently I read a blog post by mikeash that describes the detailed implementation of dispatch_once. I also got its source code from Mac OS Forge.
I understand most of the code, except for this one line:
dispatch_atomic_maximally_synchronizing_barrier();
It is a macro, defined as follows:
#define dispatch_atomic_maximally_synchronizing_barrier() \
do { unsigned long _clbr; __asm__ __volatile__( \
"cpuid" \
: "=a" (_clbr) : "0" (0) : "rbx", "rcx", "rdx", "cc", "memory" \
); } while(0)

I know it is used to "defeat the speculative read-ahead of peer CPUs", but I don't know what cpuid and the words after it mean. I know very little about assembly language.
Could someone explain it for me? Thanks a lot.
Posted on 2014-12-19 21:29:22
The libdispatch source code explains this quite well.
http://opensource.apple.com/source/libdispatch/libdispatch-442.1.4/src/shims/atomic.h
// see comment in dispatch_once.c
#define dispatch_atomic_maximally_synchronizing_barrier() \

http://opensource.apple.com/source/libdispatch/libdispatch-442.1.4/src/once.c
// The next barrier must be long and strong.
//
// The scenario: SMP systems with weakly ordered memory models
// and aggressive out-of-order instruction execution.
//
// The problem:
//
// The dispatch_once*() wrapper macro causes the callee's
// instruction stream to look like this (pseudo-RISC):
//
// load r5, pred-addr
// cmpi r5, -1
// beq 1f
// call dispatch_once*()
// 1f:
// load r6, data-addr
//
// May be re-ordered like so:
//
// load r6, data-addr
// load r5, pred-addr
// cmpi r5, -1
// beq 1f
// call dispatch_once*()
// 1f:
//
// Normally, a barrier on the read side is used to workaround
// the weakly ordered memory model. But barriers are expensive
// and we only need to synchronize once! After func(ctxt)
// completes, the predicate will be marked as "done" and the
// branch predictor will correctly skip the call to
// dispatch_once*().
//
// A far faster alternative solution: Defeat the speculative
// read-ahead of peer CPUs.
//
// Modern architectures will throw away speculative results
// once a branch mis-prediction occurs. Therefore, if we can
// ensure that the predicate is not marked as being complete
// until long after the last store by func(ctxt), then we have
// defeated the read-ahead of peer CPUs.
//
// In other words, the last "store" by func(ctxt) must complete
// and then N cycles must elapse before ~0l is stored to *val.
// The value of N is whatever is sufficient to defeat the
// read-ahead mechanism of peer CPUs.
//
// On some CPUs, the most fully synchronizing instruction might
// need to be issued.
dispatch_atomic_maximally_synchronizing_barrier();

For the x86_64 and i386 architectures, as @Michael mentioned, it uses the cpuid instruction to flush the instruction pipeline. cpuid is a serializing instruction, which prevents memory reordering. On the other architectures it uses __sync_synchronize.
https://gcc.gnu.org/onlinedocs/gcc-4.6.2/gcc/Atomic-Builtins.html
__sync_synchronize (...)
This builtin issues a full memory barrier. These builtins are considered a full barrier. That is, no memory operand will be moved across the operation, either forward or backward. Further, instructions will be issued as necessary to prevent the processor from speculating loads across the operation and from queuing stores after the operation.
https://stackoverflow.com/questions/27562334