Swift: convert [UInt32] -> [UInt8] -> [[UInt8]]

Stack Overflow user

Asked 2016-10-28 15:46:20

1 answer · 914 views · 0 followers · score 1

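I'm trying to speed up my current implementation of a function that converts [UInt32] to [UInt8], which is then split into [[UInt8]] with 6 values in each sub-array.

My implementation: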

extension Array {
    // Splits the array into consecutive chunks of at most `subSize` elements;
    // the last chunk may be shorter.
    func splitBy(subSize: Int) -> [[Element]] {
        return 0.stride(to: self.count, by: subSize).map { startIndex in
            let endIndex = startIndex.advancedBy(subSize, limit: self.count)
            return Array(self[startIndex ..< endIndex])
        }
    }
}



func convertWordToBytes(fullW : [UInt32]) -> [[UInt8]] {
    var combined8 = [UInt8]()

    //Convert 17 [UInt32] to 68 [UInt8]
    for i in 0...16{
        _ = 24.stride(through: 0, by: -8).map {
            combined8.append(UInt8(truncatingBitPattern: fullW[i] >> UInt32($0)))
        }
    }

    //Split [UInt8] to [[UInt8]] with 6 values at each index.
    let combined48 = combined8.splitBy(6) 

    return combined48
}

This function will be called a few million times in my program, and its speed is a huge burden.

Does anyone have any ideas? Thanks.

swift

cryptography

uint32

uint8array

arrays

1 Answer

Stack Overflow user

Accepted answer

Answered 2016-10-30 03:50:00

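If you profile your code (Cmd + I), you will see that most of the time is spent in various "copy to buffer" functions. This happens when you append a new element to an array that has run out of its initially allocated space, so its contents must be moved to a location on the heap with more room. Moral of the story: heap allocation is slow but unavoidable with arrays. Do it as few times as possible.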
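The reallocation behaviour described above can be observed by watching an array's `capacity` while appending. What follows is a minimal sketch in Swift 5 syntax (the code in this thread is Swift 2). `countReallocations` is a hypothetical helper, not something from the original post, and the exact growth pattern is an implementation detail of the standard library; the only property relied on is that reserving space up front avoids growth during the appends.

```swift
// Hypothetical helper: appends `count` bytes and reports how many times the
// array's capacity changed along the way (each change implies a new buffer).
func countReallocations(count: Int, reserve: Bool) -> Int {
    var bytes = [UInt8]()
    if reserve {
        // One allocation up front, sized for everything we will append.
        bytes.reserveCapacity(count)
    }
    var reallocations = 0
    var lastCapacity = bytes.capacity
    for i in 0..<count {
        bytes.append(UInt8(truncatingIfNeeded: i))
        if bytes.capacity != lastCapacity {
            reallocations += 1
            lastCapacity = bytes.capacity
        }
    }
    return reallocations
}

// 68 bytes is the size of the flattened buffer in the question (17 words * 4).
let growing = countReallocations(count: 68, reserve: false)
let reserved = countReallocations(count: 68, reserve: true)
print("without reserveCapacity: \(growing) reallocations")
print("with reserveCapacity: \(reserved) reallocations")
```

With the reservation in place the loop never has to grow the buffer, which is the effect the answer is after.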

Try this:

func convertWordToBytes2(fullW: [UInt32]) -> [[UInt8]] {
    let subSize = 6

    // We allocate the array only once per run since allocation is so slow
    // There will only be assignment to it after
    var combined48 = [UInt8](count: fullW.count * 4, repeatedValue: 0).splitBy(subSize)

    var row = 0
    var col = 0

    for i in 0...16 {
        for j in 24.stride(through: 0, by: -8) {
            let value = UInt8(truncatingBitPattern: fullW[i] >> UInt32(j))
            combined48[row][col] = value

            col += 1
            if col >= subSize {
                row += 1
                col = 0
            }
        }
    }

    return combined48
}
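For readers on a current toolchain, here is a hedged Swift 5 sketch of the same "allocate once, then only assign" idea. It is not the answerer's code: `convertWordsToByteRows` and its `rowSize` parameter are hypothetical names, and unlike the version above it iterates over every element of the input instead of the fixed range `0...16`.

```swift
// Swift 5 sketch: size every row up front, then fill by index.
// `convertWordsToByteRows` is a hypothetical name, not from the thread.
func convertWordsToByteRows(_ words: [UInt32], rowSize: Int = 6) -> [[UInt8]] {
    precondition(rowSize > 0, "rowSize must be positive")
    let byteCount = words.count * 4
    // Round up so a short final row is kept (68 bytes -> 11 rows of 6 + 1 row of 2).
    let rowCount = (byteCount + rowSize - 1) / rowSize

    // Allocate each row once with its final length; later writes are plain assignments.
    var rows: [[UInt8]] = (0..<rowCount).map { row in
        [UInt8](repeating: 0, count: min(rowSize, byteCount - row * rowSize))
    }

    var index = 0
    for word in words {
        // Most significant byte first, matching the shift order 24, 16, 8, 0.
        for shift in stride(from: 24, through: 0, by: -8) {
            rows[index / rowSize][index % rowSize] = UInt8(truncatingIfNeeded: word >> shift)
            index += 1
        }
    }
    return rows
}

let sampleRows = convertWordsToByteRows([0x01020304, 0x05060708])
print(sampleRows)
```

Whether this is faster than the Swift 2 version above has not been measured here; it only shows the same structure in current syntax.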

Benchmark code:

let testCases = (0..<1_000_000).map { _ in
    (0..<17).map { _ in arc4random() }
}

testCases.forEach {
    convertWordToBytes($0)
    convertWordToBytes2($0)
}

Results (on my 2012 iMac):

Weight          Self Weight         Symbol Name
9.35 s   53.2%  412.00 ms           specialized convertWordToBytes([UInt32]) -> [[UInt8]]
3.28 s   18.6%  344.00 ms           specialized convertWordToBytes2([UInt32]) -> [[UInt8]]

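By eliminating the repeated allocations, we have already cut the run time by more than 60% (from 9.35 s to 3.28 s). But every test case is independent, which makes the work a natural fit for parallel processing on today's multi-core CPUs. The modified loop: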

dispatch_apply(testCases.count, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)) { i in
    convertWordToBytes2(testCases[i])
}

This shaves about 1 second off the wall time when run on my quad-core i7 with 8 threads:

Weight    Self Weight       Symbol Name
2.28 s    6.4%  0 s         _dispatch_worker_thread3  0x58467
2.24 s    6.3%  0 s         _dispatch_worker_thread3  0x58463
2.22 s    6.2%  0 s         _dispatch_worker_thread3  0x58464
2.21 s    6.2%  0 s         _dispatch_worker_thread3  0x58466
2.21 s    6.2%  0 s         _dispatch_worker_thread3  0x58465
2.21 s    6.2%  0 s         _dispatch_worker_thread3  0x58461
2.18 s    6.1%  0 s         _dispatch_worker_thread3  0x58462

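The time saved is not as much as I had hoped. Apparently there is some contention when accessing heap memory. For an even faster solution, you should explore a C-based approach.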
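Since Swift 3, `dispatch_apply` is spelled `DispatchQueue.concurrentPerform(iterations:execute:)`, which takes no queue argument. The sketch below shows the parallel loop in that form, under stated assumptions: `byteSum` is a hypothetical stand-in for the per-case work, and `parallelByteSums` is likewise an illustrative name. Results are written into a preallocated buffer so that each iteration touches only its own slot.

```swift
import Dispatch

// Hypothetical stand-in for the per-case work: sums the four bytes of each word.
func byteSum(_ words: [UInt32]) -> UInt64 {
    var total: UInt64 = 0
    for word in words {
        for shift in stride(from: 24, through: 0, by: -8) {
            total += UInt64(UInt8(truncatingIfNeeded: word >> shift))
        }
    }
    return total
}

// Runs `byteSum` over all cases in parallel and returns the results in input order.
func parallelByteSums(_ cases: [[UInt32]]) -> [UInt64] {
    // Allocate the result storage once, before any parallel work starts.
    var results = [UInt64](repeating: 0, count: cases.count)
    results.withUnsafeMutableBufferPointer { buffer in
        let slots = buffer
        // concurrentPerform blocks until every iteration has finished.
        DispatchQueue.concurrentPerform(iterations: cases.count) { i in
            // Each iteration writes to a distinct index, so no lock is needed.
            slots[i] = byteSum(cases[i])
        }
    }
    return results
}

let sums = parallelByteSums([[0x01020304], [0xFFFFFFFF, 0x00000001], []])
print(sums)
```

This only removes contention on the result storage. Any allocation done inside the per-case work, such as building a fresh [[UInt8]] for every case, still goes through the shared heap allocator.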

Score: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/40308690
