I am updating a histogram that is stored as a plain integer array with 16 bins, as shown below.
const int binSize = 4096;
int histogram[16];
unsigned short inData[1024]; // This is my input data. Short is 16 bits
for(int i = 0; i < 1024; ++i)
{
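    ++histogram[inData[i] / binSize];
}
I run this operation very often, so it has become a bottleneck: the DSP cannot parallelize the loop, because multiple bins cannot be updated at the same time. How can I optimize this?
I am running this code on a TI C6000-series digital signal processor.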
Posted on 2017-04-28 00:48:31
Here is an example to illustrate what was meant in the comments:
#include <array>
#include <algorithm>
#include <cstddef>
#include <numeric> // std::reduce (C++17)
#include <boost/range/adaptor/transformed.hpp>
using Histogram = std::array<int, 16>;
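// Build a histogram that contains exactly one sample.
Histogram from_short(unsigned short num) // unsigned, so values >= 32768 cannot yield a negative index
{
    Histogram result{}; // zero-initialize all 16 bins
    result[num / 4096] = 1;
    return result;
}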
// Element-wise sum of two histograms; associative and commutative, as std::reduce requires.
Histogram add(const Histogram & lhs, const Histogram & rhs)
{
    Histogram result;
    for (size_t i = 0; i < 16; ++i) { result[i] = lhs[i] + rhs[i]; }
    return result;
}
auto singles = inData | boost::adaptors::transformed(from_short);
Histogram histogram = std::reduce(std::begin(singles), std::end(singles), Histogram{}, add);
Another option:
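std::sort(std::begin(inData), std::end(inData)); // note: this sorts the input in place
unsigned short * previous = std::begin(inData);
for (size_t i = 0; i < 15; ++i)
{
    // first element that belongs to bin i + 1
    unsigned short * current = std::lower_bound(previous, std::end(inData), static_cast<unsigned short>(4096 * (i + 1)));
    histogram[i] = std::distance(previous, current);
    previous = current;
}
histogram[15] = std::distance(previous, std::end(inData)); // last bin; valid indices are 0..15
https://stackoverflow.com/questions/43663096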
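The reduction idea from the first snippet can also be written with plain loops, which may be easier to hand to a DSP compiler than Boost ranges. The following is only a minimal sketch under some assumptions: `kLanes`, `histogram_baseline` and `histogram_partial` are hypothetical names that do not appear in the question, and the choice of 4 partial histograms is arbitrary. Each lane increments its own private histogram, so the increments inside one iteration never touch the same memory, and the partial results are summed at the end. Since 4096 is 2^12, the division is written as a right shift by 12. Whether the C6000 compiler actually pipelines this form is not something shown here and would have to be measured.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBins  = 16;
constexpr unsigned    kShift = 12; // 4096 == 1 << 12, so x / 4096 == x >> 12 for unsigned x
constexpr std::size_t kLanes = 4;  // hypothetical number of partial histograms

using Histogram = std::array<int, kBins>;

// Baseline: the loop from the question, kept for comparison.
Histogram histogram_baseline(const std::uint16_t * data, std::size_t n)
{
    Histogram h{};
    for (std::size_t i = 0; i < n; ++i) { ++h[data[i] / 4096]; }
    return h;
}

// Partial histograms: lane k only ever writes to partial[k],
// so the kLanes increments of one iteration are independent of each other.
Histogram histogram_partial(const std::uint16_t * data, std::size_t n)
{
    std::array<Histogram, kLanes> partial{}; // all counters start at zero

    std::size_t i = 0;
    for (; i + kLanes <= n; i += kLanes)
    {
        for (std::size_t k = 0; k < kLanes; ++k)
        {
            ++partial[k][data[i + k] >> kShift];
        }
    }
    // Leftover samples when n is not a multiple of kLanes.
    for (; i < n; ++i) { ++partial[0][data[i] >> kShift]; }

    // Final reduction: element-wise sum of the partial histograms.
    Histogram total{};
    for (const Histogram & p : partial)
    {
        for (std::size_t b = 0; b < kBins; ++b) { total[b] += p[b]; }
    }
    return total;
}
```

Calling `histogram_partial(inData, 1024)` should give the same counts as the original loop. The extra cost is `kLanes` small arrays and the final summation over 16 bins per lane, which is small next to the 1024 input samples.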