文章/答案/技术大牛

发布

社区首页 >问答首页 >极慢地找到超过阈值的值

问极慢地找到超过阈值的值
EN

Stack Overflow用户

提问于 2014-09-22 20:08:46

回答 1查看 193关注 0票数 0

我有代码需要找出激光发射的时间点。该激光器在数据集上用直流500以上的直流表示，每次以3台激光器组成，速度相当快，但不是完全确定的时间间隔。

我现在使用的代码：

//std::vector<short>& laserData; this one comes from the function call, this is the definition
int count = 0;

for(unsigned int index = 0; index < laserData.size(); ++index) {
    if(laserData.at(index) > 500) {
        times.push_back(index);
        count = (count + 1)%3;
        if(count == 0) {
            int dif1 = times.at(times.size()-1) - times.at(times.size()-2);
            int dif2 = times.at(times.size()-1) - times.at(times.size()-3);
            if(dif1 > 60000 || dif2 > 60000) {
                times.erase(times.begin() + times.size() - 2);
                times.erase(times.begin() + times.size() - 2);
                count = 1;
            }
        }

        switch(count) {
            case 0: index += 90000;
            default: index += 2000;
        }
    }
}

我不能完全确定所有的3种激光脉冲都会发生，如果它们不发生，那么所有的1或2激光脉冲都需要被移除。

数据集长为130,000,000个样本，我总共得到了3300个激光脉冲，所以工作得很好，只是有点慢。解析这个向量需要45秒，我想知道是否有更快的方法。

vector

c++

回答 1

Stack Overflow用户

回答已采纳

发布于 2014-09-22 22:08:11

首先:除非您打算让switch语句失败，否则添加一个中断：

switch(count)
{
    case 0: 
        index += 90000;
        break;
    default: 
        index += 2000;
}

好的。现在，我们有一个潜在的错误，我们可以考虑加快您的代码。

要做的第一件事是尽可能地消除向量的大小。你说总共有3300个激光脉冲。让我们在此基础上增加10%的误差，并预先调整脉冲矢量的大小：

times.reserve(3600);

现在，不需要多次调整矢量的大小。如果有更多的，我们应该只有向量真实一次。

接下来，我们希望摆脱times.erase()函数调用。

要做到这一点，我们分别缓存三个最近的值，并且只有在它们被验证后才将它们推入向量：

const int pulse_interval =  2000;
const int burst_interval = 90000;
int cache[3];
times.reserve(3600);
for(unsigned int index = 0; index < laserData.size(); ++index) 
{
    if(laserData[index] > 500) 
    {
        //this next if block acts as our guard clause
        if (count > 0)
        {
            diff = index - cache[count-1];
            if (diff > 60000)
            {
                count = 1;
                cache[0]=index;
                index+= pulse_interval;
                continue;
                // if gap between pulses is too big reset and start again, setting most recent value to first of next sequence.
            }
        }
        //now we actually store data in the cache and if needed add to the vector. No invalid data (not part of a three burst set) should reach here due to guard
        cache[count]=index;
        if (count == 2)
        {
            for (int i=0; i<3; i++)
                {times.push_back(cache[i]);}
            count = 0;
            index += burst_interval;
        }
        else
        {
            count++;
            index += pulse_interval;
        }
        //updating count at the end is easier to follow
        //goes to 0 after 3rd pulse detected
    }
}

这将删除对无效数据的向量访问，并应尽可能加快代码的速度。

编辑:添加索引跳过参数。如果我的逻辑不对，请告诉我。在这种情况下，不需要切换，因为从算法中可以将逻辑封装到现有逻辑中。

如果您不能打开优化，那么您可以尝试展开push_back循环。缓存数组可以减少为两个单元格，并且可以将index的存储移到else (对于第三个值仅为push_back(index); )。

这将消除循环开销，并在每次发现完整突发时分配一个任务。您的编译器将正常处理这一问题。

如果还慢的话。那你就需要侧写。确保您的index跳过的大小是正确的(太小意味着搜索太多，但是太大，可能会丢失有效的数据)

你也可以像一个评论者建议的那样并行地这样做。您可以这样做，将搜索空间分成多个部分，并为每个部分生成一个线程。

票数 1

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/25982365

复制

相似问题

问极慢地找到超过阈值的值
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问极慢地找到超过阈值的值EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问极慢地找到超过阈值的值
EN