
Bull.js job stalls despite setting a timeout
Stack Overflow user
Asked on 2021-10-20 18:20:04
Answers: 1 · Views: 763 · Followers: 0 · Votes: 2

I have a Bull queue running lengthy video upload jobs, which can take anywhere from under a minute to many minutes.

The jobs stall after the default 30 seconds, so I increased the timeout to several minutes, but it is not respected. If I set the timeout to 10 ms, the job fails immediately, so the timeout option is being taken into account.

```javascript
Job {
  opts: {
    attempts: 1,
    timeout: 600000,
    delay: 0,
    timestamp: 1634753060062,
    backoff: undefined
  },
  ...
}
```

Despite the timeout, I am receiving a stalled event and the job starts processing again.

Edit: I assumed "stalled" and timed-out meant the same thing, but apparently Bull has a separate interval at which it checks for stalled jobs. In other words, the real question is why the jobs are considered "stalled" even though they are busy doing the upload.
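The distinction in the edit above can be sketched as a back-of-the-envelope model (plain JavaScript, no Bull required). Bull renews a job's lock on a timer, and a synchronous upload/encode step delays that timer; the function and its name here are illustrative, not part of Bull's API, though the default values match Bull's documented defaults:

```javascript
// Minimal model of why a busy worker looks "stalled": the lock is renewed
// every lockRenewTime ms, but renewal is an event-loop task, so synchronous
// work delays it. If the delay pushes the gap past lockDuration, the
// stalled check flags the job. (Illustrative sketch, not Bull's API.)
function looksStalled(blockingMs, lockDuration = 30000, lockRenewTime = 15000) {
  // The last successful renewal happened at most lockRenewTime ms before the
  // blocking work began; the next one can only run after blockingMs.
  const worstCaseGap = lockRenewTime + blockingMs;
  return worstCaseGap > lockDuration;
}

console.log(looksStalled(5000));  // false: lock renewed in time
console.log(looksStalled(60000)); // true: lock expired mid-upload
```

Under this model, any job whose synchronous stretches exceed roughly lockDuration minus lockRenewTime will be flagged, regardless of the job's own `timeout` option.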


1 Answer

Stack Overflow user

Answered on 2022-11-09 13:05:26

The problem seems to be that the operation you are running blocks the event loop, which causes your job to stall. You can convert your code to non-blocking code and solve the issue that way.

That said, as more of a quick fix, you can set the stalled-check interval in the queue settings when creating the queue:

```javascript
const queue = new Bull('queue', {
  redis: {
    port: 6379,
    host: 'localhost',
    db: 0,
  },
  settings: {
    stalledInterval: 60 * 60 * 1000, // change default from 30 sec to 1 hour; set 0 to disable the stalled check
  },
})
```

According to Bull's docs:

  • timeout: the number of milliseconds after which the job should fail with a timeout error
  • stalledInterval: how often to check for stalled jobs (use 0 for never checking)

Increasing stalledInterval (or disabling it by setting it to 0) removes the check that verifies the event loop is running, effectively forcing the system to ignore the stalled state.

Also from the docs:

```
When a worker is processing a job it will keep the job "locked" so other workers can't process it.

It's important to understand how locking works to prevent your jobs from losing their lock - becoming _stalled_ -
and being restarted as a result. Locking is implemented internally by creating a lock for `lockDuration` on interval
`lockRenewTime` (which is usually half `lockDuration`). If `lockDuration` elapses before the lock can be renewed,
the job will be considered stalled and is automatically restarted; it will be __double processed__. This can happen when:
1. The Node process running your job processor unexpectedly terminates.
2. Your job processor was too CPU-intensive and stalled the Node event loop, and as a result, Bull couldn't renew the job lock (see [#488](https://github.com/OptimalBits/bull/issues/488) for how we might better detect this). You can fix this by breaking your job processor into smaller parts so that no single part can block the Node event loop. Alternatively, you can pass a larger value for the `lockDuration` setting (with the tradeoff being that it will take longer to recognize a real stalled job).

As such, you should always listen for the `stalled` event and log this to your error monitoring system, as this means your jobs are likely getting double-processed.

As a safeguard so problematic jobs won't get restarted indefinitely (e.g. if the job processor always crashes its Node process), jobs will be recovered from a stalled state a maximum of `maxStalledCount` times (default: `1`).
```
Votes: 0
Original page content provided by Stack Overflow; translation supported by Tencent Cloud's translation engine.
Original link: https://stackoverflow.com/questions/69651175