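I have an Express server with a POST endpoint that starts a crawler. When the crawler finishes, it shuts down the entire server. Am I doing something wrong? How can I prevent this from happening?
The project looks like this: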
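// server.js
const express = require('express')
const bodyParser = require('body-parser')
const startSearch = require('./crawler.js')

// PORT is not defined in the original snippet; reading it from the environment is an assumption
const PORT = process.env.PORT || 3000

const app = express()
app.use(bodyParser.json())

app.post('/crawl', async (req, res) => {
    const { foo, bar } = req.body
    // Start the crawl without waiting for it, then respond right away
    startSearch({ foo, bar })
    res.end()
})

app.listen(PORT, () => console.log(`listening on port ${PORT}`))

// crawler.js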
const Apify = require('apify')

const startSearch = ({ foo, bar }) => {
    Apify.main(async () => {
        const sources = [{
            url: 'https://path_to_website.com',
            userData: { foo, bar }
        }]
        const requestList = await Apify.openRequestList(null, sources)
        const crawler = new Apify.PuppeteerCrawler({
            requestList,
            handlePageFunction: async ({ request, page }) => {
                // do things using puppeteer
            }
        })
        await crawler.run()
    })
}

// Export added so that require('./crawler.js') in server.js receives the function
module.exports = startSearch

Posted on 2020-01-14 03:16:12
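Just avoid using Apify.main(). It terminates the Node.js process once the function passed to it finishes, which takes the Express server down with it. For details, see "How to use Apify on Google Cloud Functions".
(I thought I was posting an answer, but it seems it was only a comment.)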
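As a rough illustration of that advice, here is a minimal sketch of crawler.js with the Apify.main() wrapper removed, so that startSearch simply returns a promise that settles when the crawl is done. It assumes the same SDK calls the question already uses (Apify.openRequestList and Apify.PuppeteerCrawler). The second parameter of startSearch is a hypothetical addition that is not in the original code; it only exists so the function can be exercised with a stub instead of the real SDK.

```javascript
// crawler.js (sketch): same crawl as in the question, but without Apify.main()

// The second parameter is a hypothetical addition, not part of the question's code:
// it defaults to the real SDK and lets a test pass in a stub instead.
const startSearch = async ({ foo, bar }, Apify = require('apify')) => {
    const sources = [{
        url: 'https://path_to_website.com',
        userData: { foo, bar }
    }]
    const requestList = await Apify.openRequestList(null, sources)
    const crawler = new Apify.PuppeteerCrawler({
        requestList,
        handlePageFunction: async ({ request, page }) => {
            // do things using puppeteer
        }
    })
    // Resolves when the crawl finishes; nothing in this function ends the process
    await crawler.run()
}

module.exports = startSearch
```

With this version, the route handler in server.js can keep calling startSearch({ foo, bar }) without awaiting it. Because the function now returns a promise, it is probably worth attaching a handler such as .catch(err => console.error(err)) so that a failed crawl is logged rather than surfacing as an unhandled rejection.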
https://stackoverflow.com/questions/59706520