I have this part of a web page, and I want to scrape the href or the innerText:
<span class="hash-tag text-truncate"><a href="/url/blabla" target="_parent"><<test that i want to scrape>></a></span>
This is my code:
const nodeChildren = await page.$$('.hash-tag', (uiElement) => {
uiElement.map((option) => option.innerText)
});
console.log(nodeChildren);
The result is:
_page: Page {
eventsMap: Map(0) {},
emitter: [Object],
_closed: false,
_timeoutSettings: [TimeoutSettings],
_pageBindings: Map(0) {},
_javascriptEnabled: true,
_workers: Map(0) {},
_fileChooserInterceptors: Set(0) {},
_userDragInterceptionEnabled: false,
_handlerMap: [WeakMap],
_client: [CDPSession],
How can I do this?
Posted 2021-11-27 14:12:50
Try:
const textAndHrefs = await page.$$eval(".hash-tag a", els =>
els.map(el => ({text: el.innerText, href: el.href})));
Posted 2021-11-26 13:39:09
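Try using textContent instead of innerText, since innerText is quite buggy in Puppeteer.
// page.$$ only returns element handles and ignores a callback;
// page.$$eval runs the callback inside the page and returns its result.
const nodeChildren = await page.$$eval('.hash-tag', (uiElements) =>
uiElements.map((option) => option.textContent)
);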
console.log(nodeChildren);
https://stackoverflow.com/questions/70120988
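The approach suggested in the answers can be sketched as a small helper. This is only an illustration, not the answerers' exact code: `extractLinks` and `scrapeHashTags` are hypothetical names, and `page` is assumed to be an already-opened Puppeteer Page. The callback passed to `page.$$eval` runs inside the browser, so it receives real DOM elements and must return plain, serializable data.

```javascript
// Page function: Puppeteer serializes it and runs it inside the browser,
// so it must not reference variables from the Node.js scope.
// (Hypothetical helper name, used here for illustration only.)
const extractLinks = (els) =>
  els.map((el) => ({
    text: el.textContent.trim(),    // text of the <a>, whitespace trimmed
    href: el.getAttribute('href'),  // raw attribute value, e.g. "/url/blabla"
    absoluteHref: el.href,          // URL resolved against the page address
  }));

// `page` is assumed to be an existing Puppeteer Page object.
async function scrapeHashTags(page) {
  // $$eval selects every match, calls extractLinks in the page,
  // and returns the resulting array of plain objects to Node.js.
  return page.$$eval('.hash-tag a', extractLinks);
}
```

Inside an async function this would be used as `const links = await scrapeHashTags(page);`. Returning both `getAttribute('href')` and `el.href` is a design choice: the first gives the path exactly as written in the markup, the second gives the absolute URL.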