
Scraping nested span tags in a loop with Puppeteer

Stack Overflow user
Asked 2022-03-30 01:10:38
Answers: 1 · Views: 183 · Followers: 0 · Votes: 1

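I have some nested HTML markup, and I am trying to scrape the text and the link out of it. For some strange reason it is not working.

I marked the lines I want to scrape with an emoji. One is the link, the other is the text.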

Code language: html
<div class="q-box" role="list" style="box-sizing: border-box;">
    <div>
      <a class="q-box qu-display--block qu-cursor--pointer qu-hover--textDecoration--none Link___StyledBox-t2xg9c-0 KlcoI" target="_blank" href="https://www.quora.com/Can-Facebook-see-who-viewed-your-profile" style="box-sizing: border-box; border-radius: inherit;">
          <div class="q-box qu-hover--textDecoration--underline qu-tapHighlight--none qu-display--flex qu-alignItems--center" style="box-sizing: border-box; position: relative;">
             <div class="q-flex qu-alignItems--center qu-py--tiny qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box; display: flex;">
                <div class="q-box qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box;">
                   <div class="q-text qu-color--gray_dark" style="box-sizing: border-box;">
                      <div class="q-box qu-py--tiny" style="box-sizing: border-box;">
                         <span class="q-text qu-color--blue_dark" style="box-sizing: border-box;">
                            <div class="q-flex qu-flexDirection--row" style="box-sizing: border-box; display: flex;">
                               <div class="q-inline qu-flexWrap--wrap" style="box-sizing: border-box; display: inline; max-width: 100%;">
                                  <div class="q-text qu-truncateLines--2 puppeteer_test_question_title" style="box-sizing: border-box;">
                                    <span class="q-box qu-userSelect--text" style="box-sizing: border-box;">
                                    <span style="background: none;">Can Facebook see who viewed your profile?</span></span></div>
                               </div>
                            </div>
                         </span>
                      </div>
                   </div>
                </div>
             </div>
          </div>
       </a>
    </div>
    <div>
     <a class="q-box qu-display--block qu-cursor--pointer qu-hover--textDecoration--none Link___StyledBox-t2xg9c-0 KlcoI" target="_blank" href="https://onlinesocialmediasolution.quora.com/How-to-view-a-private-Facebook-profile" style="box-sizing: border-box; border-radius: inherit;">
          <div class="q-box qu-hover--textDecoration--underline qu-tapHighlight--none qu-display--flex qu-alignItems--center" style="box-sizing: border-box; position: relative;">
             <div class="q-flex qu-alignItems--center qu-py--tiny qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box; display: flex;">
                <div class="q-box qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box;">
                   <div class="q-text qu-color--gray_dark" style="box-sizing: border-box;">
                      <div class="q-box qu-py--tiny" style="box-sizing: border-box;">
                         <span class="q-text qu-color--blue_dark" style="box-sizing: border-box;">
                            <div class="q-flex qu-flexDirection--row" style="box-sizing: border-box; display: flex;">
                               <div class="q-inline qu-flexWrap--wrap" style="box-sizing: border-box; display: inline; max-width: 100%;">
                                  <div class="q-text qu-truncateLines--2 puppeteer_test_question_title" style="box-sizing: border-box;">
                                    <span class="q-box qu-userSelect--text" style="box-sizing: border-box;">
                                  <span style="background: none;">How do you view a private Facebook profile?</span></span></div>
                               </div>
                            </div>
                         </span>
                      </div>
                   </div>
                </div>
             </div>
          </div>
       </a>
    </div>
    <div>
     <a class="q-box qu-display--block qu-cursor--pointer qu-hover--textDecoration--none Link___StyledBox-t2xg9c-0 KlcoI" target="_blank" href="https://www.quora.com/How-can-you-tell-if-non-friends-have-viewed-your-Facebook-profile" style="box-sizing: border-box; border-radius: inherit;">
          <div class="q-box qu-hover--textDecoration--underline qu-tapHighlight--none qu-display--flex qu-alignItems--center" style="box-sizing: border-box; position: relative;">
             <div class="q-flex qu-alignItems--center qu-py--tiny qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box; display: flex;">
                <div class="q-box qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box;">
                   <div class="q-text qu-color--gray_dark" style="box-sizing: border-box;">
                      <div class="q-box qu-py--tiny" style="box-sizing: border-box;">
                         <span class="q-text qu-color--blue_dark" style="box-sizing: border-box;">
                            <div class="q-flex qu-flexDirection--row" style="box-sizing: border-box; display: flex;">
                               <div class="q-inline qu-flexWrap--wrap" style="box-sizing: border-box; display: inline; max-width: 100%;">
                                  <div class="q-text qu-truncateLines--2 puppeteer_test_question_title" style="box-sizing: border-box;">
                                    <span class="q-box qu-userSelect--text" style="box-sizing: border-box;">
                                        <span style="background: none;">How can you tell if non-friends have viewed your Facebook profile?</span></span></div>
                               </div>
                            </div>
                         </span>
                      </div>
                   </div>
                </div>
             </div>
          </div>
       </a>
    </div>
    <div>
     <a class="q-box qu-display--block qu-cursor--pointer qu-hover--textDecoration--none Link___StyledBox-t2xg9c-0 KlcoI" target="_blank" href="https://www.quora.com/Is-there-a-way-to-see-your-own-Facebook-profile-from-the-view-of-a-non-friend" style="box-sizing: border-box; border-radius: inherit;">
          <div class="q-box qu-hover--textDecoration--underline qu-tapHighlight--none qu-display--flex qu-alignItems--center" style="box-sizing: border-box; position: relative;">
             <div class="q-flex qu-alignItems--center qu-py--tiny qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box; display: flex;">
                <div class="q-box qu-flex--auto qu-overflow--hidden" style="box-sizing: border-box;">
                   <div class="q-text qu-color--gray_dark" style="box-sizing: border-box;">
                      <div class="q-box qu-py--tiny" style="box-sizing: border-box;">
                         <span class="q-text qu-color--blue_dark" style="box-sizing: border-box;">
                            <div class="q-flex qu-flexDirection--row" style="box-sizing: border-box; display: flex;">
                               <div class="q-inline qu-flexWrap--wrap" style="box-sizing: border-box; display: inline; max-width: 100%;">
                                  <div class="q-text qu-truncateLines--2 puppeteer_test_question_title" style="box-sizing: border-box;">
                                    <span class="q-box qu-userSelect--text" style="box-sizing: border-box;">
                                   <span style="background: none;">Is there a way to see your own Facebook profile from the view of a non-friend?</span></span></div>
                               </div>
                            </div>
                         </span>
                      </div>
                   </div>
                </div>
             </div>
          </div>
       </a>
    </div>
 </div>

Here is the code in my index.js file so far. It is supposed to loop over all of the emoji-marked lines, but it does not work either.

Code language: javascript
const browser = await puppeteer.launch({
  headless: false,
});
const page = await browser.newPage();
await page.goto(url, { waitUntil: 'networkidle2' });
try {
  // loop through the selector and get the data
  await page.waitForSelector(
    '#root > div.q-box > div > div > div:nth-child(4) > div > div > div:nth-child(2) > div > div'
  );
  const related = page.$eval(
    '#root > div.q-box > div > div > div:nth-child(4) > div > div > div:nth-child(2) > div > div > div.q-box.qu-mb--large > div > div:nth-child(2)',
    (el) => el.innerText
  );
  res.send(related);
} catch (err) {
  // res.send(err, 500);
  console.log(err);
}
await browser.close();

1 Answer

Stack Overflow user

Accepted answer

Answered 2022-03-31 21:14:48

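Based on the Quora page you provided in the comments, I looked up the CSS class of the container box, which is .q-sticky. It makes the inner elements (the links and the link texts) easier to find.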

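Using the child combinator (>) and the universal selector (*), you can write patterns that match the desired elements:

  • All links inside the box: '.q-sticky * > a'
  • All link texts inside the box: '.q-sticky * > .q-box.qu-userSelect--text'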

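Note: your initial code has an async issue at const related = page.$eval(...: you should await page.$eval there to avoid errors (Puppeteer methods mostly return Promises, which you can handle by awaiting them).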
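To illustrate the note above, here is a minimal sketch that runs without a browser. fakeEval is a hypothetical stand-in for page.$eval (it is not part of Puppeteer); the only property it shares with the real method is that it returns a Promise.

```javascript
// Hypothetical stand-in for page.$eval: like the real method, it returns a Promise.
async function fakeEval() {
  return 'Can Facebook see who viewed your profile?';
}

// Without await, the variable holds a Promise, not the scraped text.
const withoutAwait = fakeEval();
console.log(withoutAwait instanceof Promise); // true

// With await (inside an async function), the variable holds the resolved text.
async function main() {
  const withAwait = await fakeEval();
  console.log(typeof withAwait); // "string"
  return withAwait;
}

main();
```

In the question's code, this means the un-awaited related passed to res.send is a Promise rather than the element's innerText.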

Instead of page.$eval, you can use its page.$$eval variant (the "querySelectorAll" version), which returns an array of all elements matching the same selector.

Finally, you can combine the two arrays as needed (I used an Array.map one-liner below).

Code language: javascript
// Wait until the links inside the container have rendered
await page.waitForSelector('.q-sticky * > a');
// Collect every link URL and every link text, in document order
const relatedLinks = await page.$$eval('.q-sticky * > a', elems => elems.map((el) => el.href));
const relatedTitles = await page.$$eval('.q-sticky * > .q-box.qu-userSelect--text', elems => elems.map((el) => el.innerText));

// Pair each link with the title at the same index
const related = relatedLinks.map((linkel, i) => { return { link: linkel, title: relatedTitles[i] }});
console.log(related);
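Pairing two arrays by index relies on both selectors matching the same number of elements in the same order. As a possible variation (not part of the original answer), the sketch below reads the link and its title from each <a> in a single page.$$eval call. The names extractRelated and scrapeRelated are hypothetical, and the code assumes puppeteer is installed and that the page still uses the .q-sticky and .q-box.qu-userSelect--text classes quoted above.

```javascript
// Page function for page.$$eval: receives the array of matched <a> elements
// and returns plain, serializable objects. It must not reference outer
// variables, because Puppeteer serializes it and runs it inside the page.
function extractRelated(anchors) {
  return anchors.map((a) => {
    // Title span selector taken from the answer above
    const span = a.querySelector('.q-box.qu-userSelect--text');
    return {
      link: a.href,
      title: span ? span.innerText.trim() : null, // null if no title span is found
    };
  });
}

// Sketch of the full flow; `url` is whatever page is being scraped.
async function scrapeRelated(url) {
  const puppeteer = require('puppeteer'); // assumption: puppeteer is installed
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    await page.waitForSelector('.q-sticky * > a');
    return await page.$$eval('.q-sticky * > a', extractRelated);
  } finally {
    // Close the browser even if navigation or scraping throws
    await browser.close();
  }
}
```

It could be called as scrapeRelated(url).then(console.log). Because each title is looked up inside its own link, a link without a title yields title: null instead of shifting every later pair.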
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's dedicated IT-domain engine.
Original link:

https://stackoverflow.com/questions/71670425
