
Web scraping in Python with BS4: getting a dynamically generated list

Stack Overflow user
Asked 2019-12-08 02:48:01
Answers: 1 · Views: 97 · Followers: 0 · Votes: 1

I need to scrape the list of "best coding bootcamps" from this page: https://www.switchup.org/rankings/best-coding-bootcamps

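My assignment says this should be doable with Beautiful Soup (rather than Selenium). However, when I try it, the HTML I get back does not contain the list of bootcamps, only what appears to be an empty element for that class.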

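My question: do you think this content can be retrieved with Beautiful Soup alone, without resorting to Selenium? If Selenium is required, what would simple code for doing this look like?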

My code so far is very simple:

from bs4 import BeautifulSoup
import requests
import time

url = "https://www.switchup.org/rankings/best-coding-bootcamps"

r = requests.get(url)

soup = BeautifulSoup(r.content, 'lxml')
time.sleep(5)

print(soup)

Thank you very much in advance.


1 Answer

Stack Overflow user

Accepted answer

Posted 2019-12-08 03:00:32

You are right: the HTML returned for the URL you posted does not contain the list. The data is loaded via AJAX from a different URL.
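To illustrate why a plain HTTP request sees nothing, here is a minimal sketch that uses only the standard library's `html.parser` instead of Beautiful Soup. The `STATIC_HTML` string and the `bootcamp-list` id are hypothetical stand-ins, not the real switchup.org markup. The point is that a container which JavaScript fills in later has no text in the raw response, and waiting after the request does not change that.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for what a plain HTTP GET returns from a page that
# fills its list in with JavaScript. This is NOT the real switchup.org markup.
STATIC_HTML = """
<html><body>
  <div id="bootcamp-list"></div>
  <script src="/assets/app.js"></script>
</body></html>
"""


class VisibleTextCollector(HTMLParser):
    """Collect non-empty text nodes that are not inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the list of visible text fragments found in the given HTML."""
    collector = VisibleTextCollector()
    collector.feed(html)
    collector.close()
    return collector.chunks


# The container exists, but there is no list content for a parser to find.
print(visible_text(STATIC_HTML))
```

If a check like this comes back empty for the element you expect, the content is most likely fetched separately by the browser, which is what the Network tab reveals.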

If you look at the Network tab in Firefox/Chrome, you can find this URL (the data is in JSON format):

import requests
from bs4 import BeautifulSoup

url = 'https://www.switchup.org/chimera/v1/bootcamp-list?mainTemplate=bootcamp-list%2Frankings&path=%2Frankings%2Fbest-coding-bootcamps&isDataTarget=false&featuredSchools=0&logoTag=logo&logoSize=original&numSchools=0&perPage=0&rankType=BootcampRankings&rankYear=2020&recentReview=true&reviewLength=50&numLocations=5&numSubjects=5&numCourses=5&sortOn=name&withReviews=false'

data = requests.get(url).json()

for i, bootcamp in enumerate(data['content']['bootcamps'], 1):
    soup = BeautifulSoup(bootcamp['description'], 'html.parser')
    print('{}. {}'.format(i, bootcamp['name']))
    print(soup.get_text(strip=True))
    print('-' * 80)

Prints:

1. Le Wagon
Le Wagon is an intensive international coding bootcamp geared toward career changers and entrepreneurs who want to gain coding skills. Participants complete 450 hours of coding in 9 weeks full-time or 24 weeks part-time, which includes building their own web app. After completing the program, students join an international alumni network of 6,000+ for career support and community.
--------------------------------------------------------------------------------
2. App Academy
App Academy teaches participants everything they need to know about software engineering in just 12 weeks. Their full-time bootcamps have helped over 2,000 graduates find jobs at more than 850 companies. Their deferred tuition plan means participants pay for the program only after they’ve landed their first web development job.
--------------------------------------------------------------------------------
3. Ironhack
Ironhack offers two full-time bootcamps focused on web design, a 26-week program in web development and a nine-week program in user experience and user interface design. Students can access extensive career development services post-graduation including portfolio building and interview practice; scholarships are available for underrepresented populations and veterans.
--------------------------------------------------------------------------------

...and so on.
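The query string in that URL is long, so it can help to split it into a dictionary before experimenting with it. The following is a minimal sketch using only `urllib.parse`; the helper names `split_query` and `build_url` are made up for this illustration, and what each parameter actually does on the server side is not documented in this answer, so treat any changes as trial and error.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The URL copied from the browser's Network tab, as used in the answer.
API_URL = (
    "https://www.switchup.org/chimera/v1/bootcamp-list"
    "?mainTemplate=bootcamp-list%2Frankings"
    "&path=%2Frankings%2Fbest-coding-bootcamps"
    "&isDataTarget=false&featuredSchools=0&logoTag=logo&logoSize=original"
    "&numSchools=0&perPage=0&rankType=BootcampRankings&rankYear=2020"
    "&recentReview=true&reviewLength=50&numLocations=5&numSubjects=5"
    "&numCourses=5&sortOn=name&withReviews=false"
)


def split_query(url):
    """Split a URL into its base (no query) and a dict of decoded parameters."""
    parts = urlsplit(url)
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return base, params


def build_url(base, params):
    """Re-encode the parameters and append them to the base URL."""
    return base + "?" + urlencode(params)


base, params = split_query(API_URL)
print(base)
for key, value in params.items():
    print("{} = {}".format(key, value))
```

With the parameters in a dictionary, the request can also be written as `requests.get(base, params=params)`, which lets `requests` handle the encoding.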
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/59229182
