
Coursera URL web scraping

Stack Overflow user
Asked on 2020-08-12 14:18:16
Answers: 1 · Views: 388 · Followers: 0 · Votes: 0

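I have Python code that scrapes course details such as course_title, rating, and number of students, but I also want the course links. Can anyone help me get the URL of each course from Coursera?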


1 Answer

Stack Overflow user

Posted on 2020-08-12 15:23:01

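I took a look at coursera.org and found a way to scrape the course URLs.

Here is what you want to do:

  1. Make a list of all the a elements that have the attribute data-click-key = search.search.click.search_card.
  2. Grab the href of each element in that list.

The code is as follows: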

# Assumes `soup` is a BeautifulSoup object built from the search results page
# (for example, after searching for Python courses)
base = "https://www.coursera.org"
titles = soup.find_all("h2", class_="card-title")
urls = soup.find_all("a", attrs={"data-click-key": "search.search.click.search_card"})
# In case you need a list of the (relative) URLs
url_list = [i['href'] for i in urls]

# Print each course title next to its absolute URL
for title, url in zip(titles, urls):
    print(title.text + ": " + base + url['href'])

Output:

Python for Everybody: https://www.coursera.org/specializations/python
Python 3 Programming: https://www.coursera.org/specializations/python-3-programming
IBM Data Science: https://www.coursera.org/professional-certificates/ibm-data-science
Google IT Automation with Python: https://www.coursera.org/professional-certificates/google-it-automation
Applied Data Science with Python: https://www.coursera.org/specializations/data-science-python
Programming for Everybody (Getting Started with Python): https://www.coursera.org/learn/python
Crash Course on Python: https://www.coursera.org/learn/python-crash-course
Python for Data Science and AI: https://www.coursera.org/learn/python-for-applied-data-science-ai
Introducción a la programación en Python I: Aprendiendo a programar con Python: https://www.coursera.org/learn/aprendiendo-programar-python
Python Basics: https://www.coursera.org/learn/python-basics
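
The snippet above takes `soup` as given. As a rough, self-contained sketch of the same two steps (select the `a` elements by their `data-click-key` attribute, then read each `href`), the example below runs against a small hand-written HTML fragment instead of a live page. `SAMPLE_HTML` and the helper name `extract_course_urls` are hypothetical and only for illustration; the real Coursera markup may differ and can change over time. It also swaps the string concatenation for `urllib.parse.urljoin`, which is one common way to build absolute URLs.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE = "https://www.coursera.org"
CARD_KEY = "search.search.click.search_card"

# Hand-written stand-in for a search results page (NOT real Coursera markup).
SAMPLE_HTML = """
<ul>
  <li><a data-click-key="search.search.click.search_card" href="/learn/python">
    <h2 class="card-title">Programming for Everybody</h2></a></li>
  <li><a data-click-key="search.search.click.search_card" href="/specializations/python">
    <h2 class="card-title">Python for Everybody</h2></a></li>
  <li><a href="/about">About</a></li>
</ul>
"""


def extract_course_urls(html, base=BASE):
    """Return absolute URLs for every search-card link found in html."""
    soup = BeautifulSoup(html, "html.parser")
    # Step 1: all <a> elements carrying the search-card attribute.
    cards = soup.find_all("a", attrs={"data-click-key": CARD_KEY})
    # Step 2: read each href (skipping anchors without one) and make it absolute.
    return [urljoin(base, a["href"]) for a in cards if a.has_attr("href")]


if __name__ == "__main__":
    for url in extract_course_urls(SAMPLE_HTML):
        print(url)
```

To use this on a real page, the HTML string would have to come from wherever the search results are already being downloaded in your own scraper; that part is left out here because it is not shown in the answer.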
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/63370865
