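I have Python code that scrapes course details such as course_title, rating, and number of students, but I also want the course links. Can anyone help me get the URL of each course from Coursera?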
Posted on 2020-08-12 15:23:01
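I took a look at coursera.org and found a way to scrape the course URLs.
Here is what you want to do:
Grab all the a elements whose data-click-key attribute equals search.search.click.search_card.
From that list of elements, collect each href. The code is as follows: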
# Assume you searched for Python courses and `soup` is the BeautifulSoup object for the results page
base = "https://www.coursera.org"
titles = soup.find_all("h2", class_="card-title")
urls = soup.find_all("a", attrs={"data-click-key": "search.search.click.search_card"})
# In case you need a plain list of the (relative) hrefs
url_list = [i['href'] for i in urls]
# Print each title next to its absolute URL
for title, url in zip(titles, urls):
    print(title.text + ": " + base + url['href'])
Output:
Python for Everybody: https://www.coursera.org/specializations/python
Python 3 Programming: https://www.coursera.org/specializations/python-3-programming
IBM Data Science: https://www.coursera.org/professional-certificates/ibm-data-science
Google IT Automation with Python: https://www.coursera.org/professional-certificates/google-it-automation
Applied Data Science with Python: https://www.coursera.org/specializations/data-science-python
Programming for Everybody (Getting Started with Python): https://www.coursera.org/learn/python
Crash Course on Python: https://www.coursera.org/learn/python-crash-course
Python for Data Science and AI: https://www.coursera.org/learn/python-for-applied-data-science-ai
Introducción a la programación en Python I: Aprendiendo a programar con Python: https://www.coursera.org/learn/aprendiendo-programar-python
Python Basics: https://www.coursera.org/learn/python-basics
https://stackoverflow.com/questions/63370865
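If you want the two steps (filter the a elements by data-click-key, then collect each href) in a form you can run without fetching the live page, here is a minimal sketch. It is not the code from the answer: it swaps BeautifulSoup for the standard library's html.parser so it has no dependencies, and it builds absolute URLs with urljoin instead of string concatenation. SAMPLE_HTML, CardLinkParser, and extract_course_urls are hypothetical names made up for illustration, and the sample markup is only an assumption about what the search cards look like.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE = "https://www.coursera.org"
CARD_KEY = "search.search.click.search_card"

# Hypothetical markup, made up for illustration; the real page may differ.
SAMPLE_HTML = """
<ul>
  <li><a data-click-key="search.search.click.search_card" href="/learn/python">
    <h2 class="card-title">Programming for Everybody</h2></a></li>
  <li><a data-click-key="search.search.click.search_card" href="/specializations/python">
    <h2 class="card-title">Python for Everybody</h2></a></li>
  <li><a data-click-key="other.key" href="/about">About</a></li>
  <li><a data-click-key="search.search.click.search_card">
    <h2 class="card-title">Card without a link</h2></a></li>
</ul>
"""


class CardLinkParser(HTMLParser):
    """Collects the href of every <a> tag whose data-click-key equals CARD_KEY."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Skip anchors with a different key, and cards that have no href at all.
        if attr_map.get("data-click-key") == CARD_KEY and attr_map.get("href"):
            self.hrefs.append(attr_map["href"])


def extract_course_urls(html, base=BASE):
    """Return absolute URLs for all search-card links found in html."""
    parser = CardLinkParser()
    parser.feed(html)
    # urljoin resolves relative hrefs against base and leaves absolute ones unchanged.
    return [urljoin(base, href) for href in parser.hrefs]


course_urls = extract_course_urls(SAMPLE_HTML)
for url in course_urls:
    print(url)
```

Using urljoin rather than base + href means an href that is already absolute would not get the base prepended twice, and skipping anchors without an href avoids the KeyError that i['href'] would raise on such a card.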