Q: Following links with Beautiful Soup?

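Stack Overflow user

Asked on 2021-03-22 02:58:08

Answers: 2 | Views: 69 | Followers: 0 | Votes: 0

I just started learning Python through the Coursera online course "Python for Everybody", and I have an assignment where I have to use Beautiful Soup to follow links. I have seen this question asked before, but when I tried that approach it simply did not work. I managed to put something together, but it does not actually follow the links; it stays on the same page. If possible, could anyone point me to better material for understanding this assignment? Thanks.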

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter URL - ')
cnt = input("Enter count -")
count = int(cnt)
pn = input("Enter position -")
position = int(pn)-1

while count > 0:
    html = urllib.request.urlopen(url, context=ctx).read()
    soup = BeautifulSoup(html, "html.parser")
    tags = soup('a')
    lst = list()
    for tag in tags:
        lst.append(tag.get('href', None))
    indxpos = lst[position]
    count = count - 1
    print("Retrieving:", indxpos)

python

2 Answers

Stack Overflow user

Accepted answer

Posted on 2021-03-22 03:05:27

You never set url to the new URL.

while count > 0:
    html = urllib.request.urlopen(url, context=ctx).read()  # Gets the page at url
    ...
    for tag in tags:
        lst.append(tag.get('href', None))  # Appends all the links to lst
    indxpos = lst[position]
    count = count - 1
    print("Retrieving:", indxpos)
    # indxpos is only printed; url never changes, so the same page is fetched again

You should probably replace indxpos with url.

while count > 0:
    html = urllib.request.urlopen(url, context=ctx).read()  # Gets the page at url
    ...
    for tag in tags:
        lst.append(tag.get('href', None))  # Appends all the links to lst
    url = lst[position]
    count = count - 1
    print("Retrieving:", url)

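That way, the next time the loop runs, it fetches the new URL.

Also: if the page has fewer than pn links (for example pn=12 and the page has 2 links), accessing lst[position] raises an IndexError, because lst has fewer than pn elements.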
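
The bounds problem can be guarded with a small helper. The following is only a sketch: pick_link is a hypothetical name, not part of the original code, and it assumes hrefs is the list built from tag.get('href', None). Resolving relative links with urljoin is an extra assumption that the pages may use relative hrefs.

```python
from urllib.parse import urljoin


def pick_link(hrefs, position, base_url):
    """Return the absolute URL at 1-based `position`, or None if unavailable.

    `pick_link` is a hypothetical helper; `hrefs` is assumed to be the list
    collected with tag.get('href', None), so it may contain None entries.
    """
    # Guard against pages that have fewer links than the requested position.
    if not 1 <= position <= len(hrefs):
        return None
    href = hrefs[position - 1]
    # An <a> tag without an href attribute yields None.
    if href is None:
        return None
    # Resolve relative hrefs such as "page2.html" against the current page.
    return urljoin(base_url, href)
```

Inside the loop, a None result would be a signal to stop rather than to assign it to url.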

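Votes: 1

Stack Overflow user

Posted on 2021-03-22 03:04:20

In any case, your code has no function that interacts with the list of hyperlinks. It only prints the contents of the "lst" list and does nothing else with them.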
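
One way to act on the collected links is to wrap the loop in a function that reassigns url on every pass. This is a minimal sketch, not the asker's or the course's solution: follow_links and fetch_html are hypothetical names, the fetch parameter is added only so the page source can be swapped out, and unlike the question's code it uses the default SSL settings instead of a context with verification disabled.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its raw bytes (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def follow_links(start_url, count, position, fetch=fetch_html):
    """Follow the link at 1-based `position` on each page, up to `count` times.

    Returns the list of URLs reached after the start page.
    """
    url = start_url
    visited = []
    for _ in range(count):
        soup = BeautifulSoup(fetch(url), "html.parser")
        hrefs = [tag.get("href") for tag in soup("a")]
        # Stop early if the page has too few links or the tag has no href.
        if not 1 <= position <= len(hrefs) or hrefs[position - 1] is None:
            break
        # Reassign url so the next iteration fetches the new page.
        url = urljoin(url, hrefs[position - 1])
        visited.append(url)
    return visited
```

With the values read from input() in the question, the call would be follow_links(url, count, int(pn)), since this sketch takes the position as 1-based.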

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/66736346
