Q: Python TypeError: Traceback (most recent call last)

Stack Overflow user

Asked 2016-09-23 01:42:35

Answers 2 · Views 10.1K · Followers 0 · Votes 0

I am trying to build a crawler, and I want to print all the links on a page using Python 3.5.

Here is my code:

import requests
from bs4 import BeautifulSoup
def crawler(link):
    source_code = requests.get(link)
    source_code_string = str(source_code)
    source_code_soup = BeautifulSoup(source_code_string,'lxml')
    for item in source_code_soup.findAll("a"):
        title = item.string
        print(title)

crawler("https://www.youtube.com/watch?v=pLHejmLB16o")

But this is the error I get:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-9aa10c5a03ef> in <module>()
----> 1 crawler('http://archive.is/DPG9M')

TypeError: 'module' object is not callable

python

web-crawler

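2 Answers

Stack Overflow user

Answered 2016-09-23 02:26:05

If your intent is only to print the titles of the links, you are making one small mistake. `str(source_code)` converts the response object itself to a string rather than returning the page's HTML. Replace this line: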

source_code_string = str(source_code)

with

source_code_string = source_code.text

Other than that, the code looks fine and runs. Let's name the file web_crawler_v1.py:

import requests
from bs4 import BeautifulSoup
def crawler(link):
    source_code = requests.get(link)
    source_code_string = source_code.text 
    source_code_soup = BeautifulSoup(source_code_string,'lxml')
    for item in source_code_soup.findAll("a"):
        title = item.string
        print(title)


crawler("https://www.youtube.com/watch?v=pLHejmLB16o")
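To see why the replacement matters, here is a minimal offline sketch. It assumes a hand-built `requests.Response` (filled in through the private `_content` attribute purely so the example runs without network access), a made-up one-link page, and the standard-library `html.parser` backend in place of `lxml`. The helper `link_titles_and_hrefs` is a hypothetical name, not part of the original code.

```python
import requests
from bs4 import BeautifulSoup

# Build a Response by hand so nothing is fetched over the network.
# Writing to the private `_content` attribute is a testing shortcut
# used only for this illustration; real code gets it from requests.get().
response = requests.Response()
response.status_code = 200
response.encoding = "utf-8"
response._content = b'<html><body><a href="/about">About</a></body></html>'

as_str = str(response)   # the object's repr, not the page
as_text = response.text  # the decoded HTML body


def link_titles_and_hrefs(markup):
    """Return (title, href) pairs for every <a> tag in the markup."""
    # "html.parser" ships with Python; the original code uses "lxml".
    soup = BeautifulSoup(markup, "html.parser")
    return [(a.string, a.get("href")) for a in soup.find_all("a")]


from_str = link_titles_and_hrefs(as_str)
from_text = link_titles_and_hrefs(as_text)

print(as_str)     # <Response [200]>
print(from_str)   # no <a> tags are found in the repr
print(from_text)  # the single link from the made-up page
```

Because the question asks for the links themselves, the sketch also reads each tag's `href` with `a.get("href")`; `item.string` alone yields only the link text.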

As for that error, you should not get it if you run the file correctly, like this:

python3 web_crawler_v1.py
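The answer does not say what triggers the error. One common cause of `'module' object is not callable`, assumed here because the question does not show the notebook's imports, is that the name `crawler` refers to a module (for example after `import crawler`) rather than to the function. The sketch below simulates that with `types.ModuleType`; the stand-in module and the `try_call` helper are hypothetical.

```python
import types

# Stand-in for what `import crawler` would bind if the code lived in a
# file called crawler.py (an assumption; the question does not show this).
crawler = types.ModuleType("crawler")


def _crawl(link):
    """Placeholder for the real crawler function."""
    return f"crawling {link}"


# The function lives inside the module, under the same name.
crawler.crawler = _crawl


def try_call(obj, link):
    """Call obj(link) and return the result or the TypeError message."""
    try:
        return obj(link)
    except TypeError as exc:
        return str(exc)


module_call = try_call(crawler, "http://example.com")            # calls the module
function_call = try_call(crawler.crawler, "http://example.com")  # calls the function

print(module_call)
print(function_call)
```

If this is the situation, calling `crawler.crawler(link)` or using `from crawler import crawler` would avoid the error.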

Votes 2

Stack Overflow user

Answered 2021-01-29 03:37:03

Instead of

source_code = requests.get(link)

use:

source_code = requests.get(link, verify=False)

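You will get an HTTPS warning, because `verify=False` turns off TLS certificate verification, but the code will execute.

Votes 0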

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/39645563