
NoneType is not iterable, causing an error when web scraping with Python 3.8

Stack Overflow user
Asked 2020-04-05 04:38:49
Answers: 1 | Views: 268 | Followers: 0 | Votes: 0

I have been assigned to build a web scraper that pulls links. I can successfully extract this data:

/
/users/sign_up
/topics
/smarties
/posts
/users/sign_in
/users/sign_up
/posts/installing-anaconda-python-data-science-platform
/topics/python
/topics/anaconda-python
/topics/machine-learning
/jordan
/posts/python-libraries-to-import-for-data-science-programs
/topics/python
/topics/data-science
/topics/machine-learning
/jordan
/posts/shortcut-for-opening-the-object-inspector-in-python-spyder
/topics/python
/topics/anaconda-python
/topics/spyder-python
/topics/machine-learning
/jordan
/posts/python-script-for-replacing-missing-data-in-a-machine-learning-algorithm
/topics/machine-learning
/topics/python
/jordan
/posts/python-script-for-pulling-in-the-same-column-from-an-array-of-arrays
/topics/python
/jordan
/posts/how-to-implement-fizzbuzz-in-python
/topics/fizzbuzz
/topics/python
/jordan
/posts/how-to-think-like-a-computer-scientist
/topics/computer-science
/topics/python
/topics/programming
/jordan
/posts/base-case-example-for-how-to-test-a-python-class
/topics/python
/topics/tdd
/jordan
/posts/installing-and-working-with-pipenv
/topics/pipenv
/topics/python
/jordan
/posts/steps-for-building-a-flask-api-application-with-python-3
/topics/flask
/topics/tutorial
/topics/python
/jordan
None
/topics/python?page=2
/topics/python?page=3
/topics/python?page=4
/topics/python?page=2
/topics/python?page=4

That output comes from running this code:

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('http://www.dailysmarty.com/topics/python')

soup = bs(r.text, 'html.parser')

for link in soup.find_all('a'):
    print(link.get('href'))

But when I run the generator function I am working on:

def generator(web):
    titles = []
    for link in web:
        if 'posts' in link.get('href'):
            print(link.get('href'))
        else:
            pass


data = soup.find_all('a')
generator(data)

I get this data followed by this traceback:

/posts
/posts/installing-anaconda-python-data-science-platform
/posts/python-libraries-to-import-for-data-science-programs
/posts/shortcut-for-opening-the-object-inspector-in-python-spyder
/posts/python-script-for-replacing-missing-data-in-a-machine-learning-algorithm
/posts/python-script-for-pulling-in-the-same-column-from-an-array-of-arrays
/posts/how-to-implement-fizzbuzz-in-python
/posts/how-to-think-like-a-computer-scientist
/posts/base-case-example-for-how-to-test-a-python-class
/posts/installing-and-working-with-pipenv
/posts/steps-for-building-a-flask-api-application-with-python-3
Traceback (most recent call last):
  File "C:\Users\joshu\AppData\Local\Programs\Python\Python38\classes.py", line 18, in <module>
    generator(data)
  File "C:\Users\joshu\AppData\Local\Programs\Python\Python38\classes.py", line 13, in generator
    if 'posts' in link.get('href'):
TypeError: argument of type 'NoneType' is not iterable

How can I make the for loop in the generator skip over any None values, so that the code runs without raising this error?


1 Answer

Stack Overflow user

Accepted answer

Posted 2020-04-05 04:40:44

You have to check whether the link actually has an "href" attribute:

if link.has_attr('href') and 'posts' in link.get('href'):
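
To see the check in context, here is a minimal sketch of the whole loop. An anchor tag without an href makes link.get('href') return None, and 'posts' in None is what raises the TypeError. The sample markup, the name SAMPLE_HTML, and the function name post_links are assumptions made for illustration; they stand in for the fetched page and the asker's generator function so that the example runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the fetched page.
# The second anchor has no href, like the tag that printed None.
SAMPLE_HTML = """
<a href="/posts">Posts</a>
<a name="top">No href here</a>
<a href="/topics/python">Python</a>
<a href="/posts/how-to-implement-fizzbuzz-in-python">FizzBuzz</a>
"""


def post_links(anchors):
    """Return href values containing 'posts', skipping anchors without href."""
    links = []
    for link in anchors:
        # has_attr guards the membership test against a missing href.
        if link.has_attr('href') and 'posts' in link['href']:
            links.append(link['href'])
    return links


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
found = post_links(soup.find_all('a'))
for href in found:
    print(href)
```

An alternative that should behave the same way here is to filter at search time with soup.find_all('a', href=True), which only returns anchors that carry an href attribute.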
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/61038147
