Below is my code. I want to scrape a list of words from a website, but it fails when I run it:
import requests
from bs4 import BeautifulSoup
url = "https://www.merriam-webster.com/browse/thesaurus/a"
source_code = requests.get(url)
plain_text = source_code.text
soup = BeautifulSoup(plain_text, "html.parser")
entry_view = soup.find_all('div', {'class': 'entries'})
view = entry_view[0]
list = view.ul
for m in list:
    for x in m:
        title = x.string
        print(title)

What I want is a list of the text from the website, but I get an error instead.
Traceback (most recent call last):
  File "/home/vidu/PycharmProjects/untitled/hello.py", line 14, in <module>
    title = x.string
AttributeError: 'str' object has no attribute 'string'
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
    import apport.fileutils
  File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
    from apport.packaging_impl import impl as packaging
  File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 23, in <module>
    import apt
  File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'
Original exception was:
Traceback (most recent call last):
  File "/home/vidu/PycharmProjects/untitled/hello.py", line 14, in <module>
    title = x.string
AttributeError: 'str' object has no attribute 'string'

Posted on 2017-03-30 12:09:30
You can get the result you want with the following code.
Code:
import requests
from bs4 import BeautifulSoup
url = "https://www.merriam-webster.com/browse/thesaurus/a"
html_source = requests.get(url).text
soup = BeautifulSoup(html_source, "html.parser")
entry_view = soup.find_all('div', {'class': 'entries'})
entries = []
for elem in entry_view:
    for e in elem.find_all('a'):
        entries.append(e.text)
# show the first five entries, the last five, and the total count
print(entries[:5])
print(entries[-5:])
print(len(entries))

Output:
['A1', 'aback', 'abaft', 'abandon', 'abandoned']
['absorbing', 'absorption', 'abstainer', 'abstain from', 'abstemious']
100

In your code:
print(type(list))
<class 'bs4.element.Tag'>
print(type(m))
<class 'bs4.element.NavigableString'>
print(type(x))
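<class 'str'>

So, as you can see, the variable x is already a plain string (iterating over a NavigableString yields its individual characters as str), so accessing .string on it makes no sense.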
P.S.: You shouldn't use a variable name like list. It is not a reserved keyword, but it shadows the built-in list type.
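To see the type chain above without hitting the network, here is a minimal sketch. The SAMPLE_HTML snippet is a made-up stand-in for the page (the real markup may differ), and filtering on Tag is just one possible way to skip the whitespace children, not necessarily the best one for the live site.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Hypothetical stand-in for the page markup; the real page may differ.
SAMPLE_HTML = """
<div class="entries">
<ul>
<li><a href="/thesaurus/aback">aback</a></li>
<li><a href="/thesaurus/abaft">abaft</a></li>
</ul>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
ul = soup.find("div", {"class": "entries"}).ul

# The first child of <ul> is the newline that follows the opening tag.
first_child = next(iter(ul))
print(type(first_child))  # a NavigableString, which subclasses str

# Iterating over that string yields single characters as plain str.
first_char = next(iter(first_child))
print(type(first_char))
print(hasattr(first_char, "string"))

# One way around it: keep only real tags and read their text.
words = [child.get_text(strip=True) for child in ul if isinstance(child, Tag)]
print(words)
```

The nested loop in the question breaks on the very first whitespace child, which is why checking for Tag (or using find_all('a') as above) avoids the error.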
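Posted on 2017-03-30 09:10:48
AttributeError: 'str' object has no attribute 'string'
This is telling you that the object is already a string. Try removing the .string access and the error should go away.
It also tells you that the string data type is called str, not string.
Another option is to convert with title = str(x), but since x is already a string in this case, that would be redundant.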
To quote Google:
Python has a built-in string class named "str" with many handy features (there is an older module named "string" which you should not use).
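A quick way to check these points yourself, as a rough sketch using only the standard library. The value of x is a hypothetical stand-in for one item from the loop in the question.

```python
x = "aback"  # hypothetical value standing in for one loop item

print(type(x).__name__)      # the built-in string type is named str
print(hasattr(x, "string"))  # str defines no .string attribute
print(str(x) == x)           # converting a str with str() changes nothing

# Accessing the missing attribute reproduces the error from the question.
try:
    x.string
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Since str(x) gives back an equal string, wrapping x in str() adds nothing here; dropping the .string access is enough to avoid the AttributeError.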
https://stackoverflow.com/questions/43113232