This script prints its result twice, and I can't figure out why.
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup as bs

word = ("mission")
with requests.Session() as s:
    r = s.get('http://www.tabula.ge/en')
    soup = bs(r.text)
    div = soup.find("div", {"class": "sets"})
    for i in div.find_all('li'):
        for text in i.find_all('a'):
            if word in text.encode('utf-8').strip():
                print text.get_text()

After running the script, I get the same result twice in the output:
Kandelaki: Georgian UN mission yet to call security council meeting
Kandelaki: Georgian UN mission yet to call security council meeting
Posted on 2015-06-16 19:51:57
The content you are searching for occurs twice in the page source.
To check, view the source:
view-source:http://www.tabula.ge/en
The following link appears there twice:
<a href="/en/story/90354-kandelaki-georgian-un-mission-yet-to-call-security-council-meeting" data-topic="UN Security Council Meeting" data-video="false" data-date="December 1 2014, 03:13PM" data-comment-count="0" data-thumbnail="http://www.tabula.ge/files/styles/tab_thumb_featured/public/photos/2014/12/giorgi-kandelaki.jpg?itok=uKdw1i9k" data-nid="90354">
Kandelaki: Georgian UN mission yet to call security council meeting </a>
https://stackoverflow.com/questions/30876793
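If each story should be printed only once even though the page links to it twice, one option is to remember which `href` values have already been seen. The sketch below is written for Python 3 and is only an illustration: `unique_by_href` and `matching_links` are hypothetical helper names, not part of the original script, and it assumes the duplicated links share the same `href`, as they do in the snippet above.

```python
def unique_by_href(links):
    """Yield (href, title) pairs, skipping any href that was already yielded."""
    seen = set()
    for href, title in links:
        if href in seen:
            continue
        seen.add(href)
        yield href, title


def matching_links(html, word):
    """Collect (href, title) for each <a> in div.sets whose text contains word.

    Assumes the same page layout as in the question (a div with class "sets").
    """
    # Imported here so that unique_by_href stays usable without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"class": "sets"})
    if container is None:
        return []
    pairs = []
    for a in container.find_all("a", href=True):
        title = a.get_text(strip=True)
        if word in title:
            pairs.append((a["href"], title))
    return pairs


# The same link listed twice, as it is in the page source.
links = [
    ("/en/story/90354-kandelaki-georgian-un-mission-yet-to-call-security-council-meeting",
     "Kandelaki: Georgian UN mission yet to call security council meeting"),
    ("/en/story/90354-kandelaki-georgian-un-mission-yet-to-call-security-council-meeting",
     "Kandelaki: Georgian UN mission yet to call security council meeting"),
]
for href, title in unique_by_href(links):
    print(title)  # the headline is printed once
```

With a live page, the two helpers could be combined as `unique_by_href(matching_links(r.text, "mission"))`. Keying on `href` rather than on the title is a design choice: two different stories could in principle share a headline, while the link target identifies the story.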