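I'm new to Python and I'm trying to make a website that prints only the article (for example, this page: http://techcrunch.com/2014/09/15/microsoft-has-acquired-minecraft/) and nothing else from the site. I tried this, but it doesn't work: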
source_code = requests.get('http://techcrunch.com/2014/09/15/microsoft-has-acquired-minecraft/')
plain_text = source_code.text
soup = BeautifulSoup(plain_text)
for link in soup.findAll('div', {'class': 'article-entry text'}):
    title = link.string
    print(title)
It prints: None
Posted on 2014-09-15 19:27:02
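You only need the article, so change the for loop:
for link in soup.findAll('div', {'class': 'article-entry text'}):
    title = link.string
    print(title)
to:
title = soup.find('h1', {'class': 'alpha tweet-title'}).get_text()
article = soup.find('div', {'class': 'article-entry text'}).get_text()
print(title)
print(article)
You will get only the title and the article. The loop printed None because .string is None whenever a tag has more than one child, whereas get_text() joins all the text inside the tag.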
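To see the difference without hitting the network, here is a minimal sketch that runs against a small inline HTML string. The SAMPLE_HTML markup and the extract_article helper are made up for illustration; they only assume the page uses the class names from the snippet above, and the real TechCrunch markup may have changed since. It also guards against find() returning None when an element is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page markup; the real page may differ.
SAMPLE_HTML = """
<html><body>
<h1 class="alpha tweet-title">Microsoft Has Acquired Minecraft</h1>
<div class="article-entry text">
<p>First paragraph.</p>
<p>Second paragraph.</p>
</div>
</body></html>
"""


def extract_article(html):
    """Return (title, article_text); each is None if its element is missing."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.find("h1", {"class": "alpha tweet-title"})
    body_tag = soup.find("div", {"class": "article-entry text"})
    # get_text() collects the text of every descendant of the tag.
    title = title_tag.get_text(strip=True) if title_tag else None
    article = body_tag.get_text(" ", strip=True) if body_tag else None
    return title, article


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
div = soup.find("div", {"class": "article-entry text"})

# .string is None here because the div contains several child nodes.
string_value = div.string
title, article = extract_article(SAMPLE_HTML)

print(string_value)
print(title)
print(article)
```

For the live page you would pass requests.get(url).text to extract_article instead of SAMPLE_HTML, assuming the class names still match.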
The BeautifulSoup documentation may also be helpful.
https://stackoverflow.com/questions/25855338