Q: Can't web-scrape Tripadvisor

Stack Overflow user

Asked 2022-02-19 03:29:58

Answers: 1 | Views: 484 | Followers: 0 | Votes: -1

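I am trying to scrape the "Things to Do" listings on Tripadvisor (for example, the link is Texas.html), but I am stuck at the first few lines of code. I waited more than ten minutes and got no response. Three days ago I tried the same code and link, and it worked. Now it returns nothing. The code is: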

import requests
trip = 'https://www.tripadvisor.com/Tourism-g30196-Austin_Texas-Vacations.html'
response = requests.get(trip)
print(type(response))

My code and result

I don't know what is going on. Looking forward to your help! Thank you very much.

python

web-scraping

jupyter-notebook

tripadvisor

1 Answer

Stack Overflow user

Answered 2022-02-19 05:26:17

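First, you should try setting the User-Agent header to the value a real web browser sends (to start with, you can try the shorter Mozilla/5.0). By default, requests sends something like python-requests/2.x, so the server can recognize the script and block it. Some servers also need this header in order to send different content to different browsers or devices (desktop, tablet, phone).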

import requests
from bs4 import BeautifulSoup

#url = 'https://www.tripadvisor.com/Tourism-g30196-Austin_Texas-Vacations.html'
url = 'https://www.tripadvisor.com/Attractions-g30196-Activities-c57-Austin_Texas.html'

# Send a browser-like User-Agent instead of the requests default
response = requests.get(url, headers={'User-Agent': "Mozilla/5.0"})

soup = BeautifulSoup(response.text, 'html.parser')

# Attraction titles are in <span name="title"> elements
items = soup.find_all('span', {'name': 'title'})

for i in items:
    print(i.text)

Result:

1. Lady Bird Lake Hike-and-Bike Trail
2. Barton Springs Pool
3. Mount Bonnell
4. Congress Avenue Bridge / Austin Bats
5. Lady Bird Johnson Wildflower Center
6. Austin Aquarium
7. Zilker Metropolitan Park
8. McKinney Falls State Park
9. Barton Creek Greenbelt
10. Austin Zoo
11. Mayfield Park
12. Zilker Botanical Garden
13. Town Lake
14. Westcave Outdoor Discovery Center
15. Bull Creek District Park
16. Austin Nature & Science Center
17. Turkey Creek Trail
18. River Place Nature Trails
19. Mueller Lake Park
20. Zilker Playground
21. Deep Eddy Pool
22. Red Bud Isle Park
23. Mansfield Dam Park
24. Pease Park
25. Wild Basin Preserve
26. Emma Long Metropolitan Park
27. Shoal Creek Greenbelt
28. Commons Ford Ranch
29. Hornsby Bend Bird Observatory
30. Mary Moore Searight Metropolitan Park
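
The approach above can be sketched a bit more defensively. This is only an illustration under a few assumptions: the helper names fetch and extract_titles, the 10-second timeout, and the SAMPLE_HTML fragment are all hypothetical and not part of the original answer. Without a timeout, requests can wait indefinitely for a server that never replies, which may be what happened in the question. The sample fragment is not real Tripadvisor markup; it only lets the parsing step be tried offline.

```python
import requests
from bs4 import BeautifulSoup

# What requests sends when no User-Agent is set, e.g. "python-requests/2.x.y".
DEFAULT_UA = requests.utils.default_user_agent()


def fetch(url, user_agent="Mozilla/5.0", timeout=10):
    """Download a page with a browser-like User-Agent (hypothetical helper).

    The timeout makes the call raise an exception instead of hanging
    when the server does not respond.
    """
    response = requests.get(
        url,
        headers={"User-Agent": user_agent},
        timeout=timeout,
    )
    # Raise an exception on 4xx/5xx responses such as 403 Forbidden.
    response.raise_for_status()
    return response.text


def extract_titles(html):
    """Return the text of every <span name="title"> element."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all's first parameter is itself called `name` (the tag name),
    # so the HTML attribute "name" has to be passed through the attrs dict.
    spans = soup.find_all("span", attrs={"name": "title"})
    return [span.get_text(strip=True) for span in spans]


# Hypothetical fragment for illustration only, not real Tripadvisor markup.
SAMPLE_HTML = """
<div><span name="title">1. First attraction</span></div>
<div><span name="title">2. Second attraction</span></div>
<div><span name="other">ignored</span></div>
"""

if __name__ == "__main__":
    print(DEFAULT_UA)
    print(extract_titles(SAMPLE_HTML))
```

To use it against the live site, one would call extract_titles(fetch(url)) with the url from the answer. Whether that still returns titles depends on the site's current markup and blocking rules.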

EDIT:

In the scraping section of my GitHub you can find code from other Stack Overflow answers that scrapes Tripadvisor using selenium and scrapy.

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/71181932
