How to web scrape an old-school website that uses frames

Stack Overflow user

Asked 2022-11-07 18:46:27

Answers: 1, Views: 30, Followers: 0, Votes: 0

I am trying to scrape a government website that uses a frameset. Here is the URL: https://lakecounty.in.gov/departments/voters/election-results-c/2022GeneralElectionResults/index.htm

I tried Splinter/Selenium:

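import time
from splinter import Browser

browser = Browser()  # assumes a default Splinter driver is installed

url = "https://lakecounty.in.gov/departments/voters/election-results-c/2022GeneralElectionResults/index.htm"
browser.visit(url)
time.sleep(10)  # crude wait for the frames to load

# Full XPath to the second frame of the nested frameset
full_xpath_frame = '/html/frameset/frameset/frame[2]'
tree = browser.find_by_xpath(full_xpath_frame)

for i in tree:
    print(i.text)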

It only returns an empty string. I also tried the requests library:

import requests
from lxml import html  # imported in this attempt but not used below

url = "https://lakecounty.in.gov/departments/voters/election-results-c/2022GeneralElectionResults/index.htm"

# Get the response object
response = requests.get(url)

# Get the raw body as a byte string
data = response.content
print(data)

It returns this:

b"<html>\r\n<head>\r\n<meta http-equiv='Content-Type'\r\ncontent='text/html; charset=iso-8859-1'>\r\n<title>Lake_ County Election Results</title>\r\n</head>\r\n<FRAMESET rows='20%, *'>\r\n<FRAME src='titlebar.htm' scrolling='no'>\r\n<FRAMESET cols='20%, *'>\r\n<FRAME src='menu.htm'>\r\n<FRAME src='Lake_ElecSumm_all.htm' name='reports'>\r\n</FRAMESET>\r\n</FRAMESET>\r\n<body>\r\n</body>\r\n</html>\r\n"

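I also tried Beautiful Soup, and it gave me the same thing. Is there another Python library I can use to get the data in the second table?

Thanks for any feedback.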

web-scraping

python

1 Answer

Stack Overflow user

Accepted answer

Answered 2022-11-07 19:18:50

As mentioned, you can select the frame and read its src:

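# r is the response for index.htm; index 1 is menu.htm, index 2 is the 'reports' frame
BeautifulSoup(r.text, 'html.parser').select('frame')[1].get('src')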
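
The frame lookup above returns a relative `src`, so it still has to be resolved against the page URL before it can be requested. The following is a minimal sketch of that step using only the standard library (`html.parser` and `urllib.parse.urljoin`) instead of BeautifulSoup, so it runs without a network call. The names `FrameCollector` and `frame_urls` are illustrative, and `INDEX_HTML` is the frameset portion of the markup printed in the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameCollector(HTMLParser):
    """Collect the src attribute of every <frame> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <FRAME> is matched as well.
        if tag == "frame":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def frame_urls(page_url, html_text):
    """Return the absolute URL of each frame referenced by html_text."""
    collector = FrameCollector()
    collector.feed(html_text)
    # urljoin resolves each relative src against the page that contains it.
    return [urljoin(page_url, src) for src in collector.sources]


INDEX_URL = ("https://lakecounty.in.gov/departments/voters/election-results-c/"
             "2022GeneralElectionResults/index.htm")

# Frameset markup as shown in the question's output.
INDEX_HTML = """
<FRAMESET rows='20%, *'>
<FRAME src='titlebar.htm' scrolling='no'>
<FRAMESET cols='20%, *'>
<FRAME src='menu.htm'>
<FRAME src='Lake_ElecSumm_all.htm' name='reports'>
</FRAMESET>
</FRAMESET>
"""

urls = frame_urls(INDEX_URL, INDEX_HTML)
for frame_url in urls:
    print(frame_url)
```

Each printed URL is an ordinary HTML document that can be fetched with `requests.get` on its own. The third one corresponds to the frame named `reports`.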

Or go directly to menu.htm:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

menu_url = 'https://lakecounty.in.gov/departments/voters/election-results-c/2022GeneralElectionResults/menu.htm'
r = requests.get(menu_url)

# Resolve each href against the menu URL; plain concatenation would drop the "/" for relative hrefs
link_list = [urljoin(menu_url, a.get('href')) for a in BeautifulSoup(r.text, 'html.parser').select('a') if a.get('href')]

for link in link_list[:1]:  # [:1] visits only the first link; drop the slice to visit all of them
    r = requests.get(link)
    soup = BeautifulSoup(r.text, 'html.parser')
    # ... scrape what is needed
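
The scraping step left open in the loop above depends on how the report pages are marked up, which the thread does not show. As a rough sketch only, assuming the results sit in ordinary `<table>` markup with explicitly closed `<tr>`, `<td>`, and `<th>` tags and no nested tables, the rows could be collected with the standard library as follows. `TableCollector` and `SAMPLE_HTML` are hypothetical, and the sample values are placeholders rather than real election data.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every table as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per <table>, in document order
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Hypothetical markup standing in for a fetched report page (placeholder values).
SAMPLE_HTML = """
<table><tr><td>Header block</td></tr></table>
<table>
  <tr><th>Candidate</th><th>Votes</th></tr>
  <tr><td>Candidate A</td><td>10</td></tr>
  <tr><td>Candidate B</td><td>7</td></tr>
</table>
"""

collector = TableCollector()
collector.feed(SAMPLE_HTML)
second_table = collector.tables[1]  # the question asks about the second table
for row in second_table:
    print(row)
```

With a real page, `collector.feed(r.text)` would replace the sample string. If the pages turn out to use preformatted text or unclosed tags instead, this sketch would need a different parsing strategy.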

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link: https://stackoverflow.com/questions/74351433
