import requests
from bs4 import BeautifulSoup


url = 'https://sec.report/CIK/1418076'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}

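# Download the page and parse it with Python's built-in HTML parser
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')

# Find the <td> whose text is exactly "SIC", then print the text of the next <td>
# (`string=` is the current name of the older `text=` argument)
print(soup.find('td', string="SIC").find_next('td').text)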

Prints:

7129: Other Business Financing Companies Investors, Not Elsewhere Classified 6799

EDIT: Changed the parser to lxml so that the HTML is parsed correctly:

import requests
from bs4 import BeautifulSoup

url = 'https://sec.report/CIK/1002771'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}

# Same approach as above, but parsed with lxml instead of html.parser
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'lxml')

print(soup.find('td', string="SIC").find_next('td').text)

Prints:

1121: Distillery Products Industry Pharmaceutical Preparations 2834
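The one-liner above raises an AttributeError if the "SIC" cell is not found, which is also how a parser mismatch tends to show up. A minimal sketch of a more defensive lookup follows. The helper name `extract_field` and the sample markup are assumptions made for illustration, and the real page layout may differ:

```python
from bs4 import BeautifulSoup

# Hypothetical markup with placeholder values; not copied from sec.report.
SAMPLE_HTML = """
<table>
  <tr><td>CIK</td><td>0000000000</td></tr>
  <tr><td>SIC</td><td>0000: Example Industry</td></tr>
</table>
"""


def extract_field(html, label, parser="html.parser"):
    """Return the text of the <td> following the <td> equal to `label`, or None."""
    soup = BeautifulSoup(html, parser)
    label_cell = soup.find("td", string=label)
    if label_cell is None:
        # Label not present (or the parser did not see the cell as expected)
        return None
    value_cell = label_cell.find_next("td")
    if value_cell is None:
        return None
    return value_cell.get_text(strip=True)


print(extract_field(SAMPLE_HTML, "SIC"))
```

Passing `parser="lxml"` switches parsers without touching the lookup logic, which makes it easy to compare how each one handles a given page.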

Votes: 2

Stack Overflow user

Posted on 2020-07-23 16:25:05

Try the following code:

import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 '}
r = requests.get('https://sec.report/CIK/1418076', headers=headers)

soup = BeautifulSoup(r.content, 'lxml')

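# Positional CSS selector: second cell of a row in the table that is the
# fifth child of its parent; this depends on the page layout staying the same
sic = soup.select_one('.table:nth-child(5) tr~ tr+ tr td:nth-child(2)')
print(sic.text)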

Output:

7129: Other Business Financing Companies Investors, Not Elsewhere Classified 6799
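Because the selector above depends on element positions, it may stop matching if the page gains or loses a row. One possible alternative is to read every label/value row into a dictionary and look up "SIC" by name. This is only a sketch: the function name `table_to_dict`, the `table.table` selector, and the sample markup are assumptions, not taken from the actual page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup with placeholder values; not copied from sec.report.
SAMPLE_HTML = """
<table class="table">
  <tr><td>CIK</td><td>0000000000</td></tr>
  <tr><td>State</td><td>XX</td></tr>
  <tr><td>SIC</td><td>0000: Example Industry</td></tr>
</table>
"""


def table_to_dict(html, parser="html.parser"):
    """Map the first cell of each two-column row to the text of its second cell."""
    soup = BeautifulSoup(html, parser)
    fields = {}
    for row in soup.select("table.table tr"):
        cells = row.find_all("td", recursive=False)
        if len(cells) >= 2:
            label = cells[0].get_text(strip=True)
            fields[label] = cells[1].get_text(strip=True)
    return fields


fields = table_to_dict(SAMPLE_HTML)
print(fields.get("SIC"))
```

Using `fields.get("SIC")` returns None instead of raising when the row is absent, so the caller can decide how to handle a missing value.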

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/63058466
