How do I scrape the description of a SIC code?

Stack Overflow user
Asked on 2020-06-21 21:21:15
Answers: 1 | Views: 386 | Followers: 0 | Votes: 2

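Hi, I'm using BS4 to scrape SIC codes and their descriptions. The code below gets the company name and the SIC code exactly as I want, but I don't know how to scrape the description that follows the code. The relevant markup, as it appears in both the inspect-element view and the page source, is shown below.

To be clear, what I want is "STATE COMMERCIAL BANKS" and "LABORATORY ANALYTICAL INSTRUMENTS".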

https://www.sec.gov/cgi-bin/browse-edgar?CIK=866054&owner=exclude&action=getcompany&Find=Search

<div class="companyInfo">
     <span class="companyName">COMMERCIAL NATIONAL FINANCIAL CORP /PA <acronym title="Central Index Key">CIK</acronym>#: <a href="/cgi-bin/browse-edgar?action=getcompany&amp;CIK=0000866054&amp;owner=exclude&amp;count=40">0000866054 (see all company filings)</a></span>
     <p class="identInfo"><acronym title="Standard Industrial Code">SIC</acronym>: <a href="/cgi-bin/browse-edgar?action=getcompany&amp;SIC=6022&amp;owner=exclude&amp;count=40">6022</a> - STATE COMMERCIAL BANKS<br />State location: <a href="/cgi-bin/browse-edgar?action=getcompany&amp;State=PA&amp;owner=exclude&amp;count=40">PA</a> | State of Inc.: <strong>PA</strong> | Fiscal Year End: 1231<br />(Office of Finance)<br />Get <a href="/cgi-bin/own-disp?action=getissuer&amp;CIK=0000866054"><b>insider transactions</b></a> for this <b>issuer</b>.

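import requests
from bs4 import BeautifulSoup

# cik_num_list is assumed to be defined earlier, e.g. [866054, 1090872]
for cik_num in cik_num_list:
    try:
        url = r"https://www.sec.gov/cgi-bin/browse-edgar?CIK={}&owner=exclude&action=getcompany".format(cik_num)
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        try:
            comp_name = soup.find_all('div', {'class':'companyInfo'})[0].find('span').text
            sic_code = soup.find_all('p', {'class':'identInfo'})[0].find('a').text
        except (IndexError, AttributeError):
            pass  # handler not shown in the original post; placeholder
    except requests.RequestException:
        pass  # handler not shown in the original post; placeholder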

1 Answer

Stack Overflow user

Accepted answer

Answered on 2020-06-21 21:37:22

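import requests
from bs4 import BeautifulSoup

url = 'https://www.sec.gov/cgi-bin/browse-edgar?CIK=866054&owner=exclude&action=getcompany&Find=Search'

soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# The description is the text node right after the SIC <a> link (" - STATE COMMERCIAL BANKS");
# split once on whitespace to drop the leading "-".
# string=True is the current spelling of the older text=True argument.
sic_code_desc = soup.select_one('.identInfo').a.find_next_sibling(string=True).split(maxsplit=1)[-1]
print(sic_code_desc)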

Prints:

STATE COMMERCIAL BANKS
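
The one-liner chains several steps. As a minimal offline sketch, the snippet below runs each step on a trimmed copy of the `identInfo` paragraph quoted in the question instead of a live request. The variable names `html`, `link`, and `tail` are illustrative and do not come from the answer.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the <p class="identInfo"> markup quoted in the question.
html = (
    '<p class="identInfo"><acronym title="Standard Industrial Code">SIC</acronym>: '
    '<a href="/cgi-bin/browse-edgar?action=getcompany&amp;SIC=6022">6022</a>'
    ' - STATE COMMERCIAL BANKS<br />State location: PA</p>'
)
soup = BeautifulSoup(html, 'html.parser')

# 1. .identInfo selects the paragraph; .a is its first <a>, the SIC code link.
link = soup.select_one('.identInfo').a
sic_code = link.text

# 2. The description is the bare text node that directly follows that link.
tail = link.find_next_sibling(string=True)

# 3. Split once on whitespace to drop the leading "-" separator.
sic_code_desc = tail.split(maxsplit=1)[-1]

print(sic_code, sic_code_desc)
```

Here `tail` is the string " - STATE COMMERCIAL BANKS", so the final split leaves only the description.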

With url = 'https://www.sec.gov/cgi-bin/browse-edgar?CIK=1090872&owner=exclude&action=getcompany&Find=Search', the script prints:

LABORATORY ANALYTICAL INSTRUMENTS
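
To use this inside the loop from the question, one option is to wrap the extraction in a helper that returns `None` values when a page lacks the expected markup. This is only a sketch under stated assumptions: `parse_sic` and `fetch_sic` are hypothetical names, the `a[href*="SIC="]` selector is a swapped-in way of picking the SIC link rather than the answer's `.a`, and the `User-Agent` value is a placeholder to replace with your own contact details, since SEC asks automated clients to identify themselves.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder identity (assumption); replace with your own name and email.
HEADERS = {'User-Agent': 'Sample Name sample@example.com'}
URL = 'https://www.sec.gov/cgi-bin/browse-edgar?CIK={}&owner=exclude&action=getcompany'


def parse_sic(html):
    """Return (sic_code, description) from a company page, or (None, None)."""
    soup = BeautifulSoup(html, 'html.parser')
    # Pick the link whose href carries the SIC parameter, so a page without
    # a SIC entry does not yield the state link by mistake.
    link = soup.select_one('.identInfo a[href*="SIC="]')
    if link is None:
        return None, None
    tail = link.find_next_sibling(string=True)
    # Drop the leading "-" separator; tolerate a missing description.
    parts = tail.split(maxsplit=1) if tail else []
    desc = parts[-1] if len(parts) == 2 else None
    return link.text, desc


def fetch_sic(cik_num):
    """Download one company page and parse it (performs a network request)."""
    response = requests.get(URL.format(cik_num), headers=HEADERS, timeout=30)
    response.raise_for_status()
    return parse_sic(response.content)
```

In the question's loop, `sic_code, sic_code_desc = fetch_sic(cik_num)` could then replace the inline parsing, with a check for `None` before the values are stored.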
Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/62504581
