
Problem when creating a DataFrame from dates

Stack Overflow user
Asked 2020-10-20 06:50:04

I want to create a DataFrame from web-scraped data. My code looks like this:

import pandas as pd
import json
import re
from bs4 import BeautifulSoup

symbol = ['FMC','VMC','APD']
short_name = ['fmc','vulcan-materials','air-products-and-chemicals']

url_key_financial_ratio = 'https://www.macrotrends.net/stocks/charts/{}/{}/financial-ratios'


records = []

for i,j in zip(symbol, short_name):
    r = requests.get(url_key_financial_ratio.format(i, j))
    soup = BeautifulSoup(r.text, 'html.parser')
    pattern = re.compile(r' var originalData = (.*?);\r\n\r\n\r',re.DOTALL)
    data = json.loads(p.findall(r.text)[0])

        records.append({
          'symbol' : i,
          'date' : data[0]['field_name']

I would like the output to look like this:

date         ticker    current_ratio    long_term_debt
2019-09-30   FMC       2.53630          0.22080
2018-09-30   FMC       2.17350          0.23070
2017-09-30   FMC       2.36110          0.25040
2016-09-30   FMC       1.31500          0.35150
2015-09-30   FMC       0.76650          0.34850
2014-09-30   FMC       1.11200          0.39080
2013-09-30   FMC       1.06550          0.41260
2012-09-30   FMC       1.26990          0.40900
2011-09-30   FMC       1.36200          0.39810
2010-09-30   FMC       1.35190          0.39110
2009-09-30   FMC       1.19740          0.42980
2008-09-30   FMC       1.28760          0.41130
2007-09-30   FMC       1.17980          0.35120
2006-09-30   FMC       1.12450          0.31650
2005-09-30   FMC       1.24260          0.31050

I'm new to web scraping and can't work out how to use the append method to build this dataset. Can anyone help?

1 Answer

Stack Overflow user

Accepted answer

Posted 2020-10-20 07:06:22

This script loads the current_ratio and long_term_debt columns into a DataFrame:

import json
import re

import pandas as pd
import requests

symbol = ['FMC','VMC','APD']
short_name = ['fmc','vulcan-materials','air-products-and-chemicals']

url_key_financial_ratio = 'https://www.macrotrends.net/stocks/charts/{}/{}/financial-ratios'

# The page embeds its table as a JavaScript array: var originalData = [...];
pattern = re.compile(r' var originalData = (.*?);\r\n\r\n\r', re.DOTALL)

records = []
for i, j in zip(symbol, short_name):
    r = requests.get(url_key_financial_ratio.format(i, j))
    data = json.loads(pattern.findall(r.text)[0])

    # Each element of data is one table row, keyed by date.
    current_ratio = next(d for d in data if 'Current Ratio' in d['field_name'])
    long_term_debt = next(d for d in data if 'Long-term Debt / Capital' in d['field_name'])

    # This pairing relies on both rows listing the same dates in the same order.
    for (k1, v1), (_, v2) in zip(current_ratio.items(), long_term_debt.items()):
        # Skip the two metadata keys; every other key is a date.
        if k1 in ('field_name', 'popup_icon'):
            continue

        records.append({
          'date' : k1,
          'ticker': i,
          'current_ratio': v1,
          'long_term_debt': v2})

df = pd.DataFrame(records)
print(df)

Prints:

          date ticker current_ratio long_term_debt
0   2019-12-31    FMC       1.49590        0.54200
1   2018-12-31    FMC       1.34640        0.40050
2   2017-12-31    FMC       1.65330        0.52510
3   2016-12-31    FMC       1.98110        0.47440
4   2015-12-31    FMC       2.04490        0.51620
5   2014-12-31    FMC       1.53600        0.42140
6   2013-12-31    FMC       1.48240        0.42330
7   2012-12-31    FMC       1.92150        0.36890
8   2011-12-31    FMC       2.03220        0.37400
9   2010-12-31    FMC       1.70870        0.29720
10  2009-12-31    FMC       2.09770        0.34160
11  2008-12-31    FMC       1.88750        0.38020
12  2007-12-31    FMC       1.58920        0.28280
13  2006-12-31    FMC       1.48820        0.34130
14  2005-12-31    FMC       1.61440        0.40010
15  2019-12-31    VMC       2.57550        0.33120
16  2018-12-31    VMC       1.79100        0.34820
17  2017-12-31    VMC       2.66470        0.36150
18  2016-12-31    VMC       3.05490        0.30250
19  2015-12-31    VMC       3.06830        0.30780
20  2014-12-31    VMC       2.03700        0.30520
21  2013-12-31    VMC       3.18080        0.39040
22  2012-12-31    VMC       2.25700        0.40180
23  2011-12-31    VMC       2.12450        0.41420
24  2010-12-31    VMC       1.32870        0.38030
25  2009-12-31    VMC       0.85550        0.34390
26  2008-12-31    VMC       0.53750        0.37730
27  2007-12-31    VMC       0.45770        0.28920
28  2006-12-31    VMC       1.49990        0.13800
29  2005-12-31    VMC       2.04040        0.13160
30  2019-09-30    APD       2.53630        0.22080
31  2018-09-30    APD       2.17350        0.23070
32  2017-09-30    APD       2.36110        0.25040
33  2016-09-30    APD       1.31500        0.35150
34  2015-09-30    APD       0.76650        0.34850
35  2014-09-30    APD       1.11200        0.39080
36  2013-09-30    APD       1.06550        0.41260
37  2012-09-30    APD       1.26990        0.40900
38  2011-09-30    APD       1.36200        0.39810
39  2010-09-30    APD       1.35190        0.39110
40  2009-09-30    APD       1.19740        0.42980
41  2008-09-30    APD       1.28760        0.41130
42  2007-09-30    APD       1.17980        0.35120
43  2006-09-30    APD       1.12450        0.31650
44  2005-09-30    APD       1.24260        0.31050
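
If you want to try the extraction step without hitting the site, here is a minimal offline sketch. The `sample_html` string and the `rows_for` helper are made up for illustration, and the real page layout may differ. It pairs the two rows by date key instead of by position, and then converts the columns to real dtypes, on the assumption (suggested by the trailing zeros in the output above) that the scraped values arrive as strings.

```python
import json
import re

import pandas as pd

# Hypothetical stand-in for r.text; the real page is much larger.
sample_html = (
    "<script>\r\n"
    ' var originalData = ['
    '{"field_name": "<span>Current Ratio</span>", "popup_icon": "", '
    '"2019-12-31": "1.49590", "2018-12-31": "1.34640"}, '
    '{"field_name": "<span>Long-term Debt / Capital</span>", "popup_icon": "", '
    '"2019-12-31": "0.54200", "2018-12-31": "0.40050"}];\r\n'
    "\r\n\r\n"
    "</script>"
)

# Same pattern as in the script above.
PATTERN = re.compile(r" var originalData = (.*?);\r\n\r\n\r", re.DOTALL)


def rows_for(ticker, html):
    """Return one record per date for a single ticker (illustrative helper)."""
    data = json.loads(PATTERN.search(html).group(1))
    current = next(d for d in data if "Current Ratio" in d["field_name"])
    debt = next(d for d in data if "Long-term Debt / Capital" in d["field_name"])
    return [
        {
            "date": key,
            "ticker": ticker,
            "current_ratio": value,
            # Look up by date so a missing or reordered key cannot misalign rows.
            "long_term_debt": debt.get(key),
        }
        for key, value in current.items()
        if key not in ("field_name", "popup_icon")
    ]


df = pd.DataFrame(rows_for("FMC", sample_html))

# Convert the string columns so that sorting and arithmetic behave as expected.
df["date"] = pd.to_datetime(df["date"])
for col in ("current_ratio", "long_term_debt"):
    df[col] = pd.to_numeric(df[col], errors="coerce")

print(df.dtypes)
```

With several tickers you would extend `records` with the result of `rows_for` for each page and build the DataFrame once at the end, as the script above does.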
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/64436082
