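My Django model has a DateTimeField and a JSONField, and I want to build a timeseries dataframe from them.
The django-pandas package provides a to_timeseries method for this, but my problem is that it puts the whole JSONField into a single column. How can I expand that column into multi-index columns?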
models.py
from django.db import models

class Indicator(models.Model):
    dt = models.DateTimeField(null=True)
    metrics = models.JSONField(default=dict)

JSONField dictionary:
{'housing': {'1d_percent': 73.62755998, '2d_percent': 3e-08},
 'fund-flower': {'ratio': 0.01981295},
 'mpi': {'mpi': -0.6527736158660562}}

Converting the queryset to a timeseries dataframe:
>> qs = Indicator.objects.all()
>> df = qs.to_timeseries(index='dt', fieldnames='metrics').sort_index().dropna()
>> df
                                                                      metrics
dt
2018-01-01 00:00:00+00:00 {'mpi': {'mpi': -0.01679772442974948}, 'fund-f...
2018-01-02 00:00:00+00:00 {'mpi': {'mpi': 1.1785319016689795}, 'fund-flo...
2018-01-03 00:00:00+00:00 {'mpi': {'mpi': 1.047678402830424}, 'fund-flow...
2018-01-04 00:00:00+00:00 {'mpi': {'mpi': 1.111703887319459}, 'fund-flow...
2018-01-05 00:00:00+00:00 {'mpi': {'mpi': 2.3908629334035343}, 'fund-flo...
...
2022-09-17 00:00:00+00:00 {'mpi': {'mpi': -1.0434999082318062}, 'fund-fl...
2022-09-18 00:00:00+00:00 {'mpi': {'mpi': -0.9680468633746766}, 'fund-fl...
2022-09-19 00:00:00+00:00 {'mpi': {'mpi': -0.9287818619840235}, 'fund-fl...
2022-09-20 00:00:00+00:00 {'mpi': {'mpi': -0.8487296227267782}, 'fund-fl...

This is the desired output:
                           mpi    fund-flower  housing
dt                         mpi    ratio        1d_percent  2d_percent
2018-01-01 00:00:00+00:00  value  value        value       value
2018-01-02 00:00:00+00:00  value  value        value       value
2018-01-03 00:00:00+00:00  value  value        value       value
2018-01-04 00:00:00+00:00  value  value        value       value
2018-01-05 00:00:00+00:00  value  value        value       value
...
2022-09-17 00:00:00+00:00  value  value        value       value
2022-09-18 00:00:00+00:00  value  value        value       value
2022-09-19 00:00:00+00:00  value  value        value       value
2022-09-20 00:00:00+00:00  value  value        value       value

I tried normalizing the data with json_normalize, as suggested elsewhere, but it raises an error:
>> pd.json_normalize(df, record_path=['metrics'])
TypeError: string indices must be integers

Answered on 2022-09-22 09:24:11
I finally found the answer to my own question.
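import pandas as pd

frames = []
for obj in qs:
    # One row per object; stack() puts the inner key first, so the column
    # levels come out as (inner key, outer key). Use swaplevel(axis=1) to flip.
    tmp = pd.DataFrame.from_dict(obj.metrics, orient="columns").stack().to_frame().T
    frames.append(tmp)
df = pd.concat(frames)
# flat=True returns plain datetimes instead of 1-tuples
df.index = list(qs.values_list('dt', flat=True))

https://stackoverflow.com/questions/73811337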
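
As an alternative to the loop, the json_normalize attempt from the question can be made to work by passing it the list of metrics dicts rather than the dataframe. The original call most likely failed because iterating over a DataFrame yields its column labels, which are strings. The following is only a minimal sketch under a few assumptions: `rows` is a hypothetical stand-in for something like `qs.values_list('dt', 'metrics')`, the second row contains made-up values, and the `"|"` separator is assumed not to occur in any key.

```python
import pandas as pd

# Hypothetical sample standing in for qs.values_list('dt', 'metrics').
# The first dict is the example from the question; the second is made up.
rows = [
    ("2018-01-01T00:00:00+00:00",
     {"housing": {"1d_percent": 73.62755998, "2d_percent": 3e-08},
      "fund-flower": {"ratio": 0.01981295},
      "mpi": {"mpi": -0.6527736158660562}}),
    ("2018-01-02T00:00:00+00:00",
     {"housing": {"1d_percent": 70.1, "2d_percent": 2e-08},
      "fund-flower": {"ratio": 0.02},
      "mpi": {"mpi": 1.1785319016689795}}),
]

dts = [dt for dt, _ in rows]
metrics = [m for _, m in rows]

# json_normalize takes a dict or a list of dicts and flattens nested keys
# into column names such as "housing|1d_percent".
flat = pd.json_normalize(metrics, sep="|")

# Split each flattened name once to get (outer key, inner key) levels.
flat.columns = pd.MultiIndex.from_tuples(
    [tuple(col.split("|", 1)) for col in flat.columns]
)

# Use the timestamps as a timezone-aware index named "dt".
flat.index = pd.to_datetime(dts, utc=True).rename("dt")

print(flat)
```

With this layout the outer JSON key sits on the top column level, so `flat["housing"]` selects both housing columns at once.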