
How to prepare panel data for machine learning in Python?

Stack Overflow user
Asked 2019-10-08 20:04:05
1 answer · 178 views · 0 followers · 1 vote

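I have a panel dataset (a time series per ID). I want to prepare it for a machine learning model that predicts next year's gcp. My data looks like this: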

ID,year,age,area,debt_ratio,gcp
654001,2013,49,East,0.14,0
654001,2014,50,East,0.17,0
654001,2015,51,East,0.23,1
654001,2016,52,East,0.18,0
112089,2013,39,West,0.13,0
112089,2014,40,West,0.15,0
112089,2015,41,West,0.18,1
112089,2016,42,West,0.21,1

What I want is this:

ID,year,age,area,debt_ratio,gcp,gcp-1,gcp-2,gcp-3
654001,2013,49,East,0.14,0,NA,NA,NA
654001,2014,50,East,0.17,0,0,NA,NA
654001,2015,51,East,0.23,1,0,0,NA
654001,2016,52,East,0.18,0,1,0,0
112089,2013,39,West,0.13,0,NA,NA,NA
112089,2014,40,West,0.15,0,0,NA,NA
112089,2015,41,West,0.18,1,0,0,NA
112089,2016,42,West,0.21,1,1,0,0

I tried the pandas melt function, but it did not work. Searching online, I found this post, which does exactly what I want, but in R:

https://stackoverflow.com/questions/19813077/prepare-time-series-for-machine-learning-long-to-wide-format

Does anyone know how to do this in Python with pandas? Any suggestions would be much appreciated!


1 Answer

Stack Overflow user

Accepted answer

Posted 2019-10-08 20:10:42

Use DataFrameGroupBy.shift in a loop:

for i in range(1, 4):
    df[f'gcp-{i}'] = df.groupby('ID')['gcp'].shift(i)
print(df)
       ID  year  age  area  debt_ratio  gcp  gcp-1  gcp-2  gcp-3
0  654001  2013   49  East        0.14    0    NaN    NaN    NaN
1  654001  2014   50  East        0.17    0    0.0    NaN    NaN
2  654001  2015   51  East        0.23    1    0.0    0.0    NaN
3  654001  2016   52  East        0.18    0    1.0    0.0    0.0
4  112089  2013   39  West        0.13    0    NaN    NaN    NaN
5  112089  2014   40  West        0.15    0    0.0    NaN    NaN
6  112089  2015   41  West        0.18    1    0.0    0.0    NaN
7  112089  2016   42  West        0.21    1    1.0    0.0    0.0
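
The snippet above assumes df already exists and that the rows of each ID are in chronological order, because shift moves values by row position, not by year. A minimal self-contained sketch of the same idea follows. The explicit sort by ID and year is an addition that is not in the original answer, and it still assumes one row per ID per year with no missing years.

```python
import io

import pandas as pd

# Sample data copied from the question.
csv_text = """ID,year,age,area,debt_ratio,gcp
654001,2013,49,East,0.14,0
654001,2014,50,East,0.17,0
654001,2015,51,East,0.23,1
654001,2016,52,East,0.18,0
112089,2013,39,West,0.13,0
112089,2014,40,West,0.15,0
112089,2015,41,West,0.18,1
112089,2016,42,West,0.21,1
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: sort so that "previous row within an ID" means "previous year".
df = df.sort_values(["ID", "year"])

# gcp-i holds the gcp value from i rows earlier within the same ID.
for i in range(1, 4):
    df[f"gcp-{i}"] = df.groupby("ID")["gcp"].shift(i)

# Restore the original row order for display.
df = df.sort_index()
print(df)
```

If some IDs skip years, a positional shift would silently pair a row with the wrong year, so the data would need to be reindexed to a full year range first.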

A more dynamic solution is to take the size of the largest group (the maximum number of rows per ID) and pass it to range:

N = df['ID'].value_counts().max()

for i in range(1, N):
    df[f'gcp-{i}'] = df.groupby('ID')['gcp'].shift(i)
print(df)
       ID  year  age  area  debt_ratio  gcp  gcp-1  gcp-2  gcp-3
0  654001  2013   49  East        0.14    0    NaN    NaN    NaN
1  654001  2014   50  East        0.17    0    0.0    NaN    NaN
2  654001  2015   51  East        0.23    1    0.0    0.0    NaN
3  654001  2016   52  East        0.18    0    1.0    0.0    0.0
4  112089  2013   39  West        0.13    0    NaN    NaN    NaN
5  112089  2014   40  West        0.15    0    0.0    NaN    NaN
6  112089  2015   41  West        0.18    1    0.0    0.0    NaN
7  112089  2016   42  West        0.21    1    1.0    0.0    0.0
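
The lag columns above come out as floats (0.0, 1.0, NaN), whereas the question asked for integers with NA. One possible way to get closer to that, sketched here as a hypothetical helper named add_lags (not part of the original answer), is to cast each shifted column to the pandas nullable "Int64" dtype. The sample frame below is a reduced copy of the question's data.

```python
import pandas as pd

# Reduced copy of the question's data (only the columns the helper needs).
df = pd.DataFrame({
    "ID": [654001] * 4 + [112089] * 4,
    "year": [2013, 2014, 2015, 2016] * 2,
    "gcp": [0, 0, 1, 0, 0, 0, 1, 1],
})


def add_lags(frame, column="gcp", group="ID", time="year"):
    """Hypothetical helper: add column-1 ... column-(N-1) lag columns per group."""
    # Assumption: rows are consecutive periods, so sorting makes shift(i) mean "i periods back".
    out = frame.sort_values([group, time]).copy()
    # N is the number of rows in the largest group.
    n = out.groupby(group).size().max()
    for i in range(1, n):
        lagged = out.groupby(group)[column].shift(i)
        # Nullable integer dtype keeps 0/1 as integers and shows gaps as <NA>.
        out[f"{column}-{i}"] = lagged.astype("Int64")
    # Return rows in their original order.
    return out.sort_index()


result = add_lags(df)
print(result)
```

The cast only works when the lagged values are whole numbers, as they are for the 0/1 gcp flag here.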
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58285992
