Q: pandas resample command keeps running

Stack Overflow user

Asked 2020-04-08 01:18:48

1 answer, 33 views, 0 followers, 0 votes

My DataFrame looks like this:

trip_day_df.head
Out[18]: 
<bound method NDFrame.head of              
        INSERTED_UTC        VALUE
0 2017-11-03 10:30:31.430    981
1 2017-09-25 22:15:26.757   2787
2 2017-12-17 23:49:24.880   2591
3 2019-02-04 23:07:30.083  45544
4 2019-01-12 11:35:32.657    504>

I want to group the rows by the year of "INSERTED_UTC" and sum VALUE within each year. Expected output:

INSERTED_UTC    VALUE
2017-12-31      6359
2018-12-31      0
2019-12-31      46048

trip_day_df.dtypes
Out[11]: 
INSERTED_UTC    datetime64[ns]
VALUE                   object

trip_day_df.iloc[0,1]
Out[12]: '981'

print(type(trip_day_df.iloc[0,1]))
<class 'str'>

When I run the command below to group by year on INSERTED_UTC and sum the values, it keeps running and never finishes.

df_year = trip_day_df.resample('Y', on='INSERTED_UTC').sum()

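The full data has more than a million rows. When I run the same command on a small 5-row sample, it gives a strange output: the values are concatenated side by side instead of being summed.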

INSERTED_UTC    VALUE
2017-12-31  27879812591
2018-12-31  0
2019-12-31  50445544

python

pandas

1 Answer

Stack Overflow user

Accepted answer

Posted 2020-04-08 01:52:44

I think the problem is that the 'VALUE' column holds strings:

print(type(trip_day_df.iloc[0,1]))
<class 'str'>
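To see why string values produce the concatenated output from the question, here is a minimal sketch. The Series name `values` is made up for illustration, and the three strings are the 2017 rows in date order. Summing an object column of strings applies `+` to the strings, which joins them, while summing after conversion adds the numbers.

```python
import pandas as pd

# The three 2017 values from the question, in date order, stored as strings.
values = pd.Series(["2787", "981", "2591"], dtype=object)

# Summing strings applies "+" pairwise, which concatenates them.
concatenated = values.sum()

# Converting to numbers first gives the arithmetic sum.
numeric_total = pd.to_numeric(values).sum()

print(concatenated)   # 27879812591
print(numeric_total)  # 6359
```

The concatenated result matches the odd 2017 figure in the question, which supports the string-dtype explanation.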

I converted it to a numeric dtype with:

trip_day_df['VALUE'] = pd.to_numeric(trip_day_df['VALUE'])

The dtype has changed:

trip_day_df.dtypes
Out[44]: 
INSERTED_UTC    datetime64[ns]
VALUE                    int64
dtype: object
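On a table with over a million rows, some entries may not parse as numbers at all, in which case `pd.to_numeric` raises by default. The sketch below assumes such dirty data exists (the question does not say so); the `raw` Series and its `"n/a"` entry are invented for illustration. Passing `errors="coerce"` turns unparseable entries into NaN so they can be inspected.

```python
import pandas as pd

# Hypothetical sample: one entry is not a valid number.
raw = pd.Series(["981", "2787", "n/a", "45544"], dtype=object)

# errors="coerce" replaces unparseable entries with NaN instead of raising.
converted = pd.to_numeric(raw, errors="coerce")

# Rows that failed to convert, for inspection before aggregating.
bad_rows = raw[converted.isna()]

print(converted.dtype)      # float64, because NaN forces a float column
print(bad_rows.tolist())    # ['n/a']
print(converted.sum())      # 49312.0 (NaN is skipped by default)
```

Note that coercion yields float64 rather than int64 whenever a NaN is present, so the dtype may differ from the output shown above.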

Now the resample gives the expected sums:

trip_day_df.resample('Y', on='INSERTED_UTC').sum()
Out[47]: 
              VALUE
INSERTED_UTC       
2017-12-31     6359
2018-12-31        0
2019-12-31    46048
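For reference, the whole fix can be reproduced end to end with the five sample rows from the question. This is a sketch rather than the poster's exact session: it passes a `pd.offsets.YearEnd()` object instead of the `'Y'` string, since, to my understanding, recent pandas releases deprecate the `'Y'` alias in favor of `'YE'`, and the offset object avoids depending on either spelling.

```python
import pandas as pd

# The five sample rows from the question, with VALUE stored as strings.
trip_day_df = pd.DataFrame({
    "INSERTED_UTC": pd.to_datetime([
        "2017-11-03 10:30:31.430",
        "2017-09-25 22:15:26.757",
        "2017-12-17 23:49:24.880",
        "2019-02-04 23:07:30.083",
        "2019-01-12 11:35:32.657",
    ]),
    "VALUE": ["981", "2787", "2591", "45544", "504"],
})

# Convert the string column to numbers before aggregating.
trip_day_df["VALUE"] = pd.to_numeric(trip_day_df["VALUE"])

# Year-end bins on the datetime column; the empty 2018 bin sums to 0.
df_year = trip_day_df.resample(pd.offsets.YearEnd(), on="INSERTED_UTC").sum()

print(df_year)
```

The result has one row per year end (2017, 2018, 2019) with VALUE equal to 6359, 0, and 46048.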

0 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/61085695
