Q: Computing summary statistics for each group in a DataFrame

Stack Overflow user

Asked 2018-08-12 10:50:06

Answers: 1, Views: 49, Followers: 0, Votes: 0

I have a DataFrame with the following columns:

ID      Time          Price
1002    1998-01-02    34
2001    1998-02-03    45
1002    1998-04-05    23
2003    1998-02-03    30
1002    1998-02-03    60

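Based on this DataFrame, I want to create another DataFrame with three columns: "ID", "period-1", and "period-2". Each entry should be the mean Price for that ID within the corresponding period.

The new DataFrame should look like the following. How can I do this in pandas?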

ID      period-1 (1998-01-01 to 1998-02-01)    period-2 (1998-02-02 to 1998-05-02)
1002
2001
2003

Below is the code I wrote following the suggestion, but it raises an error:

import pandas as pd

df=pd.DataFrame({"ID": ["1002", "2001", "1002", "2003", "1002"],
                "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
                 "Price": ["34", "45", "23", "30","60"]})


df.Time=pd.to_datetime(df.Time)
period2=pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')
df['Price'].apply(pd.to_numeric)
df['New']='period1'


df.loc[df.Time.apply(lambda x : x in period2),'New']='period2'


df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')


 306             # people may try to aggregate on a non-callable attribute

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in mean(self, *args, **kwargs)
   1304         nv.validate_groupby_func('mean', args, kwargs, ['numeric_only'])
   1305         try:
-> 1306             return self._cython_agg_general('mean', **kwargs)
   1307         except GroupByError:
   1308             raise

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
   3972                             min_count=-1):
   3973         new_items, new_blocks = self._cython_agg_blocks(
-> 3974             how, alt=alt, numeric_only=numeric_only, min_count=min_count)
   3975         return self._wrap_agged_blocks(new_items, new_blocks)
   3976 

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
   4044 
   4045         if len(new_blocks) == 0:
-> 4046             raise DataError('No numeric types to aggregate')
   4047 
   4048         # reset the locs in the blocks to correspond to our

DataError: No numeric types to aggregate

numpy

python-3.x

pandas

1 Answer

Stack Overflow user

Accepted answer

Answered 2018-08-12 11:02:49

You can do this by using pd.Interval together with pivot_table:

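# df.Time = pd.to_datetime(df.Time)  # run this first if Time still holds strings

# Closed interval covering the second period
period2 = pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')

# Label every row period1 by default, then relabel rows whose Time falls inside period2
df['New'] = 'period1'
df.loc[df.Time.apply(lambda x: x in period2), 'New'] = 'period2'

# Mean Price per ID (rows) and period label (columns)
df.pivot_table(index='ID', columns='New', values='Price', aggfunc='mean')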
Out[881]: 
New   period1  period2
ID                    
1002     34.0     41.5
2001      NaN     45.0
2003      NaN     30.0
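
The DataError in the question most likely comes from the Price column. It is built from strings, and the result of df['Price'].apply(pd.to_numeric) is never assigned back, so pivot_table finds no numeric column to average. Here is a minimal end-to-end sketch that reuses the question's sample data and assumes that assigning the pd.to_numeric result back to the column is the intended fix (the variable name result is only illustrative):

```python
import pandas as pd

# Sample data from the question; every column starts out as strings
df = pd.DataFrame({
    "ID": ["1002", "2001", "1002", "2003", "1002"],
    "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
    "Price": ["34", "45", "23", "30", "60"],
})

df["Time"] = pd.to_datetime(df["Time"])
# pd.to_numeric returns a new Series, so the result must be assigned back
df["Price"] = pd.to_numeric(df["Price"])

# Closed interval covering the second period
period2 = pd.Interval(pd.Timestamp("1998-02-02"), pd.Timestamp("1998-05-02"), closed="both")

# Label every row period1 by default, then relabel rows that fall inside period2
df["New"] = "period1"
df.loc[df["Time"].apply(lambda t: t in period2), "New"] = "period2"

# Mean Price per ID (rows) and period label (columns)
result = df.pivot_table(index="ID", columns="New", values="Price", aggfunc="mean")
print(result)
```

With the sample data this should reproduce the table shown above. Note that any row outside period2 is labeled period1, including dates that fall before 1998-01-01 or after 1998-05-02, so a real dataset may need an explicit check for the first period as well.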

Votes: 2

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/51805292
