Computing summary statistics for each group in a DataFrame

Stack Overflow user
Asked on 2018-08-12 10:50:06
1 answer, 49 views, 0 followers, 0 votes

I have a DataFrame with the following columns:

ID      Time          Price
1002    1998-01-02    34
2001    1998-02-03    45
1002    1998-04-05    23
2003    1998-02-03    30
1002    1998-02-03    60

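Based on this DataFrame, I want to create another DataFrame with three columns: "ID", "period-1", and "period-2". Each entry should be the average price for that ID within the corresponding time period.

The new DataFrame should look like the one below. How can I do this in Pandas?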

ID      period-1 (1998-01-01:1998-02-01)    period-2 (1998-02-02:1998-05-02)
1002
2001
2003

Below is the code I ended up with after following the suggestion, but it raises an error:

import pandas as pd

df=pd.DataFrame({"ID": ["1002", "2001", "1002", "2003", "1002"],
                "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
                 "Price": ["34", "45", "23", "30","60"]})


df.Time=pd.to_datetime(df.Time)
period2=pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')
df['Price'].apply(pd.to_numeric)
df['New']='period1'


df.loc[df.Time.apply(lambda x : x in period2),'New']='period2'


df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')


 306             # people may try to aggregate on a non-callable attribute

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in mean(self, *args, **kwargs)
   1304         nv.validate_groupby_func('mean', args, kwargs, ['numeric_only'])
   1305         try:
-> 1306             return self._cython_agg_general('mean', **kwargs)
   1307         except GroupByError:
   1308             raise

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
   3972                             min_count=-1):
   3973         new_items, new_blocks = self._cython_agg_blocks(
-> 3974             how, alt=alt, numeric_only=numeric_only, min_count=min_count)
   3975         return self._wrap_agged_blocks(new_items, new_blocks)
   3976 

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
   4044 
   4045         if len(new_blocks) == 0:
-> 4046             raise DataError('No numeric types to aggregate')
   4047 
   4048         # reset the locs in the blocks to correspond to our

DataError: No numeric types to aggregate
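
The traceback above is consistent with the Price column still holding strings: `df['Price'].apply(pd.to_numeric)` returns a new Series and leaves `df` untouched unless the result is assigned back. A minimal sketch of that behaviour, reusing the sample data from the question (the names `dtype_before` and `mean_by_id` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["1002", "2001", "1002", "2003", "1002"],
    "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
    "Price": ["34", "45", "23", "30", "60"],
})

# The conversion result is discarded here, so the column keeps its string values.
df["Price"].apply(pd.to_numeric)
dtype_before = df["Price"].dtype

# Assigning the converted Series back makes the column numeric.
df["Price"] = pd.to_numeric(df["Price"])

# A mean can now be computed per ID.
mean_by_id = df.groupby("ID")["Price"].mean()
print(dtype_before, df["Price"].dtype)
print(mean_by_id)
```

With the column converted, the `pivot_table` call from the question should have numeric values to aggregate.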

1 Answer

Stack Overflow user

Accepted answer

Posted on 2018-08-12 11:02:49

Use a pd.Interval together with pivot_table:

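# Assumes df.Time is already datetime64 and df.Price is numeric
# df.Time = pd.to_datetime(df.Time)

# Closed interval for period 2; both endpoints are included
period2 = pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')

# Label every row period1 first, then relabel the rows whose Time falls inside period2
df['New'] = 'period1'
df.loc[df.Time.apply(lambda x: x in period2), 'New'] = 'period2'

# Mean Price for each ID and period
df.pivot_table(index='ID', columns='New', values='Price', aggfunc='mean')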
Out[881]: 
New   period1  period2
ID                    
1002     34.0     41.5
2001      NaN     45.0
2003      NaN     30.0
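
As a possible alternative to labelling rows with a single Interval, the periods could be assigned with `pd.cut`. This is only a sketch under a few assumptions: the timestamps carry no time-of-day component, the bin edges below are chosen by hand so that the default right-closed bins match the two periods in the question, and dates outside the edges are not handled.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["1002", "2001", "1002", "2003", "1002"],
    "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
    "Price": ["34", "45", "23", "30", "60"],
})
df["Time"] = pd.to_datetime(df["Time"])
df["Price"] = pd.to_numeric(df["Price"])

# Bins are right-closed by default:
# (1997-12-31, 1998-02-01] -> period1, (1998-02-01, 1998-05-02] -> period2
edges = pd.to_datetime(["1997-12-31", "1998-02-01", "1998-05-02"])
labels = ["period1", "period2"]

# Convert the categorical labels to plain strings so pivot_table behaves as in the answer.
df["New"] = pd.cut(df["Time"], bins=edges, labels=labels).astype(str)

result = df.pivot_table(index="ID", columns="New", values="Price", aggfunc="mean")
print(result)
```

Adding a third period would then only need another edge and another label, rather than another Interval and another `.loc` assignment.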
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/51805292
