I have a dataframe with the following columns:
ID Time Price
1002 1998-01-02 34
2001 1998-02-03 45
1002 1998-04-05 23
2003 1998-02-03 30
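1002 1998-02-03 60

Based on this dataframe, I want to create another dataframe with three columns: "ID", "period-1", and "period-2". Each entry is the average Price for that ID within the corresponding period.
The new dataframe should look like the following. How can I do this in pandas?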
ID period-1 (1998-01-01:1998-02-01) period-2 (1998-02-02:1998-05-02)
1002
2001
2003

Below is the code I wrote following the suggestion, but it raises an error:
import pandas as pd
df=pd.DataFrame({"ID": ["1002", "2001", "1002", "2003", "1002"],
"Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
"Price": ["34", "45", "23", "30","60"]})
df.Time=pd.to_datetime(df.Time)
period2=pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')
df['Price'].apply(pd.to_numeric)
df['New']='period1'
df.loc[df.Time.apply(lambda x : x in period2),'New']='period2'
df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')
306 # people may try to aggregate on a non-callable attribute
~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in mean(self, *args, **kwargs)
1304 nv.validate_groupby_func('mean', args, kwargs, ['numeric_only'])
1305 try:
-> 1306 return self._cython_agg_general('mean', **kwargs)
1307 except GroupByError:
1308 raise
~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
3972 min_count=-1):
3973 new_items, new_blocks = self._cython_agg_blocks(
-> 3974 how, alt=alt, numeric_only=numeric_only, min_count=min_count)
3975 return self._wrap_agged_blocks(new_items, new_blocks)
3976
~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
4044
4045 if len(new_blocks) == 0:
-> 4046 raise DataError('No numeric types to aggregate')
4047
4048 # reset the locs in the blocks to correspond to our
DataError: No numeric types to aggregate

Posted on 2018-08-12 11:02:49
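The DataError in the question most likely comes from the Price column still holding strings when pivot_table runs. `df['Price'].apply(pd.to_numeric)` returns a new Series and does not change df in place, so the result has to be assigned back. A minimal sketch of that fix, reusing the question's data (the names `dtype_before_is_numeric` and `dtype_after_is_numeric` are only there for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["1002", "2001", "1002", "2003", "1002"],
    "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
    "Price": ["34", "45", "23", "30", "60"],
})

# As in the question: the converted Series is computed and then discarded,
# so the column itself is left unchanged (still strings).
df["Price"].apply(pd.to_numeric)
dtype_before_is_numeric = pd.api.types.is_numeric_dtype(df["Price"])

# Assign the converted values back so later aggregations see numbers.
df["Price"] = pd.to_numeric(df["Price"])
dtype_after_is_numeric = pd.api.types.is_numeric_dtype(df["Price"])

print(dtype_before_is_numeric, dtype_after_is_numeric)
```

With Price stored as a numeric column, the `aggfunc='mean'` call in pivot_table has something to aggregate.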
Use Interval together with pivot_table:
#df.Time=pd.to_datetime(df.Time)
period2=pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')
df['New']='period1'
df.loc[df.Time.apply(lambda x : x in period2),'New']='period2'
df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')
Out[881]:
New period1 period2
ID
1002 34.0 41.5
2001 NaN 45.0
2003 NaN 30.0

https://stackoverflow.com/questions/51805292
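For comparison, the same labelling can be sketched without a row-wise `apply`, using `Series.between` (inclusive on both ends by default) to build a boolean mask. This is a variant under the same assumption as the answer above, namely that every row outside period 2 is treated as period 1; the names `in_period2` and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["1002", "2001", "1002", "2003", "1002"],
    "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
    "Price": ["34", "45", "23", "30", "60"],
})
df["Time"] = pd.to_datetime(df["Time"])
df["Price"] = pd.to_numeric(df["Price"])  # the mean needs a numeric column

# Boolean mask: True where Time falls within period 2 (both bounds included).
in_period2 = df["Time"].between(pd.Timestamp("1998-02-02"), pd.Timestamp("1998-05-02"))

# Everything not in period 2 is labelled period1, as in the answer above.
df["New"] = "period1"
df.loc[in_period2, "New"] = "period2"

# One row per ID, one column per period, cells hold the mean Price.
result = df.pivot_table(index="ID", columns="New", values="Price", aggfunc="mean")
print(result)
```

If some dates could fall outside both periods, a second mask for period 1 would be needed so that those rows are not counted there.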