I am trying to display NS wherever the number of data points per industry per quarter is less than 4.
This is my data source:
print(df)
IndustryGroup Year_ Quarter TotalRaisedUSD
Healthcare 2014_ 1 79.30000
Consumer Services 2014_ 1 111.25000
Information Technology 2014_ 1 232.00000
Healthcare 2014_ 1 113.90000
Healthcare 2014_ 1 121.11000
Healthcare 2014_ 1 108.76000
Healthcare 2014_ 1 58.64000
Healthcare 2014_ 1 120.07000
Healthcare 2014_ 1 81.76000
Healthcare 2014_ 1 28.40000
Healthcare 2014_ 1 76.63000
Healthcare 2014_ 1 74.96217
Healthcare 2014_ 1 57.86000
Healthcare 2014_ 1 220.23000
... ... ... ...
Healthcare 2014_ 4 109.70000
Consumer Services 2014_ 4 358.50000
Healthcare 2014_ 4 115.00000
Information Technology 2014_ 4 168.05000
Business and Financial Services 2014_ 4 100.66000
Healthcare 2014_ 4 72.09000
Healthcare 2014_ 4 129.67000
Healthcare 2014_ 4 151.00000
Healthcare 2014_ 4 123.28000
Healthcare 2014_ 4 153.85000
Business and Financial Services 2014_ 4 47.41000
Healthcare 2014_ 4 35.34000
Healthcare 2014_ 4 206.50000
Healthcare 2014_ 4 31.00000
Healthcare 2014_ 4 68.09000
Healthcare 2014_ 4 122.02000
Business and Financial Services 2014_ 4 193.22000
Information Technology 2014_ 4 254.34000
Information Technology 2014_ 4 196.50000
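[108 rows x 4 columns]
df1 = pd.pivot_table(df, values='TotalRaisedUSD', index='IndustryGroup', columns=['Year_', 'Quarter'], aggfunc=np.median)
I need the same df1 output (shown below), but with "NS" displayed wherever the count of TotalRaisedUSD is less than 4 (for example, Information Technology 2014_ 4 should show NS).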
Year_ 2014_
Quarter 1 2 3 4
IndustryGroup
Business and Financial Services 49.73 71.275 38.00 100.66
Consumer Services 111.25 165.600 NaN 358.50
Healthcare 87.10 82.335 84.53 118.51
Industrial Goods and Materials NaN 144.490 NaN NaN
Information Technology 82.13 68.000 55.93 196.50
Any ideas?
Thanks!!
Posted on 2016-05-19 13:55:19
You can use groupby and transform to do this:
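# Group by industry, year and quarter, keeping only the column being counted
grouped = df.groupby(['IndustryGroup', 'Year_', 'Quarter'])['TotalRaisedUSD']
# Flag every row of a group that has fewer than 4 data points
logic = lambda x: 'NS' if x.count() < 4 else ''
transformed = grouped.transform(logic)
I think something like this should work.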
You can read about it here: http://pandas.pydata.org/pandas-docs/stable/groupby.html
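The groupby/transform idea above flags individual rows, so it still has to be carried over to the pivoted layout. One possible way to do that, offered only as a sketch, is to build a second pivot table of counts and use it to mask the medians. The helper name `pivot_with_ns` and its `min_count` and `marker` parameters are made up for illustration; the column names are the ones from the question. Cells with no data at all are left as NaN here, which is an assumption about the desired output.

```python
import pandas as pd


def pivot_with_ns(df, min_count=4, marker="NS"):
    """Median pivot table with `marker` in cells backed by too few rows.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    layout = dict(
        values="TotalRaisedUSD",
        index="IndustryGroup",
        columns=["Year_", "Quarter"],
    )
    medians = pd.pivot_table(df, aggfunc="median", **layout)
    counts = pd.pivot_table(df, aggfunc="count", **layout)

    # Make sure both tables share exactly the same labels before comparing.
    counts = counts.reindex(index=medians.index, columns=medians.columns)

    # NaN counts (no rows at all) compare as False, so those cells stay NaN.
    too_few = counts < min_count

    # Cast to object first so strings and floats can share a column.
    return medians.astype(object).mask(too_few, marker)


# Usage with the question's frame: df1 = pivot_with_ns(df)
```

The result is an object-dtype frame, so it is suited to display or export rather than further arithmetic.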
https://stackoverflow.com/questions/37320945