I have a DataFrame:
    Outlook  Temperature  PlayTennis  Value
0     Sunny           60         Yes      1
1     Sunny           70         Yes      1
2     Sunny           40          No      1
3  Overcast           40          No      1
4  Overcast           60         Yes      1
5  Overcast           50         Yes      1
6  Overcast           70         Yes      1
7  Overcast           80         Yes      1
8      Rain           65          No      1
9      Rain           70         Yes      1
I want to get this:
Outlook   Yes  No
Sunny       2   1
Overcast    4   1
Rain        1   1
I'm not sure which command to use to count the Yes and No values for each of Sunny/Overcast/Rain.
Posted 2015-04-22 16:26:45
How about this?
df.groupby('Outlook').apply(lambda g: g['PlayTennis'].value_counts())
Or, to match your exact specification:
df.groupby('Outlook').apply(lambda g: g['PlayTennis'].value_counts()).unstack(1)
Or even shorter:
df.groupby('Outlook')['PlayTennis'].value_counts().unstack(1)
Posted 2015-04-22 16:20:57
Here is something to get you started:
forecasts = [
    ["sunny", "yes"],
    ["sunny", "yes"],
    ["sunny", "no"],
    ["overcast", "no"],
    # more forecasts ...
]
# Maps each outlook to a two-element list: [yes_count, no_count].
myForecasts = {}
for forecast in forecasts:
    if forecast[0] not in myForecasts:
        myForecasts[forecast[0]] = [0, 0]
    if forecast[1] == "yes":
        myForecasts[forecast[0]][0] += 1
    else:
        # Anything other than "yes" is counted as a "no".
        myForecasts[forecast[0]][1] += 1
print("Outlook | Yes | No")
for myForecast in myForecasts:
    print("{} | {} | {}".format(myForecast, myForecasts[myForecast][0], myForecasts[myForecast][1]))
I hope this helps. Next time, please show us that you have done your homework.
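As a side note, the same tally can be written more compactly with collections.Counter from the standard library. This is only a sketch under the same assumptions as the loop above: the forecasts list holds the four illustrative rows from this answer (not the asker's DataFrame), and the names counts and outlooks are made up for the example. Unlike the loop, it counts only exact "no" labels in the No column.

```python
from collections import Counter

# Illustrative data only: the four (outlook, answer) pairs from the answer above.
forecasts = [
    ["sunny", "yes"],
    ["sunny", "yes"],
    ["sunny", "no"],
    ["overcast", "no"],
]

# Count every (outlook, answer) pair in a single pass.
counts = Counter((outlook, answer) for outlook, answer in forecasts)

# Unique outlooks in first-seen order (dicts preserve insertion order).
outlooks = list(dict.fromkeys(outlook for outlook, _ in forecasts))

print("Outlook | Yes | No")
for outlook in outlooks:
    # A Counter returns 0 for pairs that never occurred.
    print("{} | {} | {}".format(outlook, counts[(outlook, "yes")], counts[(outlook, "no")]))
```

Keying the Counter on the pair avoids the explicit "first time seen" check, at the cost of a second pass to list the outlooks.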
Posted 2015-04-22 16:27:51
You can use pd.pivot_table to solve this.
In [88]: pd.pivot_table(df, index='Outlook', columns='PlayTennis',
                        values='Value', aggfunc='sum')
Out[88]:
PlayTennis  No  Yes
Outlook
Overcast     1    4
Rain         1    1
Sunny        1    2
Alternatively, you can group by both 'Outlook' and 'PlayTennis', take the size of each group, and use unstack('PlayTennis'):
In [87]: df.groupby(['Outlook', 'PlayTennis']).size().unstack('PlayTennis')
Out[87]:
PlayTennis  No  Yes
Outlook
Overcast     1    4
Rain         1    1
Sunny        1    2
https://stackoverflow.com/questions/29803291
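To try the pandas answers above end to end, here is a minimal self-contained sketch. It rebuilds the question's ten rows by hand, applies the groupby/size/unstack approach, and then reorders rows and columns to match the layout the asker wanted. The variable names counts and alt, the fill_value=0 argument, and the pd.crosstab call are additions for illustration and do not appear in the original answers.

```python
import pandas as pd

# The ten rows from the question, rebuilt by hand.
df = pd.DataFrame({
    "Outlook": ["Sunny"] * 3 + ["Overcast"] * 5 + ["Rain"] * 2,
    "Temperature": [60, 70, 40, 40, 60, 50, 70, 80, 65, 70],
    "PlayTennis": ["Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", "Yes", "No", "Yes"],
    "Value": [1] * 10,
})

# Count rows per (Outlook, PlayTennis) pair, then pivot PlayTennis into columns.
# fill_value=0 would cover a combination that never occurs (none is missing here).
counts = (
    df.groupby(["Outlook", "PlayTennis"])
    .size()
    .unstack("PlayTennis", fill_value=0)
)

# groupby sorts labels alphabetically; reorder to the layout asked for.
counts = counts.loc[["Sunny", "Overcast", "Rain"], ["Yes", "No"]]
print(counts)

# Alternative (an addition, not from the answers): pd.crosstab tabulates
# the two columns directly.
alt = pd.crosstab(df["Outlook"], df["PlayTennis"])
alt = alt.loc[["Sunny", "Overcast", "Rain"], ["Yes", "No"]]
print(alt)
```

Neither version needs the helper Value column, since both count rows rather than summing a field.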