Q: pandas: group by and aggregate unique values

Stack Overflow user

Asked 2015-03-09 16:37:04

Answers: 1, Views: 84, Followers: 0, Votes: 1

In pandas v0.12, I have the DataFrame below.

import numpy as np  # needed for np.random.randn below
import pandas as pd

# Eight rows; 'code' is the grouping key and 'texture' holds the values of interest
df = pd.DataFrame({'id': range(1, 9),
                   'code': ['one', 'one', 'two', 'three',
                            'two', 'three', 'one', 'two'],
                   'colour': ['black', 'white', 'white', 'white',
                              'black', 'black', 'white', 'white'],
                   'texture': ['soft', 'soft', 'hard', 'soft', 'hard',
                               'hard', 'hard', 'hard'],
                   'shape': ['round', 'triangular', 'triangular', 'triangular', 'square',
                             'triangular', 'round', 'triangular'],
                   'amount': np.random.randn(8)},
                  columns=['id', 'code', 'colour', 'texture', 'shape', 'amount'])

I can group by code as follows:

c = df.groupby('code')

But how do I then get the unique texture values for each code? I tried this, which raises an error:

question = df.groupby('code').agg({'texture': pd.Series.unique}).reset_index()
#error: Must produce aggregated value

From the df given above, I want the result to be a dictionary, specifically this:

result = {'one':['soft','hard'], 'two':['hard'], 'three':['soft','hard']}

My actual df is quite large, so I need the solution to be efficient and fast.

python

pandas

dictionary

dataframe

unique

1 Answer

Stack Overflow user

Accepted answer

Posted 2015-03-09 16:46:57

One way to get a dictionary of unique values is to apply pd.unique to the groupby object:

>>> df.groupby('code')['texture'].apply(pd.unique).to_dict()
{'one': array(['hard', 'soft'], dtype=object),
 'three': array(['hard', 'soft'], dtype=object),
 'two': array(['hard'], dtype=object)}
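
The dictionary above holds NumPy arrays, whereas the question asks for plain lists. A minimal sketch of one way to convert them, assuming a reduced copy of the question's df with only the code and texture columns; the helper name unique_list is hypothetical and not part of pandas:

```python
import pandas as pd

# Reduced copy of the question's data: only the two columns that matter here.
df = pd.DataFrame({
    'code': ['one', 'one', 'two', 'three', 'two', 'three', 'one', 'two'],
    'texture': ['soft', 'soft', 'hard', 'soft', 'hard', 'hard', 'hard', 'hard'],
})

def unique_list(s):
    """Hypothetical helper: unique values of a Series as a plain Python list."""
    return list(s.unique())

# Apply the helper per group, then turn the resulting Series into a dict.
result = df.groupby('code')['texture'].apply(unique_list).to_dict()
print(result)
```

Series.unique returns values in order of appearance, so each list follows the order in which the textures first occur within its group.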

In newer versions of pandas, unique is a method of groupby objects, so a neater way is:

df.groupby("code")["texture"].unique()
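
The call above returns a Series of arrays indexed by code rather than the dictionary of lists the question asks for. The sketch below shows two possible ways to finish the job, assuming a reasonably recent pandas and a reduced copy of the question's data. The second variant, which drops duplicate (code, texture) pairs before grouping, is an alternative not taken from the answer, and it has not been benchmarked here, so whether it is faster on a large frame would need to be measured:

```python
import pandas as pd

# Reduced copy of the question's data: only the two columns that matter here.
df = pd.DataFrame({
    'code': ['one', 'one', 'two', 'three', 'two', 'three', 'one', 'two'],
    'texture': ['soft', 'soft', 'hard', 'soft', 'hard', 'hard', 'hard', 'hard'],
})

# Variant 1: groupby unique(), then convert each array to a plain list.
uniques = df.groupby('code')['texture'].unique()
by_unique = {code: list(values) for code, values in uniques.items()}

# Variant 2 (an assumption, not from the answer): remove duplicate
# (code, texture) pairs first, then collect what remains into lists.
pairs = df[['code', 'texture']].drop_duplicates()
by_dedupe = pairs.groupby('code')['texture'].agg(list).to_dict()

print(by_unique)
print(by_dedupe)
```

Both variants keep values in order of first appearance within each group.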

Votes: 3

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/28947223
