Q: groupby on a pandas DataFrame column that contains bytearray objects

Stack Overflow user

Asked 2021-12-23 18:41:33

Answers: 1, Views: 69, Followers: 0, Votes: 0

I have a pandas DataFrame, and I want to group it by customer ID:

 df['rank_col'] = df.groupby('PSEUDO_CUSTOMER_ID')['DB_CREATED_DT'].rank(method='first')

The problem is the PSEUDO_CUSTOMER_ID column. When I inspect one of its values, it looks like this:

 [138, 76, 16, 9, 86, 71, 5, 85, 117, 237, 97, 212, 13, 157, 185, 150, 207, 97, 85, 165]

[Screenshot of the PSEUDO_CUSTOMER_ID values not reproduced]

Note: I want to group by PSEUDO_CUSTOMER_ID and rank the rows within each group by the DB_CREATED_DT column.

python

arrays

pandas

pandas-groupby

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-12-23 20:31:24

Convert each bytearray to bytes with the bytes function. A bytes object is hashable, so it is an acceptable type for a grouping key:

Demo:

df['PSEUDO_CUSTOMER_ID_BYTES'] = df['PSEUDO_CUSTOMER_ID'].apply(bytes)
print(df)

# Output:
                                  PSEUDO_CUSTOMER_ID                           PSEUDO_CUSTOMER_ID_BYTES
0  [138, 76, 16, 9, 86, 71, 5, 85, 117, 237, 97, ...  b'\x8aL\x10\tVG\x05Uu\xeda\xd4\r\x9d\xb9\x96\x...

Grouping by PSEUDO_CUSTOMER_ID (fails):

>>> list(df.groupby('PSEUDO_CUSTOMER_ID'))
...
TypeError: unhashable type: 'bytearray'
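
The error comes from Python itself rather than from pandas: groupby needs hashable keys, and bytearray is mutable, so it defines no hash. A minimal sketch of that difference, using a shortened, made-up ID rather than the real data:

```python
# bytearray is mutable, so Python does not define a hash for it.
raw = bytearray([138, 76, 16, 9])  # shortened, illustrative ID

try:
    hash(raw)
    bytearray_hashable = True
except TypeError:
    bytearray_hashable = False

# bytes is the immutable counterpart and hashes by content, so two equal
# bytes objects land in the same group.
frozen = bytes(raw)
same_hash = hash(frozen) == hash(bytes([138, 76, 16, 9]))

print(bytearray_hashable)  # False
print(same_hash)           # True
```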

Grouping by PSEUDO_CUSTOMER_ID_BYTES (works):

>>> list(df.groupby('PSEUDO_CUSTOMER_ID_BYTES'))

[(b'\x8aL\x10\tVG\x05Uu\xeda\xd4\r\x9d\xb9\x96\xcfaU\xa5',
                                    PSEUDO_CUSTOMER_ID                           PSEUDO_CUSTOMER_ID_BYTES
  0  [138, 76, 16, 9, 86, 71, 5, 85, 117, 237, 97, ...  b'\x8aL\x10\tVG\x05Uu\xeda\xd4\r\x9d\xb9\x96\x...)]
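
To connect this back to the question, the ranking expression can be pointed at the converted column instead of the original one. The following is a sketch under assumptions: the sample rows, the shortened IDs, and the dates are made up for illustration, and the bytearray column is simulated by mapping bytes literals through bytearray.

```python
import pandas as pd

# Hypothetical sample data: two customers, IDs stored as bytearray objects.
ids = pd.Series([b"\x8aL\x10", b"\x01\x02\x03", b"\x8aL\x10"]).map(bytearray)
df = pd.DataFrame({
    "PSEUDO_CUSTOMER_ID": ids,
    "DB_CREATED_DT": pd.to_datetime(["2021-03-01", "2021-01-15", "2021-01-01"]),
})

# Hashable copy of the key column, as in the answer.
df["PSEUDO_CUSTOMER_ID_BYTES"] = df["PSEUDO_CUSTOMER_ID"].apply(bytes)

# Same ranking as in the question, grouped on the bytes column.
df["rank_col"] = (
    df.groupby("PSEUDO_CUSTOMER_ID_BYTES")["DB_CREATED_DT"].rank(method="first")
)

ranks = df["rank_col"].tolist()
print(ranks)  # [2.0, 1.0, 1.0]
```

Rows 0 and 2 share an ID, so the earlier date in row 2 gets rank 1 and row 0 gets rank 2; row 1 is alone in its group.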

Important

If you are sure of the original encoding, you can use str.decode to get a str instead of a bytes string. Here it appears to be latin-1:

df['PSEUDO_CUSTOMER_ID_STR'] = df['PSEUDO_CUSTOMER_ID'].str.decode('latin1')
print(df.loc[0])

# Output:
PSEUDO_CUSTOMER_ID          [138, 76, 16, 9, 86, 71, 5, 85, 117, 237, 97, ...
PSEUDO_CUSTOMER_ID_BYTES    b'\x8aL\x10\tVG\x05Uu\xeda\xd4\r\x9d\xb9\x96\x...
PSEUDO_CUSTOMER_ID_STR                                 L\tVGUuíaÔ\rÏaU¥
Name: 0, dtype: object

Demo:

>>> list(df.groupby('PSEUDO_CUSTOMER_ID_STR'))

[('\x8aL\x10\tVG\x05UuíaÔ\r\x9d¹\x96ÏaU¥',
                                    PSEUDO_CUSTOMER_ID                           PSEUDO_CUSTOMER_ID_BYTES  PSEUDO_CUSTOMER_ID_STR
  0  [138, 76, 16, 9, 86, 71, 5, 85, 117, 237, 97, ...  b'\x8aL\x10\tVG\x05Uu\xeda\xd4\r\x9d\xb9\x96\x...  L\tVGUuíaÔ\rÏaU¥)]
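
If the encoding is not known, one alternative that is not part of the original answer is to use the hexadecimal form of each ID as the key. bytearray.hex() returns a printable str without assuming any text encoding. A small sketch, where the sample IDs and the amount column are hypothetical:

```python
import pandas as pd

# Hypothetical sample data; "amount" is an illustrative column only.
ids = pd.Series([b"\x8aL\x10", b"\x01\x02\x03", b"\x8aL\x10"]).map(bytearray)
df = pd.DataFrame({"PSEUDO_CUSTOMER_ID": ids, "amount": [10, 20, 30]})

# bytearray.hex() yields two hex digits per byte, e.g. '8a4c10'.
df["PSEUDO_CUSTOMER_ID_HEX"] = df["PSEUDO_CUSTOMER_ID"].apply(lambda b: b.hex())

totals = df.groupby("PSEUDO_CUSTOMER_ID_HEX")["amount"].sum().to_dict()
print(totals)  # {'010203': 20, '8a4c10': 40}
```

The hex key is twice as long as the raw bytes, but it prints cleanly and can be turned back into bytes with bytes.fromhex.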

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/70466211
