我正试图在熊猫身上取得一些成果,这在Excel PivotTable中是非常简单的:



据我所见,下面的代码似乎是逻辑的,但它不起作用。最后,我想知道实现这样一个简单的聚合会有多复杂。有什么建议吗?
pt = pd.pivot_table(data=df,
aggfunc = 'count',
index = ["root_name", "rca"],
values = ["rca"],
margins = True).sort_values(['rca'],
ascending=[False])编辑:示例输入数据和输出
try: from io import StringIO # Python 3
except: from StringIO import StringIO # Python 2
import pandas as pd
TESTDATA = u"""root_name;rca
Mobile Voice;mib manual manipulation
Mobile Voice;mib manual manipulation
Internet;dq
Mobile Voice;defect
Internet;mnp
Mobile Voice;mnp
Mobile Voice;defect
Mobile Voice;ceased in mib before dqt run
Mobile Voice;mnp
Mobile Voice;ceased in mib before dqt run
Internet;dq
Mobile Voice;mnp
Mobile Voice;dq
Mobile Voice;no dq
Mobile Voice;no dq
Mobile Voice;asset ceased while order was pending
Internet;dq
Mobile Voice;no dq
Internet;mnp
Mobile Voice;mnp
Mobile Voice;salto replication delay
Mobile Voice;provide order created dq
Internet;mnp
Mobile Voice;mib manual manipulation
Mobile Voice;mnp
Mobile Voice;mnp
Mobile Voice;ceased in mib before dqt run
Mobile Voice;mnp
Mobile Voice;mib manual manipulation
"""
df = pd.read_csv(StringIO(TESTDATA), sep=';', usecols= ['root_name', 'rca'], engine='python')
pt = pd.pivot_table(data=df,
aggfunc = 'count',
index = ["root_name", "rca"],
values = ["rca"],
margins = True)
print (pt.sort_values(['rca'],
ascending=[False]))结果:空的DataFrame列:[]索引:(移动语音,提供创建的命令dq),(移动语音,无dq),(因特网,mnp),(移动语音,移动语音,mnp),(移动语音,mib手动操作),(因特网,dq),(移动语音,dq),(移动语音,缺陷),(移动语音,在dqt运行前已在mib中停止),(移动语音,在等待订单期间停止资产),(所有,)
发布于 2022-05-13 15:28:21
尝试将“count”字段添加到数据帧中,然后使用count()方法进行分组:
df['count'] = 1
df.groupby(by=['root_name', 'rca']).count().sort_index('rca')输出:
count
root_name rca
Internet dq 3
mnp 3
Mobile Voice asset ceased while order was pending 1
ceased in mib before dqt run 3
defect 2
dq 1
mib manual manipulation 4
mnp 7
no dq 3
provide order created dq 1
salto replication delay 1通过添加"root_name“sum字段来复制excel结果的小调整:
grouped_sum = df.groupby(by='root_name').sum().reset_index(level=[0])
grouped = df.merge(grouped_sum, how='left', on='root_name')
grouped.rename(columns={'count_x': 'count', 'count_y': 'sum'}, inplace=True)
grouped
root_name rca count sum
0 Mobile Voice mib manual manipulation 1 23
1 Mobile Voice mib manual manipulation 1 23
2 Internet dq 1 6
pd.pivot_table(
data=grouped,
aggfunc=['count'],
index=[ "root_name", "sum", "rca"],
values=["count"],
margins=True).sort_values(["sum", 'root_name', 'rca'],
ascending=[False, True, True]
)输出:
count
count
root_name sum rca
All 29
Mobile Voice 23 asset ceased while order was pending 1
ceased in mib before dqt run 3
defect 2
dq 1
mib manual manipulation 4
mnp 7
no dq 3
provide order created dq 1
salto replication delay 1
Internet 6 dq 3
mnp 3https://stackoverflow.com/questions/72231484
复制相似问题