import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df.set_index("employee_id",inplace=True)
print(df)

This gives:

            project_handled
employee_id
1                       pas
1                      asap
2                     trimm
2                       fat

What I want is for the index values not to repeat when printed:
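            project_handled
employee_id
1                       pas
                       asap
2                     trimm
                        fat

I want to serialize this and share it as an Excel file using the DataFrame.to_excel API. The requirement is that the index should not repeat itself in the employee_id column.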

Posted on 2018-04-13 17:33:10

You need to set a MultiIndex.
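
As a minimal sketch of why a MultiIndex helps here: when pandas renders a frame with a MultiIndex, it shows a repeated outer-level label only on the first row of each group. The names records, frame and text below are made up for this illustration, and the Something column is only a placeholder so that the frame has at least one data column.

```python
import pandas as pd

# Same sample data as in the question.
records = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
frame = pd.DataFrame(records)
frame["Something"] = 1  # placeholder column so the frame is not empty

# Both columns become index levels; employee_id is the outer level.
frame = frame.set_index(["employee_id", "project_handled"])

# to_string() renders the frame as text, much like print(frame) does.
text = frame.to_string()
print(text)
```

In the rendered text, employee_id should be blank on the second row of each group. DataFrame.to_excel also has a merge_cells parameter (True by default) that writes MultiIndex rows as merged cells, so the same index would be expected to show each employee_id once in the sheet. That call is left out of the sketch because it needs an Excel writer engine such as openpyxl.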
import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df['Something'] = 1
df.set_index(["employee_id", "project_handled"],inplace=True)
print(df)

I added the Something column because otherwise you would get:

Empty DataFrame
Columns: []
Index: [(1, pas), (1, asap), (2, trimm), (2, fat)]

Edit
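To create it without using project_handled as an index level, you need an empty column and a MultiIndex. Start again from df = pd.DataFrame(li), because employee_id must still be a column when set_index is called:

df["another"] = ""
df.set_index(["employee_id", "another"], inplace=True)

Posted on 2018-04-13 19:53:52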
If your only goal is to write your DataFrame to CSV in the desired layout, and you do not need a single cell per employee_id value, you can do this:
import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
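
def custom_func(s):
    # Keep the first employee_id in each group and blank out the repeats.
    s = s.copy()
    s.iloc[1:] = ''
    return s

df['employee_id'] = df['employee_id'].apply(str)
# transform calls custom_func on each group's employee_id values and
# returns the results aligned with the original rows.
df['employee_id'] = df.groupby('employee_id')['employee_id'].transform(custom_func)
df = df.set_index('employee_id')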
print(df)

Output:

            project_handled
employee_id
1                       pas
                       asap
2                     trimm
                        fat

The result of df.to_csv('test.csv') looks like this:
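
As a rough sketch of what that file would contain, the CSV text can be produced directly. This is an illustration rather than the answer's own code: it blanks the repeats with a boolean mask from Series.duplicated instead of a groupby function, it assumes that repeated ids should be blanked wherever they occur, and the names records, frame and csv_text are made up for the example.

```python
import pandas as pd

# Same sample data as in the question.
records = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
frame = pd.DataFrame(records)

# Work with strings so a repeated id can be replaced by an empty string.
frame["employee_id"] = frame["employee_id"].astype(str)

# duplicated() is False for the first occurrence of each id, True for repeats.
frame.loc[frame["employee_id"].duplicated(), "employee_id"] = ""

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = frame.set_index("employee_id").to_csv()
print(csv_text)
```

Each employee_id then appears only on the first row of its group. The cells below it are empty rather than merged, which matches the caveat stated at the start of this answer.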

https://stackoverflow.com/questions/49813645