我正在努力解决如何使用多索引列枢轴的数据。首先,我从一个.xlsx文件导入数据,然后尝试生成一个特定的Dataframe。
注意:我不允许嵌入图像,所以这就是链接的原因
import pandas as pd
import numpy as np
# Read Excel file
df = pd.read_excel("myFile.xlsx", header=[0])输出:点击
如果您愿意,您可以在这里看到文件:链接到文件
# Get Month and Year of the Dataframe
month_year = df.iloc[:, 5:-1].columns
month_list = []
year_list = []
for x in range(len(month_year)-1):
if("Unnamed" not in month_year[x]):
month_list.append(month_year[x].split()[0])
year_list.append(month_year[x].split()[1])
# Read Excel file with headers 1 & 2
df = pd.read_excel(path, header=[0,1])输出:点击
# Join both indexes excluding the ones with Unnamed
df.columns = [str(x[0] + " " + x[1]) if("Unnamed" not in x[1]) else str(x[0]) for x in df.columns ]输出:点击
# Adding month and list columns to the DataFrame
df['Month'] = month_list
df['Year'] = year_list输出:点击
我希望输出DataFrame如下所示:欲望输出
发布于 2021-04-05 21:57:37
您应该清理一下它,因为我不知道应该如何处理Total列。
下面的代码将excel文件读入为一个MultiIndex,在堆叠和提取年份和月份列之前,修改一下列名。
df = pd.read_excel("Downloads/myFile.xlsx", header=[0,1], index_col=[0, 1, 2])
df.index.names = ['Project', 'KPI', 'Metric']
df.columns = df.columns.delete(-1).union([('Total', 'Total')])
df.columns.names = ['Month_Year', 'Values']
(df
.stack(level = 0)
.rename_axis(columns=None)
.reset_index()
.assign(Year = lambda df: df.Month_Year.str.split(" ").str[-1],
Month = lambda df: df.Month_Year.str.split(" ").str[0])
.drop(columns='Month_Year')
)
Project KPI Metric Real Target Total Year Month
0 Project 1 Numeric Project 1 Metric 10.0 30.0 NaN 2019 April
1 Project 1 Numeric Project 1 Metric 651.0 51651.0 NaN 2019 February
2 Project 1 Numeric Project 1 Metric 200.0 215.0 NaN 2019 January
3 Project 1 Numeric Project 1 Metric 2.0 5.0 NaN 2019 March
4 Project 1 Numeric Project 1 Metric NaN NaN 9.0 Total Total
5 Project 2 General Project 2 Metric 20.0 10.0 NaN 2019 April
6 Project 2 General Project 2 Metric 500.0 100.0 NaN 2019 February
7 Project 2 General Project 2 Metric 749.0 12.0 NaN 2019 January
8 Project 2 General Project 2 Metric 1.0 7.0 NaN 2019 March
9 Project 2 General Project 2 Metric NaN NaN 7.0 Total Total
10 Project 3 Numeric Project 3 Metric 30.0 20.0 NaN 2019 April
11 Project 3 Numeric Project 3 Metric 200.0 55.0 NaN 2019 February
12 Project 3 Numeric Project 3 Metric 5583.0 36.0 NaN 2019 January
13 Project 3 Numeric Project 3 Metric 3.0 7.0 NaN 2019 March
14 Project 3 Numeric Project 3 Metric NaN NaN 4.0 Total Totalhttps://stackoverflow.com/questions/66959990
复制相似问题