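I have a problem: I have a large dataset with more than 1000 columns.
For example: 2019 Material cost, 2019 Labor cost, 2019 Overhead cost, 2020 Material cost, 2020 Labor cost, 2020 Overhead cost, ... and so on through 2035.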
import pandas as pd

df = pd.DataFrame({'2019 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2019 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2019 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2020 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2020 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2020 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2021 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2021 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2021 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
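})
I want to sort all the headers as follows:
2019 Material cost, 2020 Material cost, 2021 Material cost, ..., 2019 Labor cost, 2020 Labor cost, 2021 Labor cost, ..., 2019 Overhead cost, 2020 Overhead cost, 2021 Overhead cost.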
df = pd.DataFrame({'2019 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2020 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2021 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2019 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2020 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2021 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2019 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2020 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2021 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
})
So I want to take one cost category, sort its years in ascending order, and then move on to the next category.
Any help here? Thanks in advance.
Posted on 2022-09-13 13:14:44
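Create two lists, one with the costs and one with the years. From these lists you can build a third list containing all the column names in the order you want.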
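A minimal sketch of the two-list idea, assuming every column is named `<year> <category>` with a single space after the year. The small `df` is a stand-in for the real data, and `parts` and `ordered` are illustrative names rather than anything from the original post. Instead of slicing at a fixed position and hard-coding the year range, it derives both lists from the column names and strips stray whitespace first:

```python
import pandas as pd

# Stand-in for the real frame; one name carries a trailing space, as in the question.
df = pd.DataFrame({
    "2019 Material cost": [25, 12],
    "2019 Overhead cost ": [5, 7],
    "2019 Labor cost": [11, 8],
    "2020 Material cost": [25, 12],
    "2020 Overhead cost ": [5, 7],
    "2020 Labor cost": [11, 8],
})

# Normalise the labels so "Overhead cost " and "Overhead cost" match.
df = df.rename(columns=str.strip)

# Split each label once at the first space: "2019 Material cost" -> ["2019", "Material cost"].
parts = [name.split(" ", 1) for name in df.columns]

# Categories in order of first appearance; years sorted ascending.
costs = list(dict.fromkeys(cost for _, cost in parts))
years = sorted({int(year) for year, _ in parts})

# Category is the outer loop, so all years of one category stay together.
columns = [f"{year} {cost}" for cost in costs for year in years]

ordered = df.reindex(columns=columns)
print(list(ordered.columns))
```

If a year and category combination is missing from the data, `reindex` adds that column filled with NaN, so checking `ordered.isna().all()` afterwards is a cheap way to spot a mismatched name.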
# Category names: everything after the 4-digit year and the space
costs = list(df.columns.str[5:].unique())
years = list(range(2019, 2036))
# Category in the outer loop, year in the inner loop, so each category's years stay together
columns = [str(year) + ' ' + cost for cost in costs for year in years]
df = df.reindex(columns=columns)
For example:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((10, 10)), columns=['1 a', '1 b', '2 a', '2 b', '3 a', '3 b', '4 a', '4 b', '5 a', '5 b'])
costs = ['a', 'b']
years = [1, 2, 3, 4, 5]
columns = [str(year) + ' ' + cost for cost in costs for year in years]
df.reindex(columns=columns).columns
returns
Index(['1 a', '2 a', '3 a', '4 a', '5 a', '1 b', '2 b', '3 b', '4 b', '5 b'], dtype='object')
Posted on 2022-09-13 13:19:48
@Chris, given this input:
df = pd.DataFrame({'2019 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2019 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2019 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2020 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2020 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2020 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2021 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2021 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2021 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
})
I would like this as the output (grouped by category, with the years in ascending order):
df = pd.DataFrame({'2019 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2020 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2021 Material cost': [25, 12, 15, 14, 19, 23, 25, 29],
'2019 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2020 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2021 Overhead cost ': [5, 7, 7, 9, 12, 9, 9, 4],
'2019 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2020 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
'2021 Labor cost': [11, 8, 10, 6, 6, 5, 9, 12],
})
https://stackoverflow.com/questions/73703571