我有一份不同商品的销售档案。我有一列模型预测,名为outputs,哪个模型产生了这些预测,名为model。我需要将名为“FCAST_QTYand make them part of the outputs column with the string "FCAST_QTY" being put into the model column. So essentially, melting that column (FCAST_QTY”的列中当前模型的预测(即生产中的预测)转换为列output和model,以便当前的生产模型位于相同的列和正在开发的多个模型中。这将使比较/对比更加容易。我不知道该怎么做。下面的示例数据。
import pandas as pd
from pandas import Timestamp
sales_dict = {'MO_YR': {0: Timestamp('2021-01-01 00:00:00'),
1: Timestamp('2021-02-01 00:00:00'),
2: Timestamp('2021-03-01 00:00:00'),
3: Timestamp('2021-04-01 00:00:00'),
4: Timestamp('2021-05-01 00:00:00'),
5: Timestamp('2021-06-01 00:00:00'),
6: Timestamp('2021-07-01 00:00:00'),
7: Timestamp('2021-08-01 00:00:00'),
8: Timestamp('2021-09-01 00:00:00'),
9: Timestamp('2021-10-01 00:00:00'),
10: Timestamp('2021-11-01 00:00:00'),
11: Timestamp('2021-12-01 00:00:00'),
12: Timestamp('2021-01-01 00:00:00'),
13: Timestamp('2021-02-01 00:00:00'),
14: Timestamp('2021-03-01 00:00:00')},
'ITEM_BASE': {0: '289461K',
1: '289461K',
2: '289461K',
3: '289461K',
4: '289461K',
5: '289461K',
6: '289461K',
7: '289461K',
8: '289461K',
9: '289461K',
10: '289461K',
11: '289461K',
12: '400520J',
13: '400520J',
14: '400520J'},
'eaches': {0: 2592,
1: 3844,
2: 759,
3: 825,
4: 663,
5: 2025,
6: 471,
7: 1160,
8: 5987,
9: 679,
10: 469,
11: 907,
12: 64,
13: 48,
14: 63},
'FCAST_QTY': {0: 2800.0,
1: 5200.0,
2: 550.0,
3: 475.0,
4: 575.0,
5: 475.0,
6: 650.0,
7: 550.0,
8: 7900.0,
9: 1187.0,
10: 1187.0,
11: 1900.0,
12: 51.0,
13: 55.0,
14: 59.0},
'log_eaches': {0: 7.860185057472165,
1: 8.254268770090183,
2: 6.63200177739563,
3: 6.715383386334682,
4: 6.496774990185863,
5: 7.61332497954064,
6: 6.154858094016418,
7: 7.05617528410041,
8: 8.697345730925353,
9: 6.520621127558696,
10: 6.15060276844628,
11: 6.810142450115136,
12: 4.158883083359672,
13: 3.871201010907891,
14: 4.143134726391533},
'output': {0: 8.646015798513993,
1: 8.378197900630752,
2: 7.045235414873291,
3: 5.117058321275769,
4: 9.082928370640056,
5: 5.225648643174155,
6: 7.446383013291042,
7: 6.307484284901181,
8: 7.752673979530179,
9: 9.02189934080111,
10: 4.677594714421006,
11: 7.218749101888444,
12: 4.04018241973268,
13: 3.940978322900716,
14: 3.962359464699719},
'model': {0: 'LR_output',
1: 'LR_output',
2: 'LR_output',
3: 'LR_output',
4: 'LR_output',
5: 'LR_output',
6: 'LR_output',
7: 'LR_output',
8: 'LR_output',
9: 'LR_output',
10: 'LR_output',
11: 'LR_output',
12: 'AR(12)',
13: 'AR(12)',
14: 'AR(12)'}}
df = pd.DataFrame.from_dict(sales_dict)预期输出增加:
expected_dict = {'MO_YR': {0: Timestamp('2021-01-01 00:00:00'),
1: Timestamp('2021-02-01 00:00:00'),
2: Timestamp('2021-03-01 00:00:00'),
3: Timestamp('2021-04-01 00:00:00'),
4: Timestamp('2021-05-01 00:00:00'),
5: Timestamp('2021-06-01 00:00:00'),
6: Timestamp('2021-07-01 00:00:00'),
7: Timestamp('2021-08-01 00:00:00'),
8: Timestamp('2021-09-01 00:00:00'),
9: Timestamp('2021-10-01 00:00:00'),
10: Timestamp('2021-11-01 00:00:00'),
11: Timestamp('2021-12-01 00:00:00'),
12: Timestamp('2021-01-01 00:00:00'),
13: Timestamp('2021-02-01 00:00:00'),
14: Timestamp('2021-03-01 00:00:00'),
15: Timestamp('2021-01-01 00:00:00'),
16: Timestamp('2021-02-01 00:00:00'),
17: Timestamp('2021-03-01 00:00:00'),
18: Timestamp('2021-04-01 00:00:00'),
19: Timestamp('2021-05-01 00:00:00'),
20: Timestamp('2021-06-01 00:00:00'),
21: Timestamp('2021-07-01 00:00:00'),
22: Timestamp('2021-08-01 00:00:00'),
23: Timestamp('2021-09-01 00:00:00'),
24: Timestamp('2021-10-01 00:00:00'),
25: Timestamp('2021-11-01 00:00:00'),
26: Timestamp('2021-12-01 00:00:00'),
27: Timestamp('2021-01-01 00:00:00'),
28: Timestamp('2021-02-01 00:00:00'),
29: Timestamp('2021-03-01 00:00:00')},
'ITEM_BASE': {0: '289461K',
1: '289461K',
2: '289461K',
3: '289461K',
4: '289461K',
5: '289461K',
6: '289461K',
7: '289461K',
8: '289461K',
9: '289461K',
10: '289461K',
11: '289461K',
12: '400520J',
13: '400520J',
14: '400520J',
15: '289461K',
16: '289461K',
17: '289461K',
18: '289461K',
19: '289461K',
20: '289461K',
21: '289461K',
22: '289461K',
23: '289461K',
24: '289461K',
25: '289461K',
26: '289461K',
27: '400520J',
28: '400520J',
29: '400520J'},
'eaches': {0: 2592,
1: 3844,
2: 759,
3: 825,
4: 663,
5: 2025,
6: 471,
7: 1160,
8: 5987,
9: 679,
10: 469,
11: 907,
12: 64,
13: 48,
14: 63,
15: 2592,
16: 3844,
17: 759,
18: 825,
19: 663,
20: 2025,
21: 471,
22: 1160,
23: 5987,
24: 679,
25: 469,
26: 907,
27: 64,
28: 48,
29: 63},
'log_eaches': {0: 7.860185057472165,
1: 8.254268770090183,
2: 6.63200177739563,
3: 6.715383386334682,
4: 6.496774990185863,
5: 7.61332497954064,
6: 6.154858094016418,
7: 7.05617528410041,
8: 8.697345730925353,
9: 6.520621127558696,
10: 6.15060276844628,
11: 6.810142450115136,
12: 4.158883083359672,
13: 3.871201010907891,
14: 4.143134726391533,
15: 7.860185057472165,
16: 8.254268770090183,
17: 6.63200177739563,
18: 6.715383386334682,
19: 6.496774990185863,
20: 7.61332497954064,
21: 6.154858094016418,
22: 7.05617528410041,
23: 8.697345730925353,
24: 6.520621127558696,
25: 6.15060276844628,
26: 6.810142450115136,
27: 4.158883083359672,
28: 3.871201010907891,
29: 4.143134726391533,},
'output': {0: 8.646015798513993,
1: 8.378197900630752,
2: 7.045235414873291,
3: 5.117058321275769,
4: 9.082928370640056,
5: 5.225648643174155,
6: 7.446383013291042,
7: 6.307484284901181,
8: 7.752673979530179,
9: 9.02189934080111,
10: 4.677594714421006,
11: 7.218749101888444,
12: 4.04018241973268,
13: 3.940978322900716,
14: 3.962359464699719,
15: 2800.0,
16: 5200.0,
17: 550.0,
18: 475.0,
19: 575.0,
20: 475.0,
21: 650.0,
22: 550.0,
23: 7900.0,
24: 1187.0,
25: 1187.0,
26: 1900.0,
27: 51.0,
28: 55.0,
29: 59.0},
'model': {0: 'LR_output',
1: 'LR_output',
2: 'LR_output',
3: 'LR_output',
4: 'LR_output',
5: 'LR_output',
6: 'LR_output',
7: 'LR_output',
8: 'LR_output',
9: 'LR_output',
10: 'LR_output',
11: 'LR_output',
12: 'AR(12)',
13: 'AR(12)',
14: 'AR(12)',
15:'FCAST_QTY',
16:'FCAST_QTY',
17:'FCAST_QTY',
18:'FCAST_QTY',
19:'FCAST_QTY',
20:'FCAST_QTY',
21:'FCAST_QTY',
22:'FCAST_QTY',
23:'FCAST_QTY',
24:'FCAST_QTY',
25:'FCAST_QTY',
26:'FCAST_QTY',
27:'FCAST_QTY',
28:'FCAST_QTY',
29:'FCAST_QTY'}}
df = pd.DataFrame.from_dict(expected_dict)发布于 2022-09-20 21:52:06
使用您的逻辑创建一个新的dataframe,并将其附加到原始的dataframe中:
fcast_qty = (df
.drop(columns = ['output', 'model'])
.rename(columns={"FCAST_QTY":"output"})
.assign(model="FCAST_QTY")
)
pd.concat([df.drop(columns='FCAST_QTY'), fcast_qty], ignore_index = True)
MO_YR ITEM_BASE eaches log_eaches output model
0 2021-01-01 289461K 2592 7.860185 8.646016 LR_output
1 2021-02-01 289461K 3844 8.254269 8.378198 LR_output
2 2021-03-01 289461K 759 6.632002 7.045235 LR_output
3 2021-04-01 289461K 825 6.715383 5.117058 LR_output
4 2021-05-01 289461K 663 6.496775 9.082928 LR_output
5 2021-06-01 289461K 2025 7.613325 5.225649 LR_output
6 2021-07-01 289461K 471 6.154858 7.446383 LR_output
7 2021-08-01 289461K 1160 7.056175 6.307484 LR_output
8 2021-09-01 289461K 5987 8.697346 7.752674 LR_output
9 2021-10-01 289461K 679 6.520621 9.021899 LR_output
10 2021-11-01 289461K 469 6.150603 4.677595 LR_output
11 2021-12-01 289461K 907 6.810142 7.218749 LR_output
12 2021-01-01 400520J 64 4.158883 4.040182 AR(12)
13 2021-02-01 400520J 48 3.871201 3.940978 AR(12)
14 2021-03-01 400520J 63 4.143135 3.962359 AR(12)
15 2021-01-01 289461K 2592 7.860185 2800.000000 FCAST_QTY
16 2021-02-01 289461K 3844 8.254269 5200.000000 FCAST_QTY
17 2021-03-01 289461K 759 6.632002 550.000000 FCAST_QTY
18 2021-04-01 289461K 825 6.715383 475.000000 FCAST_QTY
19 2021-05-01 289461K 663 6.496775 575.000000 FCAST_QTY
20 2021-06-01 289461K 2025 7.613325 475.000000 FCAST_QTY
21 2021-07-01 289461K 471 6.154858 650.000000 FCAST_QTY
22 2021-08-01 289461K 1160 7.056175 550.000000 FCAST_QTY
23 2021-09-01 289461K 5987 8.697346 7900.000000 FCAST_QTY
24 2021-10-01 289461K 679 6.520621 1187.000000 FCAST_QTY
25 2021-11-01 289461K 469 6.150603 1187.000000 FCAST_QTY
26 2021-12-01 289461K 907 6.810142 1900.000000 FCAST_QTY
27 2021-01-01 400520J 64 4.158883 51.000000 FCAST_QTY
28 2021-02-01 400520J 48 3.871201 55.000000 FCAST_QTY
29 2021-03-01 400520J 63 4.143135 59.000000 FCAST_QTYhttps://stackoverflow.com/questions/73789471
复制相似问题