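My dataframe has 6 rows and 1488 columns, shape (6, 1488), and I need to slice it so that every slice/chunk has shape (6, 22).
So I want a new chunk to start after every 22nd column. Finally, I want to stack all of these slices one below the other, so that I end up with a final dataframe of shape (~405, 22).
Any help would be greatly appreciated.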
Posted on 2021-12-15 17:43:57
I don't know exactly what your data looks like, but something like this should work.
import numpy as np
import pandas as pd

# create an example dataframe
df = pd.DataFrame(np.random.random((6, 1488)))
df
0 1 2 3 4 5 6 7 8 ... 1479 1480 1481 1482 1483 1484 1485 1486 1487
0 0.202945 0.764556 0.935441 0.811226 0.813502 0.218969 0.612307 0.501421 0.654886 ... 0.849323 0.179219 0.383729 0.453096 0.515090 0.042625 0.157411 0.738439 0.866627
1 0.284549 0.631829 0.562288 0.122613 0.678792 0.494868 0.896530 0.928943 0.740604 ... 0.212852 0.947779 0.993973 0.394951 0.678237 0.590767 0.690921 0.792253 0.748520
2 0.233059 0.349914 0.966794 0.005431 0.051786 0.002843 0.677197 0.557434 0.858027 ... 0.127492 0.324699 0.793800 0.327186 0.619923 0.871256 0.494916 0.487993 0.368654
3 0.862628 0.114289 0.663868 0.929045 0.796207 0.386012 0.097557 0.700127 0.719978 ... 0.535595 0.400371 0.375005 0.509740 0.412794 0.399939 0.414794 0.769017 0.591004
4 0.719133 0.130646 0.438649 0.921081 0.384160 0.393997 0.338588 0.120220 0.115953 ... 0.060460 0.297115 0.823037 0.299341 0.923836 0.111853 0.256940 0.344354 0.745989
5 0.686776 0.711688 0.232884 0.403817 0.311352 0.581365 0.942824 0.787317 0.212746 ... 0.049652 0.872466 0.437506 0.727937 0.119991 0.707848 0.178063 0.464412 0.587901
# create the 6x22 dataframes we will append together
# (the last chunk has only 14 columns, because 1488 = 67 * 22 + 14)
# renaming is important so that each chunk's columns match up with the others
chunks = [
    df.iloc[:, i:i+22].rename(columns=lambda c: c % 22)
    for i in range(0, 1488, 22)
]
final_df = pd.concat(chunks, ignore_index=True)
final_df
0 1 2 3 4 5 6 7 8 ... 13 14 15 16 17 18 19 20 21
0 0.202945 0.764556 0.935441 0.811226 0.813502 0.218969 0.612307 0.501421 0.654886 ... 0.683138 0.241730 0.127795 0.290902 0.342813 0.806268 0.739551 0.545052 0.485129
1 0.284549 0.631829 0.562288 0.122613 0.678792 0.494868 0.896530 0.928943 0.740604 ... 0.517114 0.937569 0.028149 0.097362 0.047555 0.755910 0.339539 0.513563 0.861521
2 0.233059 0.349914 0.966794 0.005431 0.051786 0.002843 0.677197 0.557434 0.858027 ... 0.335635 0.256579 0.547100 0.607310 0.925894 0.952812 0.999725 0.687252 0.465104
3 0.862628 0.114289 0.663868 0.929045 0.796207 0.386012 0.097557 0.700127 0.719978 ... 0.670078 0.593592 0.631335 0.917056 0.737024 0.932694 0.547243 0.514497 0.237268
4 0.719133 0.130646 0.438649 0.921081 0.384160 0.393997 0.338588 0.120220 0.115953 ... 0.213295 0.625206 0.570912 0.368144 0.715152 0.024020 0.400959 0.992156 0.328769
.. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
403 0.662156 0.909833 0.106109 0.630261 0.415084 0.212852 0.947779 0.993973 0.394951 ... 0.748520 NaN NaN NaN NaN NaN NaN NaN NaN
404 0.280660 0.324690 0.089441 0.695034 0.040087 0.127492 0.324699 0.793800 0.327186 ... 0.368654 NaN NaN NaN NaN NaN NaN NaN NaN
405 0.299956 0.111437 0.332434 0.312539 0.866787 0.535595 0.400371 0.375005 0.509740 ... 0.591004 NaN NaN NaN NaN NaN NaN NaN NaN
406 0.801716 0.993745 0.653756 0.415967 0.479453 0.060460 0.297115 0.823037 0.299341 ... 0.745989 NaN NaN NaN NaN NaN NaN NaN NaN
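407 0.937215 0.811213 0.643623 0.686690 0.843001 0.049652 0.872466 0.437506 0.727937 ... 0.587901 NaN NaN NaN NaN NaN NaN NaN NaN
Because 1488 is not a multiple of 22, the last chunk has only 14 columns, so the final 6 rows contain NaN in columns 14 to 21 and the result has shape (408, 22).
If your dataframe's column names are not sequential integers as in this example, you will need your own mapper so that the columns in each chunk match up. Otherwise, the concat operation will create a dataframe containing the superset of all column names.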
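For column names that are not sequential integers, one possible mapper is to relabel each chunk's columns by their position within the chunk. This is only a minimal sketch: the `col_0`, `col_1`, ... names and the `width` variable are made up for illustration and are not from the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame with string column names (an assumption,
# standing in for whatever labels the real data uses)
df = pd.DataFrame(
    np.random.random((6, 1488)),
    columns=[f"col_{j}" for j in range(1488)],
)

width = 22  # number of columns per chunk
chunks = []
for start in range(0, df.shape[1], width):
    chunk = df.iloc[:, start:start + width].copy()
    # Relabel by position inside the chunk (0, 1, 2, ...) so that
    # every chunk shares the same column labels before concatenation
    chunk.columns = range(chunk.shape[1])
    chunks.append(chunk)

final_df = pd.concat(chunks, ignore_index=True)
print(final_df.shape)  # (408, 22)
```

Relabeling by position does not depend on what the original labels are, so it should behave the same for string, date, or duplicate column names. The last chunk still has only 14 columns, so its rows are padded with NaN in the remaining 8 columns.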
https://stackoverflow.com/questions/70367746