我有一只Python熊猫的数据,它看起来像这样:
print(dataframe1.head(30))
Date Close* month_initial month day year
date_final
2022-09-23 Sep 23, 2022 3693.23 Sep 9.0 23 2022
2022-09-22 Sep 22, 2022 3757.99 Sep 9.0 22 2022
2022-09-21 Sep 21, 2022 3789.93 Sep 9.0 21 2022
2022-09-20 Sep 20, 2022 3855.93 Sep 9.0 20 2022
2022-09-19 Sep 19, 2022 3899.89 Sep 9.0 19 2022
2022-09-16 Sep 16, 2022 3873.33 Sep 9.0 16 2022
2022-09-15 Sep 15, 2022 3901.35 Sep 9.0 15 2022
2022-09-14 Sep 14, 2022 3946.01 Sep 9.0 14 2022
2022-09-13 Sep 13, 2022 3932.69 Sep 9.0 13 2022
2022-09-12 Sep 12, 2022 4110.41 Sep 9.0 12 2022
2022-09-09 Sep 09, 2022 4067.36 Sep 9.0 09 2022
2022-09-08 Sep 08, 2022 4006.18 Sep 9.0 08 2022
2022-09-07 Sep 07, 2022 3979.87 Sep 9.0 07 2022
2022-09-06 Sep 06, 2022 3908.19 Sep 9.0 06 2022
2022-09-02 Sep 02, 2022 3924.26 Sep 9.0 02 2022
2022-09-01 Sep 01, 2022 3966.85 Sep 9.0 01 2022
2022-08-31 Aug 31, 2022 3955.00 Aug 8.0 31 2022
2022-08-30 Aug 30, 2022 3986.16 Aug 8.0 30 2022
2022-08-29 Aug 29, 2022 4030.61 Aug 8.0 29 2022
2022-08-26 Aug 26, 2022 4057.66 Aug 8.0 26 2022
2022-08-25 Aug 25, 2022 4199.12 Aug 8.0 25 2022
2022-08-24 Aug 24, 2022 4140.77 Aug 8.0 24 2022
2022-08-23 Aug 23, 2022 4128.73 Aug 8.0 23 2022
2022-08-22 Aug 22, 2022 4137.99 Aug 8.0 22 2022
2022-08-19 Aug 19, 2022 4228.48 Aug 8.0 19 2022
2022-08-18 Aug 18, 2022 4283.74 Aug 8.0 18 2022
2022-08-17 Aug 17, 2022 4274.04 Aug 8.0 17 2022
2022-08-16 Aug 16, 2022 4305.20 Aug 8.0 16 2022
2022-08-15 Aug 15, 2022 4297.14 Aug 8.0 15 2022
2022-08-12 Aug 12, 2022 4280.15 Aug 8.0 12 2022我想每个月保留第一排和最后一排。我怎么能这么做?我尝试使用以下代码:
import pandas as pd
dataframe1.set_index("date_final", inplace=True)
resultDf = dataframe1.groupby([dataframe1.index.year, dataframe1.index.month]).agg(["first", "last"])
resultDf.index.rename(["year", "month"], inplace=True)
resultDf.reset_index(inplace=True)
resultDf但我没有得到我想要的结果。
发布于 2022-09-26 15:39:06
pandas groupby操作不会在聚合之前对每个组进行排序,这就是为什么'first'和'last'没有为您选择正确的行。
此外,您可以在年月使用.resample('M')而不是groupby。
out = (
df.set_index(df.index.astype('datetime64[ns]')) # copying in the data, I lost the datetime index
.sort_index() # sort ensures first and last work as expected
.resample('M') # resample for a shorthand year/month grouping
.agg(['first', 'last'])
)
print(out)
Date Close* month_initial month day year
first last first last first last first last first last first last
date_final
2022-08-31 Aug 12, 2022 Aug 31, 2022 4280.15 3955.00 Aug Aug 8.0 8.0 12 31 2022 2022
2022-09-30 Sep 01, 2022 Sep 23, 2022 3966.85 3693.23 Sep Sep 9.0 9.0 1 23 2022 2022这个输出没有最可用的格式,所以我们可以使用一个快速的.stack来纠正它:
out = out.stack()
print(out)
Date Close* month_initial month day year
date_final
2022-08-31 first Aug 12, 2022 4280.15 Aug 8.0 12 2022
last Aug 31, 2022 3955.00 Aug 8.0 31 2022
2022-09-30 first Sep 01, 2022 3966.85 Sep 9.0 1 2022
last Sep 23, 2022 3693.23 Sep 9.0 23 2022发布于 2022-09-26 15:44:37
您可以使用is_month_start和is_month_end函数
在您的例子中,以下内容应该可以工作
dataframe1.set_index("date_final", inplace=True)
dataframe1 = dataframe1[dataframe1.index.is_month_start | dataframe1.index.is_month_end]https://stackoverflow.com/questions/73856169
复制相似问题