Converting a large datetime64[D] Series (a DataFrame column with 900k rows) to str takes too long. How can I speed it up?
import pandas as pd
df = pd.DataFrame(['2021-10-01']*900000, columns=['date']) # 0.025286900 seconds
df = df.assign(date=df['date'].astype('datetime64[D]')) # 0.105065900
# Why is converting from datetime to str so slow?
df.assign(date=df['date'].dt.strftime('%Y-%m-%d')) # 5.600835100 seconds.
txt = str(df) # 0.006202600
# Converting the entire DataFrame to a str is much faster
# than converting a column directly, despite a similar display format!
There is a related question that asks how to convert quickly from str to datetime. My bottleneck is (surprisingly) the opposite: converting from datetime64[D] to str is too slow.
Posted on 2022-03-16 00:54:55
Here is a faster solution:
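import numpy as np
np.datetime_as_string(df['date'], unit='D') # 0.375799800 seconds.
It still takes longer than it should (it makes little sense for converting from datetime to be slower than converting from str), but it is much faster.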
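To check the difference on your own machine, here is a minimal sketch that times both approaches and confirms they give the same strings. The row count, the variable names, and the explicit `.to_numpy()` call are my own choices for illustration, not part of the original question; timings will vary with hardware and library versions.

```python
import time

import numpy as np
import pandas as pd

# Hypothetical smaller sample so the comparison runs quickly.
n_rows = 100_000
dates = pd.Series(pd.to_datetime(['2021-10-01'] * n_rows), name='date')

# Approach from the question: format each value through .dt.strftime.
start = time.perf_counter()
via_strftime = dates.dt.strftime('%Y-%m-%d')
strftime_seconds = time.perf_counter() - start

# Approach from the answer: format the underlying datetime64 array with NumPy.
start = time.perf_counter()
via_numpy = pd.Series(
    np.datetime_as_string(dates.to_numpy(), unit='D'),
    index=dates.index,
    name=dates.name,
)
numpy_seconds = time.perf_counter() - start

# Both approaches should produce identical text for every row.
assert via_strftime.tolist() == via_numpy.tolist()
print(f'strftime: {strftime_seconds:.3f} s, '
      f'datetime_as_string: {numpy_seconds:.3f} s')
```

Wrapping the NumPy result back into a Series with the original index keeps it usable in `df.assign(date=...)`, as in the question.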
https://stackoverflow.com/questions/71490391