I have a DataFrame that looks like this:
   datetime             price    tickvol  bid      ask
0  2016-10-11 12:24:03  2130.25  1        2130.00  2130.25
1  2016-10-11 13:31:03  2130.25  1        2130.00  2130.25
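...
I have a CustomBusinessHour that looks like this:
cbh = CustomBusinessHour(start='13:30', end='13:15', weekmask='Sun Mon Tue Wed Thu')
I was hoping I could create a new index level from the start timestamp of each custom business hour session, but I am running into problems doing so.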
The result I am hoping for is:
cbh                  datetime             price    tickvol  bid      ask
2016-10-10 13:30:00  2016-10-11 12:24:03  2130.25  1        2130.00  2130.25
2016-10-11 13:30:00  2016-10-11 13:31:03  2130.25  1        2130.00  2130.25

Answered on 2016-10-26 11:16:06
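This is what I ended up doing. It works, but it could probably be improved. CustomBusinessHour does not appear to expose any method that directly tells you whether a given time falls inside or outside the session.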
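import pandas as pd
from pandas.tseries.offsets import CustomBusinessHour


def spans_midnight(cbh):
    """Return True if the session wraps past midnight (start later than end)."""
    # This helper was not shown in the original answer; its body is an assumption.
    return cbh.start > cbh.end


def session_start(ts, cbh):
    """Given a timestamp and a CustomBusinessHour, return the session start
    timestamp, or pd.NaT if the timestamp falls outside the session."""
    # Assumes cbh.start and cbh.end are single datetime.time values, as in the
    # pandas version this was written for; newer versions may return tuples.
    assert isinstance(ts, pd.Timestamp)
    t = ts.time()
    if spans_midnight(cbh):
        if cbh.end <= t < cbh.start:
            return pd.NaT
        elif t < cbh.start:
            # this timestamp is part of the previous calendar day session;
            # subtract a day (ts.replace(day=ts.day - 1) fails on the 1st)
            ts = ts - pd.Timedelta(days=1)
    else:
        if cbh.end <= t or t < cbh.start:
            return pd.NaT
    return ts.replace(hour=cbh.start.hour, minute=cbh.start.minute,
                      second=cbh.start.second,
                      microsecond=cbh.start.microsecond)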
# assuming df looks similar to the one in the problem statement
# and is indexed by datetime...
cbh = CustomBusinessHour(start='06:30', end='13:15',
                         weekmask='Mon Tue Wed Thu Fri')
df['session_start'] = df.index.map(lambda x: session_start(x, cbh))
df.dropna(how='all', subset=['session_start'], inplace=True)
df.set_index(['session_start', df.index], drop=True, inplace=True)
https://stackoverflow.com/questions/40123159
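The wrap-around logic in session_start can be checked in isolation with plain datetime objects. The following is a minimal sketch, not the answer's own code: the name session_start_plain and its (ts, start, end) signature are hypothetical, it returns None where the pandas version returns pd.NaT, and like session_start it ignores the weekmask and any holidays.

```python
from datetime import datetime, time, timedelta
from typing import Optional


def session_start_plain(ts: datetime, start: time, end: time) -> Optional[datetime]:
    """Return the start of the session containing ts, or None if ts is outside it.

    Hypothetical stdlib-only counterpart of session_start; weekmask and
    holidays are not considered.
    """
    t = ts.time()
    day = ts.date()
    if start > end:
        # Session wraps past midnight: it is closed only in the gap [end, start).
        if end <= t < start:
            return None
        if t < start:
            # Earlier than today's open, so the session opened yesterday.
            day -= timedelta(days=1)
    elif not (start <= t < end):
        # Same-day session: anything outside [start, end) is closed.
        return None
    return datetime.combine(day, start)


# Session from the question: opens at 13:30 and closes at 13:15 the next day.
opens, closes = time(13, 30), time(13, 15)
print(session_start_plain(datetime(2016, 10, 11, 12, 24, 3), opens, closes))
print(session_start_plain(datetime(2016, 10, 11, 13, 31, 3), opens, closes))
print(session_start_plain(datetime(2016, 10, 11, 13, 20, 0), opens, closes))
```

With the 13:30 to 13:15 session from the question, the first two calls give the session starts shown in the desired result, and the third falls in the 15-minute gap between close and open. Moving the date back with a timedelta, rather than decrementing the day field, keeps the calculation valid on the first day of a month.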