Question: How to analyze a binary time series with Python

Stack Overflow user

Asked 2022-04-14 09:17:41

Answers: 1, Views: 29, Followers: 0, Votes: 0

Please forgive my poor English.

I would like to analyze binarized time series data with Python (pandas), such as the following.

>>> import pandas as pd
>>> 
>>> s = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0]).astype(bool)
>>> type(s)
<class 'pandas.core.series.Series'>
>>> s
0     False
1     False
2     False
3      True
4      True
5      True
6      True
7     False
8     False
9     False
10    False
11     True
12     True
13    False
dtype: bool

I want to extract the start and stop index of each run where the value is True. I tried the following.

>>> diff = s.diff().dropna()
>>> diff
1     False
2     False
3      True
4     False
5     False
6     False
7      True
8     False
9     False
10    False
11     True
12    False
13     True
dtype: object
>>> idxs = diff[diff].index.to_series()
>>> idxs
3      3
7      7
11    11
13    13
dtype: int64
>>> events = pd.concat(
...     [idxs[0::2].reset_index(drop=True),
...      idxs[1::2].reset_index(drop=True)],
...     axis=1,
... ).apply(lambda r: pd.Interval(r[0], r[1]), axis=1)
>>> events
0      (3, 7]
1    (11, 13]
dtype: interval

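This way I managed to extract the data. However, the code seems rather ugly, and I suspect there is cleaner code or a library that does this.

If you know of one, I would appreciate it if you could tell me. I am also not sure whether a Series of pd.Interval objects is an appropriate type for events, so a better idea there would be welcome too. The actual data to be analyzed is, of course, much larger.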

python

pandas

time-series

1 Answer

Stack Overflow user

Answered 2022-04-14 09:25:59

Here is another option:

pd.Series([pd.Interval(x.index[0], x.index[-1]+1)
           for _,x in s[s].groupby((~s).cumsum())])
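
As a minimal sketch of why this grouping works, the key can be inspected on its own. The running count of False values stays constant along each run of True, so every run of True receives its own key. The names key and runs below are illustrative only and are not part of the original answer.

```python
import pandas as pd

s = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0]).astype(bool)

# Running count of False values. It only increases on False rows,
# so it is constant within each consecutive run of True.
key = (~s).cumsum()

# Keep only the True rows and group them by that key; each group is one run.
# Record the first and last True index label of every run.
runs = [(int(g.index[0]), int(g.index[-1])) for _, g in s[s].groupby(key)]
print(runs)
```

The snippet above adds 1 to the last True label to obtain the stop value, which is why it assumes a default range index.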

Or, if you do not have a range index:

m = s | s.shift(fill_value=False)
pd.Series([pd.Interval(x.index[0], x.index[-1])
           for _,x in s[m].groupby((~m).cumsum())])

Output:

0      (3, 7]
1    (11, 13]
dtype: interval

Input used:

s = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0]).astype(bool)
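
Since the question mentions that the real data is much larger, a vectorized variant that avoids a Python-level loop over groups may be worth considering. The following is only a sketch under stated assumptions: the helper name true_runs and the column names start and stop are hypothetical, and the function reports integer positions (half-open, start inclusive and stop exclusive) rather than index labels.

```python
import numpy as np
import pandas as pd


def true_runs(s: pd.Series) -> pd.DataFrame:
    """Return start (inclusive) and stop (exclusive) positions of each True run."""
    values = s.to_numpy(dtype=bool)
    # Pad with False on both sides so runs touching either end are detected.
    padded = np.concatenate(([False], values, [False]))
    # +1 marks the position where a run starts, -1 the position just past its end.
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return pd.DataFrame({"start": starts, "stop": stops})


s = pd.Series([0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0]).astype(bool)
runs = true_runs(s)

# Optional: the same runs as a single IntervalIndex instead of a Series of
# individual pd.Interval objects.
intervals = pd.IntervalIndex.from_arrays(
    runs["start"].to_numpy(), runs["stop"].to_numpy(), closed="left"
)
print(runs)
print(intervals)
```

If index labels are needed instead of positions, s.index[runs["start"]] gives the label of the first True in each run. A plain two-column frame or an IntervalIndex is one possible alternative to a Series of pd.Interval objects, not the only reasonable choice.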

Votes: 0

The original content of this page is provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/71869239
