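I have a deque. Each element of the deque consists of a time field and an event field, so it is similar to a list of dictionaries. The data is always sorted by time, from oldest to newest. The first element is the oldest one.
Please note that the deque is unbounded, and each new element arrives at an unknown time. That means a new element may be added after 1 minute or after 1 hour. Who knows...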
data = [
    {
        "time": "07:14:40",
        "event": 24
    },
    {
        "time": "07:15:40",
        "event": 394
    },
    {
        "time": "07:16:40",
        "event": 384
    },
    {
        "time": "07:17:40",
        "event": 394
    },
    {
        "time": "07:18:40",
        "event": 384
    },
    {
        "time": "07:19:40",
        "event": 2
    },
    {
        "time": "07:20:40",
        "event": 24
    },
    {
        "time": "07:21:40",
        "event": 72
    },
    {
        "time": "07:22:40",
        "event": 24
    },
    {
        "time": "07:23:40",
        "event": 72
    },
    {
        "time": "07:24:40",
        "event": 99
    }
]
I am also given the size of the window. Let's say 5 minutes.
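I want to iterate over this deque with the given window size and calculate an expanding moving sum. Let me explain in detail what that means.
In every iteration I have to check the current element and the older elements, and sum the ones that fall within the 5-minute window. If older elements are outside the 5-minute window, they should be popped from the deque.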
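The procedure described above can be pictured with a minimal sketch. It assumes the records are already sorted, that the times are same-day `HH:MM:SS` strings, and that both ends of the 5-minute range are inclusive. The name `windowed_sums` is made up for illustration and is not part of the original code.

```python
from collections import deque
from datetime import datetime, timedelta


def windowed_sums(records, window=timedelta(minutes=5)):
    """Yield (time, total) per record, summing events in [end - window, end]."""
    active = deque()  # records currently inside the window, oldest first
    for record in records:
        end = datetime.strptime(record["time"], "%H:%M:%S")
        start = end - window
        active.append((end, record["event"]))
        # Records are sorted by time, so expired ones sit at the left end.
        # The element just appended has time == end, so the loop always stops.
        while active[0][0] < start:
            active.popleft()
        yield record["time"], sum(event for _, event in active)


# First seven records from the data above.
sample = [
    {"time": "07:14:40", "event": 24},
    {"time": "07:15:40", "event": 394},
    {"time": "07:16:40", "event": 384},
    {"time": "07:17:40", "event": 394},
    {"time": "07:18:40", "event": 384},
    {"time": "07:19:40", "event": 2},
    {"time": "07:20:40", "event": 24},
]
sums = list(windowed_sums(sample))
for time_text, total in sums:
    print(time_text, total)
```

With an inclusive lower bound, the element at 07:14:40 is dropped only when the 07:20:40 element arrives. A strict comparison would drop it one step earlier.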
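In other words, at the first iteration the start date will be 07:09:40 (going 5 minutes back), the end date is 07:14:40, and the sum is 24.
During the second iteration, since this element is not in that date range, the date range must be redefined as follows: the start date is 07:10:40 and the end date is 07:15:40. Now I have to look back and check all the previous elements. The first element's time is 07:14:40, which is within my new date range, so the new sum is 24 + 394 = 418.
During the third iteration, the time field is again outside my previous date range, so I have to redefine the date range in the same way as in the previous iteration and compute the sum in the same way.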
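The date ranges in this walkthrough follow from subtracting the window size from each element's time. As a small illustration (the helper name `window_bounds` is hypothetical, and times are assumed to be same-day `HH:MM:SS` strings):

```python
from datetime import datetime, timedelta
from typing import Tuple

WINDOW = timedelta(minutes=5)


def window_bounds(end_time: str, window: timedelta = WINDOW) -> Tuple[str, str]:
    """Return (start, end) of the window that ends at `end_time`."""
    end = datetime.strptime(end_time, "%H:%M:%S")
    start = end - window
    return start.strftime("%H:%M:%S"), end_time


# Ranges for the first three iterations of the example data.
bounds = [window_bounds(t) for t in ("07:14:40", "07:15:40", "07:16:40")]
for start, end in bounds:
    print(start, "->", end)
```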
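When I reach the following element (7th iteration)
"time": "07:20:40",
"event": 24
my date range is: start date 07:15:40, end date 07:20:40. Then I have to look back and grab all the elements whose time field is within this date range. Note that the first element is outside the date range, so I have to pop it from the deque. This is my question: how do I do that?
Here is the code snippet I wrote, but it does not work.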
import datetime
from collections import deque, defaultdict

window_size = 300
test = deque(sort_data(list(read_json("final_real_test.json").values())[0]))
result = defaultdict(list)
final_input = deque()
end_date = test[0]["time"]
start_date = end_date - datetime.timedelta(seconds=window_size)
while test:
    record = test.popleft()
    if start_date <= record["time"] <= end_date:
        # Calculate the sum
        final_input.append(record)
    else:
        end_date = record["time"]
        start_date = end_date - datetime.timedelta(seconds=window_size)
        print("Returning back to the queue...")
        test.appendleft(record)
print("Done")

Posted on 2022-11-30 06:22:38
You did not explain how the deque gets updated, or how that should affect the window processing.
But here is a proof of concept of the algorithm:
from datetime import datetime
from typing import Generator, List, Dict, Union

Element = Dict[str, Union[str, int]]
Series = List[Element]


def sliding_window(series: Series, window_duration: int) -> Generator[Series, None, None]:
    # window_duration is in seconds; one window is yielded per item of the series
    time_format = "%H:%M:%S"
    if len(series) > 0:
        for i_ending_item, ending_item in enumerate(series):
            end_window_time = datetime.strptime(ending_item["time"], time_format)
            print(f"window ends at item n°{i_ending_item} ({end_window_time!r})")
            window = [ending_item]
            # walk backwards over the earlier items until one falls outside the window
            for window_candidate_item in reversed(series[0:max(i_ending_item, 0)]):
                candidate_time = datetime.strptime(window_candidate_item["time"], time_format)
                assert end_window_time > candidate_time
                candidate_delta = end_window_time - candidate_time
                print(f" {candidate_time=!r} {candidate_delta=!r} {candidate_delta.seconds=!r}")
                if candidate_delta.seconds < window_duration:  # non inclusive
                    print(" added to the window")
                    window.insert(0, window_candidate_item)
                else:
                    print(" stop there")
                    break
            else:
                # the inner loop ended without a break
                print(" reached the beginning of the series")
            yield window
DATA: Series = [
    {"time": "07:14:40", "event": 24},
    {"time": "07:15:40", "event": 394},
    {"time": "07:16:40", "event": 384},
    {"time": "07:17:40", "event": 394},
    {"time": "07:18:40", "event": 384},
    {"time": "07:19:40", "event": 2},
    {"time": "07:20:40", "event": 24},
    {"time": "07:21:40", "event": 72},
    {"time": "07:22:40", "event": 24},
    {"time": "07:23:40", "event": 72},
    {"time": "07:24:40", "event": 99}
]
WINDOW_SIZE = 5*60

for window in sliding_window(DATA, WINDOW_SIZE):
    print(window, "sum=", sum(item["event"] for item in window))

This produces:
window ends at item n°0 (datetime.datetime(1900, 1, 1, 7, 14, 40))
reached the beginning of the series
[{'time': '07:14:40', 'event': 24}] sum= 24
window ends at item n°1 (datetime.datetime(1900, 1, 1, 7, 15, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 14, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
reached the beginning of the series
[{'time': '07:14:40', 'event': 24}, {'time': '07:15:40', 'event': 394}] sum= 418
window ends at item n°2 (datetime.datetime(1900, 1, 1, 7, 16, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 15, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 14, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
reached the beginning of the series
[{'time': '07:14:40', 'event': 24}, {'time': '07:15:40', 'event': 394}, {'time': '07:16:40', 'event': 384}] sum= 802
window ends at item n°3 (datetime.datetime(1900, 1, 1, 7, 17, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 16, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 15, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 14, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
reached the beginning of the series
[{'time': '07:14:40', 'event': 24}, {'time': '07:15:40', 'event': 394}, {'time': '07:16:40', 'event': 384}, {'time': '07:17:40', 'event': 394}] sum= 1196
window ends at item n°4 (datetime.datetime(1900, 1, 1, 7, 18, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 17, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 16, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 15, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 14, 40) candidate_delta=datetime.timedelta(seconds=240) candidate_delta.seconds=240
added to the window
reached the beginning of the series
[{'time': '07:14:40', 'event': 24}, {'time': '07:15:40', 'event': 394}, {'time': '07:16:40', 'event': 384}, {'time': '07:17:40', 'event': 394}, {'time': '07:18:40', 'event': 384}] sum= 1580
window ends at item n°5 (datetime.datetime(1900, 1, 1, 7, 19, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 18, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 17, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 16, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 15, 40) candidate_delta=datetime.timedelta(seconds=240) candidate_delta.seconds=240
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 14, 40) candidate_delta=datetime.timedelta(seconds=300) candidate_delta.seconds=300
stop there
[{'time': '07:15:40', 'event': 394}, {'time': '07:16:40', 'event': 384}, {'time': '07:17:40', 'event': 394}, {'time': '07:18:40', 'event': 384}, {'time': '07:19:40', 'event': 2}] sum= 1558
window ends at item n°6 (datetime.datetime(1900, 1, 1, 7, 20, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 19, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 18, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 17, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 16, 40) candidate_delta=datetime.timedelta(seconds=240) candidate_delta.seconds=240
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 15, 40) candidate_delta=datetime.timedelta(seconds=300) candidate_delta.seconds=300
stop there
[{'time': '07:16:40', 'event': 384}, {'time': '07:17:40', 'event': 394}, {'time': '07:18:40', 'event': 384}, {'time': '07:19:40', 'event': 2}, {'time': '07:20:40', 'event': 24}] sum= 1188
window ends at item n°7 (datetime.datetime(1900, 1, 1, 7, 21, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 20, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 19, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 18, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 17, 40) candidate_delta=datetime.timedelta(seconds=240) candidate_delta.seconds=240
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 16, 40) candidate_delta=datetime.timedelta(seconds=300) candidate_delta.seconds=300
stop there
[{'time': '07:17:40', 'event': 394}, {'time': '07:18:40', 'event': 384}, {'time': '07:19:40', 'event': 2}, {'time': '07:20:40', 'event': 24}, {'time': '07:21:40', 'event': 72}] sum= 876
window ends at item n°8 (datetime.datetime(1900, 1, 1, 7, 22, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 21, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 20, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 19, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 18, 40) candidate_delta=datetime.timedelta(seconds=240) candidate_delta.seconds=240
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 17, 40) candidate_delta=datetime.timedelta(seconds=300) candidate_delta.seconds=300
stop there
[{'time': '07:18:40', 'event': 384}, {'time': '07:19:40', 'event': 2}, {'time': '07:20:40', 'event': 24}, {'time': '07:21:40', 'event': 72}, {'time': '07:22:40', 'event': 24}] sum= 506
window ends at item n°9 (datetime.datetime(1900, 1, 1, 7, 23, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 22, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 21, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 20, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 19, 40) candidate_delta=datetime.timedelta(seconds=240) candidate_delta.seconds=240
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 18, 40) candidate_delta=datetime.timedelta(seconds=300) candidate_delta.seconds=300
stop there
[{'time': '07:19:40', 'event': 2}, {'time': '07:20:40', 'event': 24}, {'time': '07:21:40', 'event': 72}, {'time': '07:22:40', 'event': 24}, {'time': '07:23:40', 'event': 72}] sum= 194
window ends at item n°10 (datetime.datetime(1900, 1, 1, 7, 24, 40))
candidate_time=datetime.datetime(1900, 1, 1, 7, 23, 40) candidate_delta=datetime.timedelta(seconds=60) candidate_delta.seconds=60
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 22, 40) candidate_delta=datetime.timedelta(seconds=120) candidate_delta.seconds=120
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 21, 40) candidate_delta=datetime.timedelta(seconds=180) candidate_delta.seconds=180
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 20, 40) candidate_delta=datetime.timedelta(seconds=240) candidate_delta.seconds=240
added to the window
candidate_time=datetime.datetime(1900, 1, 1, 7, 19, 40) candidate_delta=datetime.timedelta(seconds=300) candidate_delta.seconds=300
stop there
[{'time': '07:20:40', 'event': 24}, {'time': '07:21:40', 'event': 72}, {'time': '07:22:40', 'event': 24}, {'time': '07:23:40', 'event': 72}, {'time': '07:24:40', 'event': 99}] sum= 291

It seems to me that this answers your question: how to have a sliding window based on the time of the events.
I simply used a list for the data. If you want to share a Minimal Reproducible Example, it would be simpler to answer your question.
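For the case the question describes, where elements keep arriving at unknown times and expired ones should be popped, a deque-based variant might look like the sketch below. It is only an illustration under stated assumptions: the class name `RollingSum` and its `add` method are hypothetical, times are same-day `HH:MM:SS` strings, and the bound is non-inclusive like the `<` comparison in the proof of concept above.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingSum:
    """Keep the sum of the events that are less than `window` older than the latest one."""

    def __init__(self, window: timedelta):
        self.window = window
        self.items = deque()  # (timestamp, event) pairs, oldest first
        self.total = 0

    def add(self, time_text: str, event: int) -> int:
        """Record a new element and return the updated window sum."""
        now = datetime.strptime(time_text, "%H:%M:%S")
        self.items.append((now, event))
        self.total += event
        # Pop every element that is `window` or more older than the new one.
        while self.items and now - self.items[0][0] >= self.window:
            _, expired = self.items.popleft()
            self.total -= expired
        return self.total


events = [
    {"time": "07:14:40", "event": 24},
    {"time": "07:15:40", "event": 394},
    {"time": "07:16:40", "event": 384},
    {"time": "07:17:40", "event": 394},
    {"time": "07:18:40", "event": 384},
    {"time": "07:19:40", "event": 2},
    {"time": "07:20:40", "event": 24},
    {"time": "07:21:40", "event": 72},
    {"time": "07:22:40", "event": 24},
    {"time": "07:23:40", "event": 72},
    {"time": "07:24:40", "event": 99},
]
rolling = RollingSum(timedelta(minutes=5))
totals = [rolling.add(item["time"], item["event"]) for item in events]
print(totals)
```

Because a running total is kept, each new element costs only the pops it triggers rather than a rescan of the whole window. The totals match the sums printed by the proof of concept.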
https://stackoverflow.com/questions/74600559