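I'm running into a problem where the CausalImpact package does not recognize the date index in my data.
I get the error "20210626 not present in input data index". The full traceback is below.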
ValueError Traceback (most recent call last)
Input In [97], in <cell line: 3>()
1 pre_period = ['20210626','20210628']
2 post_period = ['20210629','20210702']
----> 3 ci = CausalImpact(data, pre_period, post_period)
File ~/homebrew/lib/python3.8/site-packages/causalimpact/main.py:228, in CausalImpact.__init__(self, data, pre_period, post_period, model, alpha, **kwargs)
227 def __init__(self, data, pre_period, post_period, model=None, alpha=0.05, **kwargs):
--> 228 checked_input = self._process_input_data(
229 data, pre_period, post_period, model, alpha, **kwargs
230 )
231 super(CausalImpact, self).__init__(**checked_input)
232 self.model_args = checked_input['model_args']
File ~/homebrew/lib/python3.8/site-packages/causalimpact/main.py:377, in CausalImpact._process_input_data(self, data, pre_period, post_period, model, alpha, **kwargs)
374 raise ValueError('{args} input cannot be empty'.format(
375 args=', '.join(none_args)))
376 processed_data = self._format_input_data(data)
--> 377 pre_data, post_data = self._process_pre_post_data(processed_data, pre_period,
378 post_period)
379 alpha = self._process_alpha(alpha)
380 model_args = self._process_model_args(**kwargs)
File ~/homebrew/lib/python3.8/site-packages/causalimpact/main.py:658, in CausalImpact._process_pre_post_data(self, data, pre_period, post_period)
637 def _process_pre_post_data(self, data, pre_period, post_period):
638 """
639 Checks `pre_period`, `post_period` and returns data sliced accordingly to each
640 period.
(...)
656 ValueError: if pre_period last value is bigger than post intervention period.
657 """
--> 658 checked_pre_period = self._process_period(pre_period, data)
659 checked_post_period = self._process_period(post_period, data)
661 if checked_pre_period[1] > checked_post_period[0]:
File ~/homebrew/lib/python3.8/site-packages/causalimpact/main.py:727, in CausalImpact._process_period(self, period, data)
725 if isinstance(point, pd.Timestamp):
726 point = point.strftime('%Y%m%d')
--> 727 raise ValueError("{point} not present in input data index.".format(
728 point=str(point)
729 )
730 )
731 if isinstance(period[0], str) or isinstance(period[0], pd.Timestamp):
732 period = self._convert_str_period_to_int(period, data)
ValueError: 20210626 not present in input data index.
The code and sample data are shown below. Can anyone help?
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_probability as tfp
import matplotlib.pyplot as plt
from causalimpact import CausalImpact
data = pd.read_csv('~/datasets/results_covariates.csv',encoding='utf-8')
data.set_index('DT', inplace=True, drop=False)
pre_period = ['20210626','20210628']
post_period = ['20210629','20210702']
ci = CausalImpact(data, pre_period, post_period)
DT Y X1 X2
6/26/21 1016.15 8616.033333 164
6/27/21 1174.983333 18156.85 444
6/28/21 56571.43333 417270.6 11664
6/29/21 64821.75 420466.3167 11322
6/30/21 178269.8 2331084.75 66434
7/1/21 62314.28333 391890.9 11221
7/2/21 141387.3833 1286635.85 35207
Posted on 2022-06-27 04:44:43
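Managed to get this working.
CausalImpact expects the index to be int, str, or pd.Timestamp, so in the actual code the DT column has to be reformatted after it is read from the CSV / database connector.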
data['DT'] = pd.to_datetime(data['DT'])
https://stackoverflow.com/questions/72757892
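For anyone hitting the same error, here is a minimal sketch of where the conversion probably has to sit relative to set_index. It inlines the sample rows from the question instead of reading the CSV. Two things in it are assumptions rather than part of the original post: the explicit format string '%m/%d/%y', inferred from the sample rows, and dropping DT from the columns once it becomes the index. The CausalImpact call itself is left commented out, so the sketch only needs pandas.

```python
import io

import pandas as pd

# The seven sample rows from the question, inlined so the sketch runs without the CSV file.
raw = io.StringIO(
    "DT,Y,X1,X2\n"
    "6/26/21,1016.15,8616.033333,164\n"
    "6/27/21,1174.983333,18156.85,444\n"
    "6/28/21,56571.43333,417270.6,11664\n"
    "6/29/21,64821.75,420466.3167,11322\n"
    "6/30/21,178269.8,2331084.75,66434\n"
    "7/1/21,62314.28333,391890.9,11221\n"
    "7/2/21,141387.3833,1286635.85,35207\n"
)
data = pd.read_csv(raw)

# As read, DT holds strings such as '6/26/21', so '20210626' cannot match any label.
# Parse the column first (the format string is an assumption based on the sample rows).
data["DT"] = pd.to_datetime(data["DT"], format="%m/%d/%y")

# Build the index from the parsed dates. Dropping DT from the columns is an
# assumption of this sketch, so that only Y, X1 and X2 remain as data columns.
data = data.set_index("DT")

pre_period = ["20210626", "20210628"]
post_period = ["20210629", "20210702"]

# Sanity check: list any period boundary that is absent from the index.
missing = [p for p in pre_period + post_period if pd.Timestamp(p) not in data.index]
print("missing boundaries:", missing)

# With nothing missing, the call from the question can be retried:
# ci = CausalImpact(data, pre_period, post_period)
```

The key point is ordering: if set_index runs before to_datetime, the index keeps the original '6/26/21' strings even though the column is converted afterwards, and the lookup would presumably fail the same way.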