这本质上是“多个硬币来自多个薄荷糖/棒球球员”的例子,做贝叶斯数据分析,第二版(DBDA2)。我相信我的PyMC3代码在功能上是等价的,但是一个可以工作,另一个不能工作。这是与PyMC版本3.5。更详细地说,
假设我有以下数据。每一行都是一个观察:
observations_dict = {
'mint': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
'coin': [0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7],
'outcome': [1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1]
}
observations = pd.DataFrame(observations_dict)
observations一枚铸币,几枚硬币
下面实现了DBDA2图9.7,运行良好:

num_coins = observations['coin'].nunique()
coin_idx = observations['coin']
with pm.Model() as hierarchical_model:
# mint is characterized by omega and kappa
omega = pm.Beta('omega', 1., 1.)
kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
kappa = pm.Deterministic('kappa', kappa_minus2 + 2)
# each coin is described by a theta
theta = pm.Beta('theta', alpha=omega*(kappa-2)+1, beta=(1-omega)*(kappa-2)+1, shape=num_coins)
# define the likelihood
y = pm.Bernoulli('y', theta[coin_idx], observed=observations['outcome']) 很多薄荷糖,很多硬币
但是,一旦将其转化为分层模型(如DBDA2图9.13所示):

num_mints = observations['mint'].nunique()
mint_idx = observations['mint']
num_coins = observations['coin'].nunique()
coin_idx = observations['coin']
with pm.Model() as hierarchical_model2:
# Hyper parameters
omega = pm.Beta('omega', 1, 1)
kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
kappa = pm.Deterministic('kappa', kappa_minus2 + 2)
# Parameters for mints
omega_c = pm.Beta('omega_c',
omega*(kappa-2)+1, (1-omega)*(kappa-2)+1,
shape = num_mints)
kappa_c_minus2 = pm.Gamma('kappa_c_minus2',
0.01, 0.01,
shape = num_mints)
kappa_c = pm.Deterministic('kappa_c', kappa_c_minus2 + 2)
# Parameters for coins
theta = pm.Beta('theta',
omega_c[mint_idx]*(kappa_c[mint_idx]-2)+1,
(1-omega_c[mint_idx])*(kappa_c[mint_idx]-2)+1,
shape = num_coins)
y2 = pm.Bernoulli('y2', p=theta[coin_idx], observed=observations['outcome'])错误是:
ValueError: operands could not be broadcast together with shapes (8,) (20,) 由于模型有8台泰塔斯的8枚硬币,但可以看到20行数据。
但是,如果将数据分组,使每一行表示单个硬币的最终统计数据,如下所示
grouped = observations.groupby(['mint', 'coin']).agg({'outcome': [np.sum, np.size]}).reset_index()
grouped.columns = ['mint', 'coin', 'heads', 'total']最后的似然变量改为二项式,如下所示
num_mints = grouped['mint'].nunique()
mint_idx = grouped['mint']
num_coins = grouped['coin'].nunique()
coin_idx = grouped['coin']
with pm.Model() as hierarchical_model2:
# Hyper parameters
omega = pm.Beta('omega', 1, 1)
kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
kappa = pm.Deterministic('kappa', kappa_minus2 + 2)
# Parameters for mints
omega_c = pm.Beta('omega_c',
omega*(kappa-2)+1, (1-omega)*(kappa-2)+1,
shape = num_mints)
kappa_c_minus2 = pm.Gamma('kappa_c_minus2',
0.01, 0.01,
shape = num_mints)
kappa_c = pm.Deterministic('kappa_c', kappa_c_minus2 + 2)
# Parameter for coins
theta = pm.Beta('theta',
omega_c[mint_idx]*(kappa_c[mint_idx]-2)+1,
(1-omega_c[mint_idx])*(kappa_c[mint_idx]-2)+1,
shape = num_coins)
y2 = pm.Binomial('y2', n=grouped['total'], p=theta, observed=grouped['heads'])一切都正常。现在,后一种形式更有效率,而且普遍更受欢迎,但我认为前者也应该有效。因此,我认为这主要是一个PyMC3问题(甚至更可能是用户错误)。
引用DBDA第1版,
“BUGS模型使用二项似然分布来进行完全正确,而不是在个别试验中使用Bernoulli分布。这种使用二项分布只是为了缩短程序。如果数据被指定为试用结果而不是完全正确,则该模型可以包含一个逐次试验循环,并使用Bernoulli似然函数。”
让我困扰的是,在第一个例子(一个薄荷,几个硬币)中,PyMC3似乎可以处理单个的观察,而不是很好的聚合观察。所以我认为第一种形式是可行的,但却行不通。
代码
http://nbviewer.jupyter.org/github/JWarmenhoven/DBDA-python/blob/master/Notebooks/Chapter%209.ipynb
参考文献
PyMC3 - Differences in ways observations are passed to model -> difference in results?
http://www.databozo.com/deep-in-the-weeds-complex-hierarchical-models-in-pymc3
https://stats.stackexchange.com/questions/157521/is-this-correct-hierarchical-bernoulli-model
发布于 2018-10-06 18:41:27
mint_idx的长度是20 (每次观察一条),但应该是8条(每枚硬币一条)。
工作答案,请注意mint_idx重新计算(rest保持不变):
grouped = observations.groupby(['mint', 'coin']).agg({'outcome': [np.sum, np.size]}).reset_index()
grouped.columns = ['mint', 'coin', 'heads', 'total']
num_mints = grouped['mint'].nunique()
mint_idx = grouped['mint']
num_coins = observations['coin'].nunique()
coin_idx = observations['coin']
with pm.Model() as hierarchical_model2:
# Hyper parameters
omega = pm.Beta('omega', 1, 1)
kappa_minus2 = pm.Gamma('kappa_minus2', 0.01, 0.01)
kappa = pm.Deterministic('kappa', kappa_minus2 + 2)
# Parameters for mints
omega_c = pm.Beta('omega_c',
omega*(kappa-2)+1, (1-omega)*(kappa-2)+1,
shape = num_mints)
kappa_c_minus2 = pm.Gamma('kappa_c_minus2',
0.01, 0.01,
shape = num_mints)
kappa_c = pm.Deterministic('kappa_c', kappa_c_minus2 + 2)
# Parameters for coins
theta = pm.Beta('theta',
omega_c[mint_idx]*(kappa_c[mint_idx]-2)+1,
(1-omega_c[mint_idx])*(kappa_c[mint_idx]-2)+1,
shape = num_coins)
y2 = pm.Bernoulli('y2', p=theta[coin_idx], observed=observations['outcome'])非常感谢@junpenglao!!https://discourse.pymc.io/t/why-cant-i-use-a-bernoulli-as-a-likelihood-variable-in-a-hierarchical-model-in-pymc3/2022/2
https://stackoverflow.com/questions/52673649
复制相似问题