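I am trying to use the bootstrap to draw 1000 replicates with np.random.choice, resampling with replacement, so that I can compute the mean of each replicate. I then want to compare the standard deviation of those means with the standard error of the mean (SEM).
However, I have not got the bootstrap part right. How can I fix it?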
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
from scipy import stats
# sep=r'\s+' splits on whitespace; it replaces delim_whitespace=True, which newer pandas deprecates
df = pd.read_csv('http://www.math.uah.edu/stat/data/Pearson.txt',
                 sep=r'\s+')
df.head()
y = df['Son'].values
Replications = np.random.choice(y, 1000, replace = True)
print("Replications: " , Replications)
print("")
Mean = np.mean(Replications)
print("Mean: " , Mean)
sem = stats.sem(y)
print("The SEM: ", sem)

Posted on 2016-12-05 21:14:13
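np.random.choice(y, 1000, replace = True) draws a single sample of 1000 values, not 1000 resamples. You can create 1000 replicates, each of length len(df), as follows: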
Replications = np.array([np.random.choice(df.Son, len(df), replace = True) for _ in range(1000)])
Mean = np.mean(Replications, axis=1)
print("Mean: ", Mean)

Thanks!
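
To finish the comparison the question asks for, the standard deviation of the 1000 replicate means can be set next to the SEM. The sketch below is only an illustration under stated assumptions: it uses a synthetic normal sample as a stand-in for df['Son'].values (the mean, spread, and sample size are arbitrary, not taken from the Pearson data), and it swaps in NumPy's Generator API (np.random.default_rng) so that all replicates are drawn in one call. The names rng, n_reps, boot_means, and boot_se are made up for this example.

```python
import numpy as np

# Assumption: synthetic stand-in for df['Son'].values; the numbers are arbitrary
rng = np.random.default_rng(0)
y = rng.normal(loc=68.0, scale=3.0, size=1000)

n_reps = 1000

# Each row is one bootstrap replicate: len(y) draws from y, with replacement
replications = rng.choice(y, size=(n_reps, len(y)), replace=True)

# One mean per replicate (row)
boot_means = replications.mean(axis=1)

# Bootstrap estimate of the standard error: spread of the replicate means
boot_se = boot_means.std(ddof=1)

# Analytic SEM; same value as scipy.stats.sem(y) with its default ddof=1
sem = y.std(ddof=1) / np.sqrt(len(y))

print("Bootstrap SD of means:", boot_se)
print("SEM:", sem)
```

With the real data, y = df['Son'].values would replace the synthetic sample. The two printed values should be close but not identical, since the bootstrap figure carries resampling noise.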
https://stackoverflow.com/questions/40982819