Q: How to repeat a command in Python (bootstrap resampling)

Stack Overflow user

Asked on 2020-10-28 22:00:21

Answers 1, Views 262, Followers 0, Votes 0

I have a DataFrame (4 data points long) and want to bootstrap it X times.

DataFrame example:

I came up with this code for the bootstrap resampling:

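      # Assumption: resample() is sklearn.utils.resample; the question does not show the import
      from sklearn.utils import resample

      boot = resample(df, replace=True, n_samples=len(df), random_state=1)
      print('Bootstrap Sample: %s' % boot)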

But now I want to repeat this X times. How can I do that?

Desired output for X = 20:

  Sample Nr.    Index A B
      1         0   1 2
                1   1 2
                2   1 2
                3   1 2 
     ...
      20        0   1 2
                1   1 2
                1   1 2
                2   1 2

Thank you.

Best regards

python

dataframe

resampling

statistics-bootstrap

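1 Answer

Stack Overflow user

Answered on 2020-10-28 22:52:30

Method 1: Sampling the data in parallel

Calling the DataFrame's sample method n times can be time-consuming, so consider running the calls in parallel.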

import multiprocessing
from itertools import repeat

def sample_data(df, replace, random_state):
    '''Generate one bootstrap sample of size len(df).'''
    return df.sample(replace=replace, n=len(df), random_state=random_state)

def resample_data(df, replace, n_samples, random_state):
    '''Call sample_data n_samples times in parallel worker processes.'''

    # One argument tuple per sample; every call receives the same random_state
    args = zip(repeat(df, n_samples), repeat(replace), repeat(random_state))

    # The context manager shuts the pool down once starmap has returned
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        bootstrap_samples = pool.starmap(sample_data, args)

    return bootstrap_samples

Now, to generate 15 samples, resample_data returns a list of 15 samples drawn from df.

samples = resample_data(df, True, n_samples=15, random_state=1)

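Note that with a fixed integer random_state every call draws the same rows, so all samples come out identical. To get different samples, set random_state to None. Be aware that forked worker processes can inherit the same NumPy global random state, so samples may still repeat across workers; giving each sample its own integer seed avoids this.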
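One way to keep a run reproducible while still getting distinct samples is to derive a separate integer seed for each sample from a single parent seed. The following is a minimal sketch under that assumption, not part of the original answer: the function name bootstrap_samples, its seed parameter, and the 4-row example frame are all hypothetical.

```python
import numpy as np
import pandas as pd

def bootstrap_samples(df, n_samples, seed=None):
    """Return n_samples bootstrap samples of df, each drawn with its own seed."""
    # A single parent generator yields one integer seed per sample, so a fixed
    # `seed` reproduces the whole run while the individual samples still differ.
    rng = np.random.default_rng(seed)
    seeds = rng.integers(0, 2**32 - 1, size=n_samples).tolist()
    return [df.sample(n=len(df), replace=True, random_state=s) for s in seeds]

# Hypothetical 4-row frame standing in for the data in the question
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})
samples = bootstrap_samples(df, n_samples=20, seed=1)
```

The same list of seeds could also be zipped into the starmap arguments in place of repeat(random_state), so that each parallel call to sample_data receives its own seed.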

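Method 2: Sampling the data sequentially

Another way to sample the data is with a list comprehension. Since sample_data is already defined, it can be called directly inside the comprehension.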

def resample_data_linearly(df, replace, n_samples, random_state):
    
    return [sample_data(df, replace, random_state) for _ in range(n_samples)] 

# Generate 10 samples of size len(df)
samples = resample_data_linearly(df, True, n_samples=10, random_state=1)
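Both methods return a plain list of DataFrames, whereas the question asks for a single table keyed by sample number. A possible way to get that layout, sketched here with a hypothetical 4-row frame and with each draw seeded by its sample number (an assumption, not part of the original answer), is pd.concat with keys and names:

```python
import pandas as pd

# Hypothetical 4-row frame standing in for the data in the question
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Draw 20 bootstrap samples; seeding each draw with its sample number keeps
# the run reproducible without making the samples identical
samples = [df.sample(n=len(df), replace=True, random_state=i) for i in range(1, 21)]

# keys= labels each sample 1..20; names= titles the two index levels
combined = pd.concat(
    samples,
    keys=list(range(1, len(samples) + 1)),
    names=["Sample Nr.", "Index"],
)
print(combined)
```

A single sample can then be pulled back out by its number, for example combined.loc[3].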

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/64574442
