Python program fails during parallel processing

Stack Overflow user
Asked 2021-02-09 21:23:56
Answers: 1, Views: 44, Followers: 0, Votes: 0
def bagging_and_trees_growth(samples, network, tree_num):
    trees = []
    
    for i in range(tree_num):
        bootstrap_samples = bagging(samples)
        a_tree = tree_growth(network, bootstrap_samples)
        trees.append(a_tree)
        
    return trees
    
def agiled_random_forest(samples, network, size, processes=39):
   
    rforest = []
        
    #job_server = pp.Server(processes=processes)
    threadPool = ThreadPool(processes=processes)
     
    depfun = (find_best_split, stopping_condition, purity_gain, Gini_index, find_neighbors, tree_growth, bagging)
    dep_modules = ('networkx', 'numpy', 'math', 'random', 'sys', 'pNGF')
    
    
    tree_num_of_each_task = int(size/processes)
    
    #jobs = [pp_server.submit(bagging_and_trees_growth, (samples, network, tree_num_of_each_task), depfun, 'dep_modules) for x in range(processes)]
   
    jobs = [threadPool.apply_async(bagging_and_trees_growth, (samples, network, tree_num_of_each_task), depfun, dep_modules) for x in range(processes)]
    
    
    for job in jobs:
        rforest += job.get()
    
    threadPool.destroy()
    return rforest

It fails with an error about a mapping and a tuple:

TypeError: bagging_and_trees_growth() argument after ** must be a mapping, not tuple

How can I resolve this error, given that the pp module does not work in Python 3?


1 Answer

Stack Overflow user

Answered 2021-02-09 21:36:37

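You might be looking for something like this.

The idea here is that bagging_and_trees_growth no longer has an internal job loop; we rely on the thread pool (or, preferably, a process pool because of the GIL, but that's up to you) to distribute the work efficiently.

Since the order in which the jobs execute evidently makes no difference here, imap_unordered will be the fastest high-level construct. One could also use apply_async, but that takes more work.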

import itertools
import multiprocessing.pool


def bagging_and_trees_growth(job):
    samples, network = job  # unpack the job tuple
    bootstrap_samples = bagging(samples)
    a_tree = tree_growth(network, bootstrap_samples)
    return a_tree


def agiled_random_forest(samples, network, size, processes=39):
    rforest = []
    with multiprocessing.pool.ThreadPool(processes=processes) as pool:
        # to use imap_unordered (the fastest high-level pool operation),
        # we need to pack each job into an object; since all we need here is 2 parameters, let's use a tuple.
        # set up a generator to generate the same job size times
        job_gen = itertools.repeat((samples, network), size)
        # do the work in parallel
        for result in pool.imap_unordered(bagging_and_trees_growth, job_gen):
            # could do something else with the result here;
            # in fact this could all just be `rforest = list(pool.imap...)`
            # in the simple case
            rforest.append(result)
    return rforest
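
For comparison, here is a minimal sketch of the apply_async variant mentioned above. It also shows where the TypeError in the question most likely comes from: the signature is apply_async(func, args=(), kwds={}, callback=None, ...), so passing the depfun tuple as the third positional argument makes the pool treat it as the keyword-argument mapping. The bagging and tree_growth bodies below are hypothetical stand-ins so the example runs on its own, and grow_one_tree and random_forest_apply_async are illustrative names, not part of the original code.

```python
import multiprocessing.pool
import random


def bagging(samples):
    # Hypothetical stand-in: draw a bootstrap sample with replacement.
    return [random.choice(samples) for _ in samples]


def tree_growth(network, bootstrap_samples):
    # Hypothetical stand-in: a "tree" is just a summary tuple here.
    return (network, sorted(bootstrap_samples))


def grow_one_tree(samples, network):
    # One unit of work: bootstrap the samples, then grow a single tree.
    return tree_growth(network, bagging(samples))


def random_forest_apply_async(samples, network, size, processes=4):
    with multiprocessing.pool.ThreadPool(processes=processes) as pool:
        # The second argument is the tuple of positional arguments for the
        # worker function. Nothing else is passed: pp-style dependency
        # tuples have no equivalent here, and a third positional argument
        # would be interpreted as the keyword-argument mapping.
        jobs = [
            pool.apply_async(grow_one_tree, (samples, network))
            for _ in range(size)
        ]
        # get() blocks until the job finishes and re-raises worker errors.
        return [job.get() for job in jobs]


if __name__ == "__main__":
    forest = random_forest_apply_async([1, 2, 3, 4], "toy-network", size=10)
    print(len(forest))
```

The with block also takes care of shutting the pool down, so no explicit cleanup call is needed; ThreadPool offers close(), terminate() and join(), but no destroy() method as in pp's job server.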
Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/66119891
