I get an error when trying to run a command in parallel using joblib/multiprocessing.
Here is the traceback:
Process PoolWorker-263:
Traceback (most recent call last):
  File "/home/marcel/anaconda/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/marcel/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/marcel/anaconda/lib/python2.7/multiprocessing/pool.py", line 102, in worker
    task = get()
  File "/home/marcel/.local/lib/python2.7/site-packages/joblib/pool.py", line 363, in get
  File "_objects.pyx", line 240, in h5py._objects.ObjectID.__cinit__ (h5py/_objects.c:2994)
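TypeError: __cinit__() takes exactly 1 positional argument (0 given)

As you can see from the error message, I am working with data loaded through h5py. To complicate things further, the routine I want to parallelize uses numba in one of its subroutines, but I hope that does not matter.
Here is a runnable example that you can copy and paste: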
from joblib import Parallel, delayed
import numpy as np
import h5py as h5
import os

def testfunc(h5data, row):
    # some very boneheaded CPU work
    data_slice = h5data[:,row,...]
    ma = np.mean(data_slice, axis = 1)
    x = row
    return ma, x

def run():
    data = np.random.random((100,100,100))
    print data
    f_out = h5.File('tmp.h5', 'w')
    dset = f_out.create_dataset('mydata', data = data )
    f_out.close()
    f_in = h5.File('tmp.h5', 'r')
    h5data = f_in['mydata']
    pool = Parallel(n_jobs=-1,verbose=1,pre_dispatch='all')
    results = pool(delayed(testfunc)(h5data, i) for i in range(h5data.shape[1]))
    f_in.close()
    os.remove('tmp.h5')

if __name__ == '__main__':
    run()

Any ideas what I am doing wrong?
EDIT: OK, at least I can take numba off the list of suspects...
Posted on 2017-07-31 20:09:23
You can try replacing `joblib` with pathos, which replaces `pickle` with `dill`. This usually solves the pickling problems.
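A different workaround that may be worth trying, independent of which pickler is used: the traceback suggests the worker fails while rebuilding an h5py object that was sent to it, so one option is to avoid sending h5py objects at all. The sketch below passes only the file path, the dataset name, and the row index, and opens the file inside each worker. The names `mean_of_row` and `run_by_path`, the temporary directory, and the smaller array shape are hypothetical choices made for illustration. They are not from the question, and this is not confirmed as the fix for the asker's exact setup.

```python
import os
import tempfile

import h5py
import numpy as np
from joblib import Parallel, delayed


def mean_of_row(path, dset_name, row):
    # Hypothetical replacement for testfunc: open the file inside the worker,
    # so only a string path, a dataset name and an int are pickled.
    with h5py.File(path, 'r') as f:
        data_slice = f[dset_name][:, row, ...]  # read the slice into a NumPy array
    return np.mean(data_slice, axis=1), row


def run_by_path(shape=(20, 10, 5), n_jobs=2):
    # Small array so the sketch runs quickly; the question used (100, 100, 100).
    data = np.random.random(shape)
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, 'tmp.h5')
    with h5py.File(path, 'w') as f_out:
        f_out.create_dataset('mydata', data=data)
    try:
        results = Parallel(n_jobs=n_jobs)(
            delayed(mean_of_row)(path, 'mydata', i) for i in range(shape[1])
        )
    finally:
        # Clean up the temporary file and directory even if a worker fails.
        os.remove(path)
        os.rmdir(tmpdir)
    return data, results


if __name__ == '__main__':
    data, results = run_by_path()
    print(len(results))
```

Each worker reopens the file read-only, which costs a little per task. For many small tasks it may be cheaper to hand each worker a batch of rows instead of a single row.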
https://stackoverflow.com/questions/31919984