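I'm trying to refactor some code for fitting parametric models that are defined symbolically using Theano. My goal is for all models to expose a common interface so that, as far as possible, they can be drop-in replacements for one another. To that end, I've tried to encapsulate each model in a separate class. A crucial requirement is that I can parallelize fitting the same model to multiple datasets using multiprocessing (I'm using the joblib wrapper for this).
Here is a runnable example of what I'm doing at the moment: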
import numpy as np
import theano
from theano import tensor as te
from theano.gradient import jacobian, hessian
from scipy.optimize import minimize
from joblib import Parallel, delayed
class Rosenbrock(object):
    """The Rosenbrock function: f(x, y) = (a - x)^2 + b(y - x^2)^2 """

    # symbolic variables - only used internally
    _P = te.dvector('P')
    _a, _b = _P[0], _P[1]
    _xy = te.dmatrix('xy')
    _x, _y = _xy[0], _xy[1]
    _z = te.dvector('z')
    _z_hat = (_a - _x) ** 2 + _b * (_y - _x ** 2) ** 2
    _diff = _z - _z_hat
    _loss = 0.5 * te.dot(_diff, _diff)
    _jac = jacobian(_loss, _P)
    _hess = hessian(_loss, _P)

    # theano functions - part of the interface
    forward = theano.function([_P, _xy], _z_hat)
    loss = theano.function([_P, _xy, _z], _loss)
    jacobian = theano.function([_P, _xy, _z], _jac)
    hessian = theano.function([_P, _xy, _z], _hess)

    @staticmethod
    def initialize(xy, z):
        """
        make some sensible estimate of what the initial parameters should be,
        based on xy and z
        """
        P0 = xy[:, np.argmin(z)]
        return P0

    @staticmethod
    def _postfit(P):
        """
        sometimes I want to make some adjustments to the parameters
        post-fitting, e.g. wrapping angles between 0 and 2pi
        """
        return P
def do_fit(model, *args):
    """
    wrapper function that performs the fitting
    """
    # initialize the model
    P0 = model.initialize(*args)
    # do the fit
    res = minimize(model.loss, P0, args=args, method='Newton-CG',
                   jac=model.jacobian, hess=model.hessian)
    P = res.x
    # tweak the parameters
    P = model._postfit(P)
    # return the tweaked parameters
    return P
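def run(niter=2000):
    # I don't actually need to instantiate this, since everything is
    # effectively a class method...
    model = Rosenbrock()
    # some example data
    xy = np.mgrid[-3:3:100j, -3:3:100j].reshape(2, -1)
    P = np.r_[1., 100.]
    z = model.forward(P, xy)
    # run multiple fits in parallel
    pool = Parallel(n_jobs=-1, verbose=1, pre_dispatch='all')
    # range() works on both Python 2 and Python 3
    results = pool(delayed(do_fit)(model, xy, z) for _ in range(niter))
    return results

if __name__ == '__main__':
    run()

The core functions forward(), loss(), jacobian() and hessian() behave like static methods of the class. I've found that, in order to parallelize the fitting, the Theano functions must be attributes of the class rather than of its instances. Otherwise (i.e. if I define these functions inside the class's __init__() method), when I try to call them in parallel using multiprocessing the workers block one another, so that effectively only a single core is used. Presumably this is because the GIL is no longer being circumvented, although I don't really understand why that should be the case.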
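For reference, the two placements differ in when, and how often, the compilation step runs. The following is only a minimal sketch in plain Python: `fake_compile` is a hypothetical stand-in for theano.function() that merely records each call, and the class names are made up for illustration. It does not by itself explain the blocking described above.

```python
compile_calls = []


def fake_compile(owner, name):
    """Hypothetical stand-in for theano.function(); it only records the call."""
    compile_calls.append((owner, name))
    return lambda *args: name


class ClassLevel(object):
    # Class-body statements run once, when the class statement executes,
    # i.e. when the module is imported or reloaded.
    forward = fake_compile('ClassLevel', 'forward')


class InstanceLevel(object):
    def __init__(self):
        # This runs again for every new instance.
        self.forward = fake_compile('InstanceLevel', 'forward')


# Only the class-level call has happened so far, before any instance exists.
calls_before_instances = list(compile_calls)

ClassLevel()
ClassLevel()
InstanceLevel()
InstanceLevel()

# Instantiating ClassLevel added nothing; each InstanceLevel added one call.
print(calls_before_instances)
print(compile_calls)
```

With class-level definitions the cost is paid once per import, regardless of how many instances exist; with __init__-level definitions it is paid once per instance.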
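Declaring the Theano functions as class attributes has two rather undesirable consequences:
1. The class namespace gets cluttered with the intermediate Theano symbolic variables (_a, _b, etc.), which aren't really needed after theano.function() has been called. To hide them from the user I have to prefix the variable names with underscores, which makes the code harder to read.
2. theano.function() triggers the automatic generation and compilation of C code, which is a slow process. It would be most convenient to define all of my models in the same source file, but that means that whenever I import or reload that file I have to wait for all of the models to recompile.

Can anyone (especially someone with experience in Theano) suggest a better way to structure this code?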
Posted on 2017-09-10 23:51:30
Here is one detail that may help you.
Consider this variant:
    @staticmethod
    def initialize(xy, z):
        def helper(xy, z):
            P0 = xy[:, np.argmin(z)]
            return P0
        return helper(xy, z)

The nested function keeps P0 out of the class namespace.
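The same scoping idea could, in principle, be applied to the symbolic variables themselves: build everything inside a function and keep only the resulting callables. The sketch below is an assumption-laden illustration, not a tested Theano solution. `build_rosenbrock`, `LazyRosenbrock` and `build_count` are hypothetical names, plain Python arithmetic on scalars stands in for the symbolic graph and the theano.function() calls, and whether this layout still parallelizes under joblib has not been verified here.

```python
build_count = 0  # counts how often the (stand-in) compilation runs


def build_rosenbrock():
    """Hypothetical builder. In the real model, the Theano symbolic variables
    and the theano.function() calls would live here."""
    global build_count
    build_count += 1

    # Local names are invisible outside this function, so they need no
    # underscore prefix and never reach the class namespace.
    half = 0.5

    def forward(P, x, y):
        a, b = P
        return (a - x) ** 2 + b * (y - x ** 2) ** 2

    def loss(P, x, y, z):
        diff = z - forward(P, x, y)
        return half * diff * diff

    return {'forward': forward, 'loss': loss}


class LazyRosenbrock(object):
    _functions = None  # class-level cache, filled on first use

    @classmethod
    def _get(cls, name):
        # Build on first access only, so importing the module stays cheap.
        if cls._functions is None:
            cls._functions = build_rosenbrock()
        return cls._functions[name]

    @classmethod
    def forward(cls, P, x, y):
        return cls._get('forward')(P, x, y)

    @classmethod
    def loss(cls, P, x, y, z):
        return cls._get('loss')(P, x, y, z)


# Nothing has been built at definition time.
built_at_definition = build_count

value = LazyRosenbrock.forward((1.0, 100.0), 0.0, 1.0)
residual = LazyRosenbrock.loss((1.0, 100.0), 0.0, 1.0, value)
print(value, residual, build_count)
```

The builder runs once, on the first call, and the result is cached on the class. That would address both the namespace clutter and the import-time cost, at the price of a slow first call.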
https://codereview.stackexchange.com/questions/60460