我正在测试numba能在多大程度上帮助加速一个经典的大都会算法,例如一个盒子里的硬盘:
def Metropolis2D(L, d_ex, n_steps, delta):
for steps in range(n_steps):
a = rm.choice(L)
b = [a[0] + rm.uniform(-delta, delta), a[1] + rm.uniform(-delta, delta)]
min_dist = min((b[0] - c[0]) ** 2 + (b[1] - c[1]) ** 2 for c in L if c != a)
box_cond = min(b[0], b[1]) <= d_ex or max(b[0], b[1]) >= 1.0 - d_ex
if not (box_cond or min_dist < 4.0 * d_ex ** 2):
a[:] = b
return L
Metropolis2D_jit = njit()(Metropolis2D)例如,让我们考虑以下设置:
ell = 4
N = ell**2 # number of disks
eta = 0.65 # disk density
d_ex = math.sqrt(eta / (math.pi * N)) # radius of the N disks at density eta
del_xy = 1 / (2 * ell) # helf lattice spacing for the initial configuration
L = [[del_xy + i * 2 * del_xy, del_xy + j * 2 * del_xy] for i in range(ell) for j in range(ell)]
n_steps = 1
delta = 2 * (del_xy - d_ex) # maximal displacement along x or y我们是从所有磁盘在方格上对齐,在单位方格上相对较高的密度(0.65)开始。那么,在非jitted大都市函数中的单个MC移动(即n_steps = 1)需要:
%timeit Metropolis2D_jit(L, d_ex, n_steps, delta)
7.01 µs ± 11.4 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)但是,在运行Metropolis2D_jit时,我会得到以下错误:
UnsupportedError: Failed in nopython mode pipeline (step: inline calls to locally defined closures)
The use of yield in a closure is unsupported.
File "../../../var/folders/q9/jzmgl38d1h5gnmkllt_ly8rr0000gn/T/ipykernel_28212/2383793983.py", line 6:
<source missing, REPL/exec in use?>你对如何绕过这个问题并利用numba提速有什么想法吗?
事实上,一种可能的方法是编写一个部分抛出的函数,其中磁盘与磁盘之间的距离在numba中被抛出。但是,我很好奇是否有一个完整的jit函数Metropolis2D的解决方案。
发布于 2022-07-07 13:43:27
谢谢@Jér me Richard。你说得对:这与约束检查无关(我正在修改这个问题)。我通过循环修正了min,并纠正了由于采用rm.choice()而引起的一些问题。这是一个尝试过的解决方案,现在在numba中编译,没有错误:
import numba as nb
import random as rm
@nb.njit()
def dist_sq_jit(b, c):
return (b[0] - c[0]) ** 2 + (b[1] - c[1]) ** 2
@nb.njit()
def Metropolis2D_jit(L, d_ex, n_steps, delta):
for steps in range(n_steps):
ran_ind = np.random.randint(0, len(L))
a = L[ran_ind]
b = [a[0] + rm.uniform(-delta, delta), a[1] + rm.uniform(-delta, delta)]
L_sliced = np.concatenate((L[:ran_ind], L[ran_ind + 1:]), axis = 0)
min_dist = np.zeros(len(L))
i = -1
for c in L_sliced:
i += 1
min_dist[i] = dist_sq_jit(b, c)
if (min(b) >= d_ex and max(b) <= 1.0 - d_ex and min(min_dist) >= 4.0 * d_ex ** 2):
L[ran_ind] = b
return L计算时间(与上面相同的参数设置,只更改:l现在是numpy数组)大约减少了5倍:
L_vec = np.array(L)
%timeit Metropolis2D_jit(L_vec, d_ex, n_steps, delta)
1.39 µs ± 30.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)也许还有进一步改进的余地?
https://stackoverflow.com/questions/72890593
复制相似问题