I have this serial code that I'm trying to convert to parallel code using MPI. However, I can't seem to get the MPI_Scatter() function to work without crashing. The function loops over an array called cells and modifies some of its values.

Here is the original serial code:
int accelerate_flow(const t_param params, t_speed* cells, int* obstacles)
{
    register int ii, jj;   /* generic counters */
    register float w1, w2; /* weighting factors */

    /* compute weighting factors */
    w1 = params.density * params.accel * oneover9;
    w2 = params.density * params.accel * oneover36;

    int i;

    /* modify the first column of the grid */
    jj = 0;
    for (ii = 0; ii < params.ny; ii++)
    {
        if (!obstacles[ii*params.nx] && (cells[ii*params.nx].speeds[3] > w1 &&
            cells[ii*params.nx].speeds[6] > w2 && cells[ii*params.nx].speeds[7] > w2))
        {
            /* increase 'east-side' densities */
            cells[ii*params.nx].speeds[1] += w1;
            cells[ii*params.nx].speeds[5] += w2;
            cells[ii*params.nx].speeds[8] += w2;
            /* decrease 'west-side' densities */
            cells[ii*params.nx].speeds[3] -= w1;
            cells[ii*params.nx].speeds[6] -= w2;
            cells[ii*params.nx].speeds[7] -= w2;
        }
    }
    return EXIT_SUCCESS;
}

And here is my attempt at using MPI:
int accelerate_flow(const t_param params, t_speed* cells, int* obstacles, int myrank, int ntasks)
{
    register int ii, jj = 0;  /* generic counters */
    register float w1, w2;    /* weighting factors */
    int recvSize;
    int cellsSendTag = 123, cellsRecvTag = 321;
    int size = params.ny / ntasks, i;
    MPI_Request *cellsSend, *cellsRecieve;
    MPI_Status *status;

    /* compute weighting factors */
    w1 = params.density * params.accel * oneover9;
    w2 = params.density * params.accel * oneover36;

    t_speed* recvCells = (t_speed*)malloc(size * sizeof(t_speed) * params.nx);
    MPI_Scatter(cells, sizeof(t_speed)*params.nx*params.ny, MPI_BYTE, recvCells,
                size*sizeof(t_speed)*params.nx, MPI_BYTE, 0, MPI_COMM_WORLD);
    for (ii = 0; ii < size; ii++)
    {
        if (!obstacles[ii*params.nx] && (recvCells[ii*params.nx].speeds[3] > w1 &&
            recvCells[ii*params.nx].speeds[6] > w2 && recvCells[ii*params.nx].speeds[7] > w2))
        {
            /* increase 'east-side' densities */
            recvCells[ii*params.nx].speeds[1] += w1;
            recvCells[ii*params.nx].speeds[5] += w2;
            recvCells[ii*params.nx].speeds[8] += w2;
            /* decrease 'west-side' densities */
            recvCells[ii*params.nx].speeds[3] -= w1;
            recvCells[ii*params.nx].speeds[6] -= w2;
            recvCells[ii*params.nx].speeds[7] -= w2;
        }
    }
    MPI_Gather(recvCells, size*sizeof(t_speed)*params.nx, MPI_BYTE, cells,
               params.ny*sizeof(t_speed)*params.nx, MPI_BYTE, 0, MPI_COMM_WORLD);
    return EXIT_SUCCESS;
}

Here is the t_speed struct:
typedef struct {
    float speeds[NSPEEDS];
} t_speed;

params.nx = 300, params.ny = 200.
Any help would be greatly appreciated. Thanks.

Posted on 2012-08-26 22:01:39
The first count argument of MPI_Scatter is the number of elements to send to each process, not the total. Here the send count and the receive count will be the same, namely nx*ny/ntasks; so you would have something like

int count = params.nx * params.ny / ntasks;

MPI_Scatter(cells,     sizeof(t_speed)*count, MPI_BYTE,
            recvCells, sizeof(t_speed)*count, MPI_BYTE, 0, MPI_COMM_WORLD);

Note that this only works when ntasks evenly divides nx*ny; otherwise you would have to use Scatterv.
https://stackoverflow.com/questions/12130329