I have a list (mylist) of 80 5-D zarr files with the following structure (T, F, B, Az, El). The array has shape 24x4096x2016x24x8.
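For a sense of scale, here is a rough element count for one array of that shape. This is only a sketch: the element sizes tried below are assumptions, since the dtype is not stated, and compression on disk is ignored.

```python
from math import prod

# Shape quoted above, in (T, F, B, Az, El) order.
shape = (24, 4096, 2016, 24, 8)
n_elements = prod(shape)
print(f"{n_elements:,} elements per array")

# Assumed element sizes in bytes; the real dtype is not given here.
for itemsize in (1, 4, 8):
    gigabytes = n_elements * itemsize / 1e9
    print(f"{itemsize} byte(s) per element: about {gigabytes:.1f} GB uncompressed")
```

If both `master` and `counter` have this shape in each of the 80 files, the totals scale accordingly, so every full pass over the data is expensive.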
I would like to extract a slice of the data and compute a probability along some of the axes using the following function:

def GetPolarData(mylist, freq, FreqLo, FreqHi):
    '''
    This function will take the list of zarr files (T, F, B, Az, El), open them, use the selected frequency to return an array
    of files with Azimuth and Elevation probabilities
    '''
    ChanIndx = FreqCut(FreqLo, FreqHi, freq)
    if len(ChanIndx) != 0:
        MyData = []
        for i in range(len(mylist)):
            print('Adding file {} : {}'.format(i, mylist[i][32:]))
            try:
                zarrf = xr.open_zarr(mylist[i], group='arr')
                m = zarrf.master.sum(dim=['time', 'baseline'])
                m = m[ChanIndx].sum(dim=['frequency'])
                c = zarrf.counter.sum(dim=['time', 'baseline'])
                c = c[ChanIndx].sum(dim=['frequency'])
                p = m.astype(float) / c.astype(float)
                MyData.append(p)
            except Exception as e:
                print(e)
                continue
    else:
        print("Something went wrong in Frequency selection")
    print("##########################################")
    print("This will be contribution to selected band")
    print("##########################################")
    print(f"Min {np.nanmin(MyData)*100:.3f}% ")
    print(f"Max {np.nanmax(MyData)*100:.3f}% ")
    print(f"Average {np.nanmean(MyData)*100:.3f}% ")
    return(MyData)

If I call the function as follows,

FreqLo = 470.
FreqHi = 854.
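MyTVData = np.array(GetPolarData(AllZarrList, Freq, FreqLo, FreqHi))

I find that it takes a very long time to run (more than 3 hours) on a machine with 40 cores and 256 GB of RAM.
Is there a way to make this run faster?
Thank you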
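
One detail that may matter here: the function sums over time and baseline for all 4096 channels and only then indexes with ChanIndx. The sketch below uses small synthetic in-memory data, not the real files, to check that selecting the channels first and reducing once gives the same probabilities. The dimension names `azimuth` and `elevation`, the array sizes, and the `chan_indx` values are assumptions made for illustration; `chan_indx` stands in for the output of FreqCut.

```python
import numpy as np
import xarray as xr

# Small synthetic stand-ins for one file's `master` and `counter` variables.
# "azimuth" and "elevation" are assumed dimension names.
rng = np.random.default_rng(0)
dims = ("time", "frequency", "baseline", "azimuth", "elevation")
shape = (3, 16, 5, 4, 2)
master = xr.DataArray(rng.integers(0, 5, size=shape), dims=dims)
counter = xr.DataArray(rng.integers(5, 10, size=shape), dims=dims)

# Hypothetical channel indices, standing in for FreqCut(FreqLo, FreqHi, freq).
chan_indx = np.array([2, 3, 4, 9])


def reduce_then_select(arr, idx):
    """Order used in the question: sum over time/baseline, then pick channels."""
    out = arr.sum(dim=["time", "baseline"])
    return out[idx].sum(dim=["frequency"])


def select_then_reduce(arr, idx):
    """Pick the channels first, then sum all three dimensions in one reduction."""
    return arr.isel(frequency=idx).sum(dim=["time", "frequency", "baseline"])


p_old = (reduce_then_select(master, chan_indx).astype(float)
         / reduce_then_select(counter, chan_indx).astype(float))
p_new = (select_then_reduce(master, chan_indx).astype(float)
         / select_then_reduce(counter, chan_indx).astype(float))

# Both orderings should agree, leaving only the azimuth and elevation axes.
print(p_new.dims, bool(np.array_equal(p_old.values, p_new.values)))
```

Whether this ordering saves time on the real data likely depends on how the zarr arrays are chunked along frequency, so it is worth timing on a single file first. Separately, if the opened arrays are dask-backed, each of the `np.nanmin`, `np.nanmax` and `np.nanmean` calls and the final `np.array(...)` may re-evaluate every file; calling `p.compute()` before `MyData.append(p)` would evaluate each file once.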
https://stackoverflow.com/questions/65826772