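I am trying to optimize some Matlab code that performs statistical calculations on a large data set (1e6 values). I have tried several approaches: explicit loops, the "fun" functions (cellfun, arrayfun), diff, and plain arithmetic. Basically, I need to compute the peak-to-peak value and the standard deviation of a set of data.
I cannot get it to run in under 24 seconds. Is there a way to improve this code without using additional toolboxes?
Here is what I have tried so far: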
clear
close
myData = rand(1e5, 1)/5e6;
M = 1000;
N = length(myData)-M;
PkPk = NaN(M, 1);
Std = NaN(M, 1);
myMat = NaN (1, N);
%%%%%%%%%%%%%%%%%%%%%%%%%% peak2peak is part of Signal Processing Toolbox:
%%%%%%%%%%%%%%%%%%%%%%%%%% can use max()-min()
tic
for x = 1 : M
myMat = diff( (reshape(myData(1:x*floor(N/x)),x,floor(N/x)))') ;
PkPk (x) = peak2peak(myMat(:)) ;
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat));
end
Time1 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
myMat = bsxfun(@minus, myData(x+1 : x+N) , myData(1:N)) '; % EDIT HERE: transpose
PkPk (x) = peak2peak(myMat(:)) ; % max - min
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
Time2 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
myMat = myData(x+1 : x+N) - myData(1:N);%
PkPk (x) = peak2peak(myMat(:)) ; % max - min
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
Time3 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
Std(x) = std( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) ) ;
PkPk(x) = peak2peak( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) );
end
Time4 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : M
PkPk (x) = peak2peak( myData(x+1 : x+N) - myData(1:N)) ;
Std(x) = std( myData(x+1 : x+N) - myData(1:N)) ;
end
Time5 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
PkPk = (cellfun(@(x) peak2peak( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) ) , num2cell(1:M) ));
Std = (cellfun(@(x) std( reshape( diff(reshape( myData(1:x*floor(N/x)) , x ,floor(N/x))'), floor(N/x)' * x -x, 1 ) ) , num2cell(1:M) ));
Time6 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
PkPk = cellfun( @(x) peak2peak( myData(x+1 : x+N) - myData(1:N) ) , num2cell(1:M) ) ;
Std = cellfun( @(x) std( myData(x+1 : x+N) - myData(1:N) ) , num2cell(1:M) ) ;
Time7 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
Std = cellfun( @(x) std( myData(x+1 : x+N) - myData(1:N)), num2cell(1:M) ) ;
PkPk = cellfun( @(x) max( myData(x+1 : x+N) - myData(1:N)) - min( myData(x+1 : x+N) - myData(1:N)) , num2cell(1:M) );
Time8 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
Std = arrayfun( @(x) std( myData(x+1 : x+N) - myData(1:N)), (1:M) ) ;
PkPk = arrayfun( @(x) peak2peak( myData(x+1 : x+N) - myData(1:N)) , (1:M) );
Time9 =toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%
Here are my timing results (in seconds):
Time1: 24.47
Time2: 23.56
Time3: 25.20
Time4: 45.44
Time5: 42.99
Time6: 46.27
Time7: 43.62
Time8: 62.49
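Time9: 41.69
Thanks!
Answered 2014-04-15 11:43:58
I took your second solution (the fastest in your benchmark) and made a few modifications.
You can get a performance improvement if you stop indexing myData(1:N) on every loop iteration and instead assign it to an array once, before the loop, like this: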
tic
myData1toN = myData(1:N);
for x = 1 : M
myMat = bsxfun(@minus, myData(x+1 : x+N) , myData1toN);
PkPk (x) = peak2peak(myMat(:)) ; % max - min
Std(x) = sqrt(sum(sum((myMat-mean(myMat(:))).^2))/numel(myMat)); % std
end
clear myData1toN;
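Time2 = toc
Time before:
Time2: 20.5618
Time after:
Time2: 14.2260
Another modification: sum(sum(...)) can be changed to sum(...), because the outer sum is just the sum of a single value.
Time after:
Time2: 11.6573
By the way, N could replace numel(myMat), but I did not notice any performance improvement from that.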
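Putting the suggestions above together, the loop might look roughly like the sketch below. It is only an illustration under a few assumptions that are not in the posted code: the sizes are reduced so it runs quickly, peak2peak is replaced with max - min (as the comment in the question suggests) so that no toolbox is needed, and plain subtraction is used instead of bsxfun because both operands are N-by-1 column vectors. The variable names d and elapsed are made up for this example, and no timing is claimed for it.

```matlab
% Sketch only: reduced sizes so it runs quickly. The question itself
% uses rand(1e5, 1)/5e6 and M = 1000.
myData = rand(1e4, 1) / 5e6;
M = 50;
N = length(myData) - M;

PkPk = NaN(M, 1);
Std  = NaN(M, 1);

myData1toN = myData(1:N);   % indexed once, outside the loop

tic
for x = 1 : M
    % Lag-x differences; both operands are N-by-1, so plain minus works.
    d = myData(x+1 : x+N) - myData1toN;
    PkPk(x) = max(d) - min(d);                   % toolbox-free peak-to-peak
    Std(x)  = sqrt(sum((d - mean(d)).^2) / N);   % single sum, divides by N
end
elapsed = toc;

% Sanity check: std(d, 1) also normalises by N, so it should agree
% with the hand-written formula for the last lag.
d = myData(M+1 : M+N) - myData1toN;
assert(abs(Std(M) - std(d, 1)) <= 1e-9 * std(d, 1));
```

Note that the hand-written formula divides by N, whereas std(d) with no second argument divides by N-1, so the std-based variants in the question return very slightly different values.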
https://stackoverflow.com/questions/23063050