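In this small function I prepare the indices for training a random forest. It returns the indices of a set of samples, together with the indices of a subset of features for those samples. I find the code a bit slow. Is there a better, faster way to do this?
Here is my code: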
function [idx_linhas, idx_features ] = prepararsementes(X,features,nseeds,treesize)
% This function prepares the indices for "seeding" a random forest
idx_linhas = nan(nseeds,treesize);
idx_features= nan(nseeds,features);
for idx=1:nseeds
[~,idx_linhas(idx,:)] = datasample(X,treesize,'Replace',true);
end
for idx=1:nseeds
[~,idx_features(idx,:)] = datasample(X,features,2);
end
idx_linhas = idx_linhas.';
end
Thanks in advance!
Posted on 2013-08-30 09:11:49
Try this:
function [idx_linhas, idx_features] = prepararsementes(X, features, nseeds, treesize)
% instead of loop, call datasample() only once, and reshape
% note that ('replace', true) is the default, so I omitted that
[~,idx] = datasample(X, nseeds*treesize);
idx_linhas = reshape(idx, nseeds, treesize).';
[~,idx] = datasample(X, nseeds*features, 2);
idx_features = reshape(idx, nseeds, features);
end
Statistically, I think the results should be the same, since you do not specify any weights and you draw with replacement in both cases.
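Because both draws are uniform and with replacement, indices of the same kind could probably also be generated with `randi` directly, skipping `datasample` altogether. The following is only a sketch under that assumption, not the code from the question: the sizes of `X`, `nseeds`, `treesize` and `features` are made-up example values, and `nrows` and `ncols` are helper names introduced here for illustration.

```matlab
% Hypothetical example data; these sizes are not from the question
X = rand(100, 8);
nseeds = 5;      % number of trees
treesize = 20;   % samples drawn per tree
features = 3;    % features drawn per tree

[nrows, ncols] = size(X);

% Uniform draws with replacement, one column of row indices per tree.
% This already has the treesize-by-nseeds shape, so no transpose is needed.
idx_linhas = randi(nrows, treesize, nseeds);

% Uniform draws with replacement, one row of column indices per tree
idx_features = randi(ncols, nseeds, features);
```

The output shapes match those of the function above: `idx_linhas` is `treesize`-by-`nseeds` and `idx_features` is `nseeds`-by-`features`. Note that, as in the original code, a feature can be drawn more than once for the same tree.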
https://stackoverflow.com/questions/18528720