I am training a Mahout classifier on my own data. These are the commands I ran to create the Mahout model:
./bin/mahout seqdirectory -i /tmp/mahout-work-root/MyData-all -o /tmp/mahout-work-root/MyData-seq
./bin/mahout seq2sparse -i /tmp/mahout-work-root/MyData-seq -o /tmp/mahout-work-root/MyData-vectors -lnorm -nv -wt tfidf
./bin/mahout split -i /tmp/mahout-work-root/MyData-vectors/tfidf-vectors --trainingOutput /tmp/mahout-work-root/MyData-train-vectors --testOutput /tmp/mahout-work-root/MyData-test-vectors --randomSelectionPct 40 --overwrite --sequenceFiles -xm sequential
./bin/mahout trainnb -i /tmp/mahout-work-root/Mydata-train-vectors -el -o /tmp/mahout-work-root/model -li /tmp/mahout-work-root/labelindex -ow
When I try to create the model with the trainnb command, I get the following exception:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.mahout.classifier.naivebayes.BayesUtils.writeLabelIndex(BayesUtils.java:119)
    at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.createLabelIndex(TrainNaiveBayesJob.java:152)
What could be the problem here?
Note: the original example mentioned here runs fine.
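Posted on 2013-01-19 15:55:19
I think this may be a problem with how your training files are laid out. The files should be organized as follows, with one subdirectory per class: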
MyData-All
 \classA
    -file1
    -file2
    -...
 \classB
    -filex
 ...
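As a quick way to check that layout before running seqdirectory, a small shell sketch could look like the one below. `check_layout` is a hypothetical helper name, not a Mahout command. It assumes a `find` that supports `-mindepth` and `-maxdepth` (GNU and BSD versions do), and it assumes that each immediate subdirectory of the input directory stands for one class.

```shell
#!/bin/sh
# Sketch only: check_layout is a hypothetical helper, not part of Mahout.
# It checks that the directory has one subdirectory per class and that
# no documents sit directly at the top level.
check_layout() {
    root=$1
    status=0
    classes=0

    # Files directly under the root do not belong to any class directory.
    stray=$(find "$root" -mindepth 1 -maxdepth 1 -type f | wc -l | tr -d ' ')
    if [ "$stray" -gt 0 ]; then
        echo "stray files at top level: $stray"
        status=1
    fi

    # Treat each immediate subdirectory as one class and count its files.
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        classes=$((classes + 1))
        count=$(find "$dir" -type f | wc -l | tr -d ' ')
        echo "class $(basename "$dir"): $count files"
        [ "$count" -gt 0 ] || status=1
    done

    # A directory with no class subdirectories at all is also a problem.
    if [ "$classes" -eq 0 ]; then
        echo "no class subdirectories found"
        status=1
    fi
    return $status
}
```

Calling `check_layout /tmp/mahout-work-root/MyData-all` prints one line per class directory and returns a non-zero status if there are stray top-level files, empty class directories, or no class directories at all.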
https://stackoverflow.com/questions/14151877