This is the TSV file, c2is2r3.tsv:
The O
fate O
of O
Lehman ORGANIZATION
Brothers ORGANIZATION
. . .
New ORGANIZATION
York ORGANIZATION
Fed ORGANIZATION
, O
and O
Treasury TITLE
Secretary TITLE
Henry PERSON
M. PERSON
Paulson PERSON
Jr. PERSON
. O
And this is c2is2r3.prop:
trainFile = c2is2r3.tsv
serializeTo = c2is2r3-ner-model.ser.gz
map = word=0,answer=1
useClassFeature=true
useWord=true
useNGrams=true
noMidNGrams=true
maxNGramLeng=6
usePrev=true
useNext=true
useSequences=true
usePrevSequences=true
maxLeft=1
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC
useDisjunctive=true
This is the initial sequence of commands:
java -cp stanford-ner-3.5.2.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop c2is2r3.prop
java -cp stanford-ner-3.5.2.jar -mx2g edu.stanford.nlp.ie.NERClassifierCombiner -ner.model c2is2r3-ner-model.ser.gz,classifiers/english.muc.7class.distsim.crf.ser.gz -ner.useSUTime false -ner.combinationMode HIGH_RECALL -serializeTo c2is2.serialized.ncc.ncc.ser.gz
java -cp stanford-ner-3.5.2.jar -mx1g edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier c2is2.serialized.ncc.ncc.ser.gz -textFile c2is2r3.txt
CRFClassifier invoked on Fri Jul 17 09:51:13 EDT 2015 with arguments:
-loadClassifier c2is2.serialized.ncc.ncc.ser.gz -textFile c2is2r3.txt
loadClassifier=c2is2.serialized.ncc.ncc.ser.gz
textFile=c2is2r3.txt
Loading classifier from /mnt/hgfs/share/nlp/stanford-ner-2015-04-20/c2is2.serialized.ncc.ncc.ser.gz ... Error deserializing /mnt/hgfs/share/nlp/stanford-ner-2015-04-20/c2is2.serialized.ncc.ncc.ser.gz
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: java.util.Properties cannot be cast to [Ledu.stanford.nlp.util.Index;
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1572)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1523)
at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:2987)
Caused by: java.lang.ClassCastException: java.util.Properties cannot be cast to [Ledu.stanford.nlp.util.Index;
at edu.stanford.nlp.ie.crf.CRFClassifier.loadClassifier(CRFClassifier.java:2613)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1451)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1558)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1569)
... 2 more
This is the attempt using NERClassifierCombiner:
java -cp stanford-ner-3.5.2.jar -mx1g edu.stanford.nlp.ie.NERClassifierCombiner -loadClassifier c2is2.serialized.ncc.ncc.ser.gz -testFile c2is2r3.txt
This is the error stack:
NERClassifierCombiner invoked on Fri Jul 17 10:11:17 EDT 2015 with arguments:
-loadClassifier c2is2.serialized.ncc.ncc.ser.gz -testFile c2is2r3.txt
testFile=c2is2r3.txt
loadClassifier=c2is2.serialized.ncc.ncc.ser.gz
testFile=c2is2r3.txt
ner.useSUTime=false
ner.model=c2is2r3-ner-model.ser.gz,classifiers/english.muc.7class.distsim.crf.ser.gz
serializeTo=c2is2.serialized.ncc.ncc.ser.gz
loadClassifier=c2is2.serialized.ncc.ncc.ser.gz
ner.combinationMode=HIGH_RECALL
loading CRF...
loading CRF...
Error on line 1: The fate of Lehman Brothers, the beleaguered investment bank, hung in the balance on Sunday as Federal Reserve officials and the leaders of major financial institutions continued to gather in emergency meetings trying to complete a plan to rescue the stricken bank. Several possible plans emerged from the talks, held at the Federal Reserve Bank of New York and led by Timothy R. Geithner, the president of the New York Fed, and Treasury Secretary Henry M. Paulson Jr.
Exception in thread "main" java.lang.UnsupportedOperationException: Argument array lengths differ: [word, tag, answer] vs. [The, fate, of, Lehman, Brothers,, the, beleaguered, investment, bank,, hung, in, the, balance, on, Sunday, as, Federal, Reserve, officials, and, the, leaders, of, major, financial, institutions, continued, to, gather, in, emergency, meetings, trying, to, complete, a, plan, to, rescue, the, stricken, bank., Several, possible, plans, emerged, from, the, talks,, held, at, the, Federal, Reserve, Bank, of, New, York, and, led, by, Timothy, R., Geithner,, the, president, of, the, New, York, Fed,, and, Treasury, Secretary, Henry, M., Paulson, Jr.]
at edu.stanford.nlp.ling.CoreLabel.initFromStrings(CoreLabel.java:153)
at edu.stanford.nlp.ling.CoreLabel.<init>(CoreLabel.java:133)
at edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter$ColumnDocParser.apply(ColumnDocumentReaderAndWriter.java:85)
at edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter$ColumnDocParser.apply(ColumnDocumentReaderAndWriter.java:60)
at edu.stanford.nlp.objectbank.DelimitRegExIterator.parseString(DelimitRegExIterator.java:67)
at edu.stanford.nlp.objectbank.DelimitRegExIterator.setNext(DelimitRegExIterator.java:60)
at edu.stanford.nlp.objectbank.DelimitRegExIterator.<init>(DelimitRegExIterator.java:54)
at edu.stanford.nlp.objectbank.DelimitRegExIterator$DelimitRegExIteratorFactory.getIterator(DelimitRegExIterator.java:122)
at edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter.getIterator(ColumnDocumentReaderAndWriter.java:54)
at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.setNextObject(ObjectBank.java:436)
at edu.stanford.nlp.objectbank.ObjectBank$OBIterator.<init>(ObjectBank.java:415)
at edu.stanford.nlp.objectbank.ObjectBank.iterator(ObjectBank.java:253)
at edu.stanford.nlp.sequences.ObjectBankWrapper.iterator(ObjectBankWrapper.java:52)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1160)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1111)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.classifyAndWriteAnswers(AbstractSequenceClassifier.java:1071)
at edu.stanford.nlp.ie.NERClassifierCombiner.main(NERClassifierCombiner.java:382)
So I don't know what to do next. Are there any other combinations I should try?
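Posted on 2015-07-16 22:19:23
In your serialization step, you are using:
edu.stanford.nlp.ie.NERClassifierCombiner
In your load step, you are loading with:
edu.stanford.nlp.ie.crf.CRFClassifier
So in the second of those commands (the load step), use edu.stanford.nlp.ie.NERClassifierCombiner instead, and the error should go away. You serialized an NERClassifierCombiner but tried to load it as a CRFClassifier. Let me know if you run into any other trouble!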
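Posted on 2016-03-17 11:19:23
The second file, c2is2r3.txt, needs to be converted to a TSV file first and then passed to the command.
You can simply tag every generated token with O (if you are unsure of the labels or want to save the time of tagging them by hand), and then test with your model.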
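The conversion described above could be sketched roughly as follows. This is only an illustration, not part of Stanford NER: the function names text_to_tsv and convert_file, the output file name, and the regex tokenizer are all assumptions made for this example. The tokenizer is deliberately crude and will not match Stanford's own tokenization (for instance, it splits "M." into "M" and "."). The stack trace in the question suggests the reader expected three columns (word, tag, answer), so the column map may also need to match the two-column output here; that is an inference from the error message, not something confirmed in this thread.

```python
import re

# Crude tokenizer (an assumption for illustration): runs of word characters,
# or single punctuation marks. It does not reproduce Stanford's tokenizer.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def text_to_tsv(text, label="O"):
    """Return 'token<TAB>label' lines, one token per line."""
    rows = [f"{token}\t{label}" for token in TOKEN_RE.findall(text)]
    return "\n".join(rows) + "\n" if rows else ""


def convert_file(src_path, dst_path, label="O"):
    """Read plain text from src_path and write the labelled TSV to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        tsv = text_to_tsv(src.read(), label)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(tsv)


# Example call (the output file name is hypothetical):
# convert_file("c2is2r3.txt", "c2is2r3-test.tsv")
```

Every token gets the placeholder label O, so any accuracy figures reported against this file are meaningless; the point is only to give the test reader input in the column format it expects.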
https://stackoverflow.com/questions/31460407