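I am training a new language with Tesseract OCR for my final-year project.
I created a word-dawg from my word list. However, the combine_tessdata output is the same whether or not I include the word-dawg and the word list, so I am not sure whether they actually end up in my traineddata.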
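The output is as follows:
Offset for type 0 is -1
Offset for type 1 is 140
Offset for type 2 is 3726
Offset for type 3 is 3904
Offset for type 4 is 346848
Offset for type 5 is 347329
Offset for type 6 is 347329
Offset for type 7 is -1
Offset for type 8 is -1
Offset for type 10 is -1
Offset for type 11 is -1
Offset for type 12 is 354078
Offset for type 14 is -1
Offset for type 15 is -1
Offset for type 16 is -1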
I believe offset 2 is for unicharambigs. Do you know which offset belongs to the word-dawg? And what about the rest of the offsets?
Posted on 2016-02-28 08:05:04
It may be a file name problem. Here is the output from my own training run; -1 means the file does not exist.
Combining tessdata files
Output vie.traineddata created sucessfully.
TessdataManager combined tesseract data files.
Offset for type 0 (vie.config ) is -1
Offset for type 1 (vie.unicharset ) is 140
Offset for type 2 (vie.unicharambigs ) is 15877
Offset for type 3 (vie.inttemp ) is 21397
Offset for type 4 (vie.pffmtable ) is 1466247
Offset for type 5 (vie.normproto ) is 1468147
Offset for type 6 (vie.punc-dawg ) is -1
Offset for type 7 (vie.word-dawg ) is 1513182
Offset for type 8 (vie.number-dawg ) is -1
Offset for type 9 (vie.freq-dawg ) is 1589568
Offset for type 10 (vie.fixed-length-dawgs ) is -1
Offset for type 11 (vie.cube-unicharset ) is -1
Offset for type 12 (vie.cube-word-dawg ) is -1
Offset for type 13 (vie.shapetable ) is 1594178
Offset for type 14 (vie.bigram-dawg ) is -1
Offset for type 15 (vie.unambig-dawg ) is -1
Offset for type 16 (vie.params-training-model ) is -1
https://stackoverflow.com/questions/35636303
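To check a log like the one above without reading it by eye, something along these lines could work. It is only a minimal sketch: it assumes the newer output format that prints the component file name in parentheses (it will not match the older "Offset for type N is ..." lines from the question), and the function names `parse_offsets` and `missing_components` are made up for illustration, not part of Tesseract.

```python
import re

# Matches lines such as "Offset for type 7 (vie.word-dawg ) is 1513182".
# Assumption: the log uses the format that includes the file name in parentheses.
LINE_RE = re.compile(r"Offset for type\s+(\d+)\s+\(\s*(\S+)\s*\)\s+is\s+(-?\d+)")


def parse_offsets(log_text):
    """Return {component file name: offset} for every offset line in the log."""
    return {m.group(2): int(m.group(3)) for m in LINE_RE.finditer(log_text)}


def missing_components(offsets):
    """Return the components reported with offset -1 (file was not found)."""
    return sorted(name for name, offset in offsets.items() if offset == -1)


if __name__ == "__main__":
    # A few lines copied from the log above; paste the full output in practice.
    sample = """\
Offset for type 6 (vie.punc-dawg ) is -1
Offset for type 7 (vie.word-dawg ) is 1513182
Offset for type 9 (vie.freq-dawg ) is 1589568
"""
    offsets = parse_offsets(sample)
    # True when the word-dawg was picked up by combine_tessdata.
    print("word-dawg included:", offsets.get("vie.word-dawg", -1) != -1)
    print("missing:", missing_components(offsets))
```

If the word-dawg shows up in the missing list, the first thing to check is that the file is named with the same language prefix as the other components (here `vie.word-dawg`).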