Q: Tesseract OCR word-dawg not included in combine_tessdata

Stack Overflow user

Asked on 2016-02-26 03:16:37

Answers: 1 · Views: 637 · Followers: 0 · Votes: 0

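I am using Tesseract OCR to train a new language for my final-year project.

I created a word-dawg from my wordlist. However, the combine_tessdata output is the same whether or not I include the word-dawg and the wordlist, so I am not sure whether they are actually included in my trained data.

The output is as follows:

Offset for type 0 is -1
Offset for type 1 is 140
Offset for type 2 is 3726
Offset for type 3 is 3904
Offset for type 4 is 346848
Offset for type 5 is 347329
Offset for type 6 is 347329
Offset for type 7 is -1
Offset for type 8 is -1
Offset for type 10 is -1
Offset for type 11 is -1
Offset for type 12 is 354078
Offset for type 14 is -1
Offset for type 15 is -1
Offset for type 16 is -1

I believe type 2 is the unicharambigs. Do you know which offset corresponds to the word-dawg? And what about the remaining offsets?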

ocr

tesseract

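Answers: 1

Stack Overflow user

Answered on 2016-02-28 08:05:04

It may be a file naming problem. Here is the output from my own training run; an offset of -1 means that the corresponding file was not found.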

Combining tessdata files
Output vie.traineddata created sucessfully.
TessdataManager combined tesseract data files.
Offset for type  0 (vie.config                ) is -1
Offset for type  1 (vie.unicharset            ) is 140
Offset for type  2 (vie.unicharambigs         ) is 15877
Offset for type  3 (vie.inttemp               ) is 21397
Offset for type  4 (vie.pffmtable             ) is 1466247
Offset for type  5 (vie.normproto             ) is 1468147
Offset for type  6 (vie.punc-dawg             ) is -1
Offset for type  7 (vie.word-dawg             ) is 1513182
Offset for type  8 (vie.number-dawg           ) is -1
Offset for type  9 (vie.freq-dawg             ) is 1589568
Offset for type 10 (vie.fixed-length-dawgs    ) is -1
Offset for type 11 (vie.cube-unicharset       ) is -1
Offset for type 12 (vie.cube-word-dawg        ) is -1
Offset for type 13 (vie.shapetable            ) is 1594178
Offset for type 14 (vie.bigram-dawg           ) is -1
Offset for type 15 (vie.unambig-dawg          ) is -1
Offset for type 16 (vie.params-training-model ) is -1
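
To check a combine_tessdata log for missing components without reading it by eye, something like the following Python sketch may help. It is only an illustration: the index-to-name table is copied from the output above and may differ in other Tesseract versions, and the helper names (`parse_offsets`, `missing_files`) are made up for this example. The parenthesised file name is treated as optional so that older logs without it are also accepted.

```python
import re

# Component names by type index, copied from the combine_tessdata output above.
# Assumption: other Tesseract versions may number the components differently.
COMPONENTS = {
    0: "config", 1: "unicharset", 2: "unicharambigs", 3: "inttemp",
    4: "pffmtable", 5: "normproto", 6: "punc-dawg", 7: "word-dawg",
    8: "number-dawg", 9: "freq-dawg", 10: "fixed-length-dawgs",
    11: "cube-unicharset", 12: "cube-word-dawg", 13: "shapetable",
    14: "bigram-dawg", 15: "unambig-dawg", 16: "params-training-model",
}

# Matches "Offset for type  7 (vie.word-dawg ) is 1513182" and also the
# shorter form "Offset for type 7 is -1" that has no file name in brackets.
LINE_RE = re.compile(r"Offset for type\s+(\d+)\s*(?:\([^)]*\))?\s*is\s+(-?\d+)")


def parse_offsets(text):
    """Return {type_index: offset} for every offset line found in text."""
    return {int(t): int(o) for t, o in LINE_RE.findall(text)}


def missing_files(text, lang):
    """List the <lang>.<component> files that were reported with offset -1."""
    offsets = parse_offsets(text)
    return [
        f"{lang}.{COMPONENTS[t]}"
        for t, o in sorted(offsets.items())
        if o == -1 and t in COMPONENTS
    ]


if __name__ == "__main__":
    sample = (
        "Offset for type  6 (vie.punc-dawg             ) is -1\n"
        "Offset for type  7 (vie.word-dawg             ) is 1513182\n"
    )
    print(missing_files(sample, "vie"))
```

If `<lang>.word-dawg` shows up in the returned list even though a word-dawg was built, the file is probably not named with the same language prefix as the other training files, or it is not in the directory passed to combine_tessdata.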

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/35636303
