
Why are Tessnet2 OCR results poor?

Stack Overflow user
Asked 2013-03-17 03:22:49
1 answer · 876 views · 0 followers · 1 vote

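I am using tessnet2 to extract text from a .tif image. For example, I expect to get the number '700' from the image, but I get 'Mupann' instead. I am using the French tessdata, and this is the code I use: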

        ocr.SetVariable("tessedit_char_whitelist", "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.,$-/#&=()\':?")
        ocr.Init(Application.StartupPath & "\tessdata", "fra", False)
        Dim result As List(Of tessnet2.Word) = ocr.DoOCR(captureTIF, Rectangle.Empty)

        For Each word As tessnet2.Word In result
            MsgBox(word.Confidence & " >> " & word.Text)
            RichTextBox1.Text &= word.Confidence & word.Text
        Next

Thanks.


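1 Answer

Stack Overflow user

Answered 2013-07-12 00:16:48

I don't know whether this will help in your case, but when I used TessNet2 I wrote a Boolean function to detect certain words. As you will see, I check several spellings of each word I'm looking for: different casings as well as common OCR misreadings such as "abstractmg" or "5ervice". I'm sure this is NOT the most efficient way, but it is one way.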

public Boolean isPageABSTRACTING(List<tessnet2.Word> wordList)
        {
            // Each group of spellings is parenthesized so the confidence check applies to every variant,
            // and each loop stops early enough that the look-ahead indexes stay inside the list.
            for (int i = 0; i + 2 < wordList.Count; i++) //scan through words
            {
                if (((wordList[i].Text == "Abstracting" || wordList[i].Text == "abstracting" || wordList[i].Text == "abstractmg" || wordList[i].Text == "Abstractmg") && wordList[i].Confidence >= 50) && ((wordList[i + 1].Text == "Service" || wordList[i + 1].Text == "service" || wordList[i + 1].Text == "5ervice") && wordList[i + 1].Confidence >= 50) && ((wordList[i + 2].Text == "Ordered" || wordList[i + 2].Text == "ordered") && wordList[i + 2].Confidence >= 50)) //find 1st tier check
                {
                    for (int j = 0; j + 2 < wordList.Count; j++) //scan through words again
                    {
                        if (((wordList[j].Text == "Due" || wordList[j].Text == "Oue") && wordList[j].Confidence >= 50) && ((wordList[j + 1].Text == "Date" || wordList[j + 1].Text == "Oate") && wordList[j + 1].Confidence >= 50) && (wordList[j + 2].Text == "&" && wordList[j + 2].Confidence >= 50)) //find 2nd tier check
                        {
                            for (int h = 0; h + 3 < wordList.Count; h++) //scan through words again
                            {
                                if (((wordList[h].Text == "Additional" || wordList[h].Text == "additional") && wordList[h].Confidence >= 50) && ((wordList[h + 1].Text == "comments" || wordList[h + 1].Text == "Comments") && wordList[h + 1].Confidence >= 50) && ((wordList[h + 2].Text == "about" || wordList[h + 2].Text == "About") && wordList[h + 2].Confidence >= 50) && ((wordList[h + 3].Text == "this" || wordList[h + 3].Text == "This") && wordList[h + 3].Confidence >= 50)) //find 3rd tier check
                                {
                                    return true;
                                }
                            }
                        }
                    }
                }
            }

            return false;
        }
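
The same idea can probably be written more compactly by listing the accepted spellings of each word and scanning for them as a run of consecutive words. The following is only a sketch, not part of the original answer: the `Word` class is a hypothetical stand-in for `tessnet2.Word` (only `Text` and `Confidence` are modelled) so that the example compiles on its own, and the names `PhraseMatcher`, `ContainsPhrase` and `IsPageAbstracting` are made up for illustration. It compares text case-insensitively, so it accepts more casings than the function above does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for tessnet2.Word so the sketch compiles on its own;
// only the Text and Confidence members used in the answer are modelled.
public class Word
{
    public string Text { get; set; }
    public double Confidence { get; set; }
}

public static class PhraseMatcher
{
    // Returns true if some run of consecutive words matches the phrase.
    // phrase[k] lists the accepted spellings (including known OCR misreads)
    // of the k-th word; the comparison ignores case.
    public static bool ContainsPhrase(IList<Word> words, string[][] phrase, double minConfidence)
    {
        // Stop early enough that start + k never runs past the end of the list.
        for (int start = 0; start + phrase.Length <= words.Count; start++)
        {
            bool allMatch = true;
            for (int k = 0; k < phrase.Length && allMatch; k++)
            {
                Word w = words[start + k];
                allMatch = w.Confidence >= minConfidence
                    && phrase[k].Any(v => string.Equals(v, w.Text, StringComparison.OrdinalIgnoreCase));
            }
            if (allMatch) return true;
        }
        return false;
    }

    // Mirrors the three tiers of the answer: all three phrases must appear on the page.
    public static bool IsPageAbstracting(IList<Word> words)
    {
        const double minConfidence = 50; // same threshold as in the answer
        return ContainsPhrase(words, new[] { new[] { "Abstracting", "Abstractmg" }, new[] { "Service", "5ervice" }, new[] { "Ordered" } }, minConfidence)
            && ContainsPhrase(words, new[] { new[] { "Due", "Oue" }, new[] { "Date", "Oate" }, new[] { "&" } }, minConfidence)
            && ContainsPhrase(words, new[] { new[] { "Additional" }, new[] { "Comments" }, new[] { "About" }, new[] { "This" } }, minConfidence);
    }
}
```

With the real library, the list returned by `DoOCR` would be passed in after changing the parameter type to `IList<tessnet2.Word>`. Adding a newly observed misreading then only means adding one string to the relevant array.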
0 votes
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/15453635
