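I am working on a project that retrieves the content of a given image, compares it with the other images in the repository, and lists the matching images.
What would be the right approach so that the search does not end up slow?
As a first layer of filtering, I plan to use a query-by-image (CBIR) technique to retrieve the images whose pattern matches the given image, and then run OCR to get the image content and check it for a match.
If there is a better approach, please let me know.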
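One way to picture the two-layer plan described above is the sketch below. It is only an illustration under stated assumptions: the question does not name a CBIR technique, so a simple average hash stands in for it, and `difflib` stands in for the text match. The names `average_hash`, `hamming`, `find_matches`, and the thresholds `max_bits` and `min_ratio` are hypothetical. Images are represented as small grayscale grids (lists of rows), and the hash and OCR text of each repository image are assumed to be precomputed and stored, so a query pays only for its own hash and OCR.

```python
from difflib import SequenceMatcher


def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set when brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_matches(query_hash, query_text, records, max_bits=5, min_ratio=0.8):
    """records: iterable of (name, stored_hash, stored_ocr_text) tuples.

    Layer 1 (cheap): drop records whose hash is too far from the query hash.
    Layer 2 (costlier): compare OCR text only for the survivors.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    matches = []
    for name, stored_hash, stored_text in records:
        if hamming(query_hash, stored_hash) > max_bits:
            continue  # rejected by the first filter layer
        ratio = SequenceMatcher(None, query_text, stored_text).ratio()
        if ratio >= min_ratio:
            matches.append((name, ratio))
    return sorted(matches, key=lambda m: m[1], reverse=True)
```

This version still scans every record linearly; for a large repository the stored hashes would probably need an index, which is outside the scope of this sketch.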
Posted on 2015-10-08 14:13:04
Steps done
Software
1. Tesseract OCR
2. ImageMagick - for image cleaning
3. Textcleaner script
- The convert tool can find the image orientation from the EXIF data, but that was not very useful.
- Instead, the image was rotated by 90 degrees three times, and the OCR output for each orientation was compared with the others to find the correct orientation (the image with the maximum number of words wins).
- On success, the details are stored in the DB for future searches.
- On failure:
  - Created 10 different images with different filters (grayscale mode and sharpening applied).
  - OCRed all the images and picked the required data out of all the output obtained.
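The steps above can be sketched roughly as follows. This is a minimal illustration, not the code actually used: `best_orientation`, `extract_from_variants`, and their parameters are hypothetical names, the OCR and image operations are passed in as callables so the logic can be read without any particular library, and "required data" is modelled as an `extract` function that returns `None` when nothing usable is found. Returning the first successful extraction is an assumption about how the combined output was used.

```python
def count_words(text):
    """Count whitespace-separated tokens in the OCR output."""
    return len(text.split())


def best_orientation(image, rotate, ocr, angles=(0, 90, 180, 270)):
    """OCR the image at each angle and keep the one that yields the most words.

    rotate(image, angle) and ocr(image) are supplied by the caller.
    Returns (angle, text) for the winning orientation.
    """
    best_angle, best_text, best_count = angles[0], "", -1
    for angle in angles:
        text = ocr(rotate(image, angle))
        n = count_words(text)
        if n > best_count:
            best_angle, best_text, best_count = angle, text, n
    return best_angle, best_text


def extract_from_variants(image, filters, ocr, extract):
    """Fallback path: OCR one variant per filter, then search all outputs.

    filters is a list of callables (for example grayscale or sharpening
    steps); extract(text) returns the required data or None.
    """
    texts = [ocr(apply_filter(image)) for apply_filter in filters]
    for text in texts:
        found = extract(text)
        if found is not None:
            return found
    return None
```

With Pillow and pytesseract installed, `rotate` could be something like `lambda im, a: im.rotate(a, expand=True)` and `ocr` could be `pytesseract.image_to_string`; the filter callables would wrap whatever cleanup commands are in use.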
https://stackoverflow.com/questions/32304261