Tesseract and multi-line license plates: how do I get the characters from a two-line plate?

Stack Overflow user
Asked 2021-05-17 22:56:14
Answers: 2 · Views: 160 · Followers: 0 · Votes: 0

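I tried extracting the individual characters from the image and passing them through OCR one at a time, but the characters come out jumbled. Passing the whole image at least returns the characters in order, but the OCR also seems to try to read all the other contours.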

Sample image: Image being used

Result: 6A7J7B0

Expected result: AJB6779

Code

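import re

import cv2
import pytesseract

img = cv2.imread("data/images/car6.jpg")
# cv2.imread returns the channels in BGR order, so convert from BGR
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)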
# resize image to three times as large as original for better readability
gray = cv2.resize(gray, None, fx = 3, fy = 3, interpolation = cv2.INTER_CUBIC)
# perform gaussian blur to smoothen image
blur = cv2.GaussianBlur(gray, (5,5), 0)

# threshold the blurred image using Otsu's method to preprocess for tesseract
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

# create rectangular kernel for dilation
rect_kern = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
# apply dilation to make regions more clear
dilation = cv2.dilate(thresh, rect_kern, iterations = 1)

# find contours of regions of interest within license plate
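try:
    # OpenCV 4.x returns (contours, hierarchy)
    contours, hierarchy = cv2.findContours(dilation, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
except ValueError:
    # OpenCV 3.x returns (image, contours, hierarchy)
    ret_img, contours, hierarchy = cv2.findContours(dilation, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)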
# sort contours left-to-right
sorted_contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])
# create copy of gray image
im2 = gray.copy()
# create blank string to hold license plate number
plate_num = ""
# loop through contours and find individual letters and numbers in license plate
for cnt in sorted_contours:
    x,y,w,h = cv2.boundingRect(cnt)
    height, width = im2.shape
    # if height of box is not tall enough relative to total height then skip
    if height / float(h) > 6: continue

    ratio = h / float(w)
    # if height to width ratio is less than 1.5 skip
    if ratio < 1.5: continue

    # if width is not wide enough relative to total width then skip
    if width / float(w) > 15: continue

    area = h * w
    # if area is less than 100 pixels skip
    if area < 100: continue

    # draw the rectangle
    rect = cv2.rectangle(im2, (x,y), (x+w, y+h), (0,255,0),2)
    # grab character region of image
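    # clamp the padded region at 0 so a box near the edge does not produce negative indices
    roi = thresh[max(y-5, 0):y+h+5, max(x-5, 0):x+w+5]
    # perform bitwise not to flip image to black text on white background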
    roi = cv2.bitwise_not(roi)
    # perform another blur on character region
    roi = cv2.medianBlur(roi, 5)
    try:
        text = pytesseract.image_to_string(roi, config='-c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 8 --oem 3')
        # clean tesseract text by removing any unwanted blank spaces
        clean_text = re.sub(r'[\W_]+', '', text)
        plate_num += clean_text
    except Exception:
        text = None
# plate_num starts as "" and is never None, so test for a non-empty string
if plate_num:
    print("License Plate #: ", plate_num)
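
A likely explanation for the jumbled order, sketched with made-up bounding boxes: the contours are sorted by x only, so characters from the top and bottom rows interleave. The coordinates below are hypothetical and are not measured from the sample image.

```python
# Hypothetical (character, (x, y, w, h)) pairs for a two-line plate:
# "AJB" on the top row and "6779" on the bottom row.
chars = [
    ("A", (40, 10, 20, 40)),
    ("J", (80, 10, 20, 40)),
    ("B", (120, 10, 20, 40)),
    ("6", (20, 60, 20, 40)),
    ("7", (60, 60, 20, 40)),
    ("7", (100, 60, 20, 40)),
    ("9", (140, 60, 20, 40)),
]

# Same key as in the code above: sort by the x coordinate only.
x_sorted = "".join(ch for ch, box in sorted(chars, key=lambda item: item[1][0]))
print(x_sorted)  # 6A7J7B9
```

With these invented boxes the two rows alternate in the same pattern as the result reported above.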

2 Answers

Stack Overflow user

Answered 2021-05-26 11:42:22

For me, psm mode 11 was also able to detect both single-line and multi-line plates:

pytesseract.image_to_string(img, lang='eng', config='--oem 3 --psm 11').replace("\n", "")

11: Sparse text. Find as much text as possible in no particular order.
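
A minimal sketch of how the whole-image approach might be wrapped up, assuming the plate has already been cropped. The helper names `clean_plate_text` and `read_plate` are made up for illustration; only `pytesseract.image_to_string` and the `--oem`/`--psm` flags come from the answer above.

```python
import re


def clean_plate_text(raw_text):
    """Keep only A-Z and 0-9; drops newlines, spaces, and other stray characters."""
    return re.sub(r"[^A-Z0-9]", "", raw_text.upper())


def read_plate(image, psm=11):
    """Run Tesseract on a whole plate image (hypothetical helper, not part of pytesseract)."""
    # Imported here so clean_plate_text can be used without Tesseract installed.
    import pytesseract

    raw = pytesseract.image_to_string(image, lang="eng", config=f"--oem 3 --psm {psm}")
    return clean_plate_text(raw)


print(clean_plate_text("AJB\n6779\n"))  # AJB6779
```

Since mode 11 makes no promise about ordering, it may be worth comparing against `--psm 6` (assume a single uniform block of text) by calling `read_plate(image, psm=6)`; which mode works better will depend on the image.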

Votes: 0

Stack Overflow user

Answered 2021-06-22 19:14:26

If you want to extract the plate number from a two-line plate, you can replace this line:

sorted_contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])

with

sorted_contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0] + cv2.boundingRect(ctr)[1] * img.shape[1])
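
A key of the form x + y * width orders boxes primarily by y, so a difference of a pixel or two between the top edges of characters in the same row can outweigh a large difference in x. One possible alternative, sketched below, is to group the boxes into rows first and then sort each row by x. The function name `sort_boxes_reading_order`, the `row_tolerance` parameter, and the sample coordinates are all assumptions for illustration, not something from the original code.

```python
def sort_boxes_reading_order(boxes, row_tolerance=0.5):
    """Order (x, y, w, h) boxes top-to-bottom by row, then left-to-right within a row.

    A box joins the current row when its vertical center lies within
    row_tolerance * its own height of the center of the row's first box.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2):
        x, y, w, h = box
        center_y = y + h / 2
        if rows and abs(center_y - rows[-1]["center_y"]) <= row_tolerance * h:
            rows[-1]["boxes"].append(box)
        else:
            rows.append({"center_y": center_y, "boxes": [box]})

    ordered = []
    for row in rows:
        ordered.extend(sorted(row["boxes"], key=lambda b: b[0]))
    return ordered


# Hypothetical boxes with a little vertical jitter inside each row.
labels = {
    (40, 10, 20, 40): "A", (80, 12, 20, 40): "J", (120, 9, 20, 40): "B",
    (20, 60, 20, 40): "6", (60, 61, 20, 40): "7",
    (100, 59, 20, 40): "7", (140, 62, 20, 40): "9",
}
ordered = sort_boxes_reading_order(list(labels))
print("".join(labels[box] for box in ordered))  # AJB6779
```

In the question's loop this could be applied to `[cv2.boundingRect(c) for c in contours]` after the size filters, so that stray contours do not start rows of their own.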

Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/67572127
