
A more effective method for searching for the filling range of the target

Original
Swing Dunn
Published 2025-10-20 11:18:30
Included in the column: Some studies in imgs

Reference: An Algorithm for Identifying Objective Questions on Answer Sheets - Yao Shuli, Wang Shaorong, Gai Meng, Wang Zhen

Appropriately expanding an option's region ensures that it contains the target filling area, but the expansion inevitably introduces interference that affects the accuracy of the final result. The paper "An Algorithm for Identifying Objective Questions on Answer Sheets" (Yao Shuli, Wang Shaorong, Gai Meng, Wang Zhen) gives a more accurate method for marking the filling areas. It also handles cases where a student's mark strays outside the option area on the test paper, and it tolerates the minor positional deviations that remain in an image that has already been corrected.

1. Read the image and convert it to a grayscale image.

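    # IMREAD_COLOR always returns a 3-channel BGR image, which COLOR_BGR2GRAY expects
    ori_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    if ori_img is None:
        raise FileNotFoundError(img_path)
    display_img = ori_img.copy()

    gray = cv2.cvtColor(ori_img, cv2.COLOR_BGR2GRAY)

Figure: gray (the grayscale image)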
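For intuition about what the gray values mean in the later steps, the sketch below reproduces the documented weighting behind cv2.COLOR_BGR2GRAY (Y = 0.299 R + 0.587 G + 0.114 B) with NumPy only. The helper name bgr_to_gray and the three sample pixels are assumptions made for illustration, not part of the article's code, and OpenCV's internal rounding may differ from this floating-point version by one gray level.

```python
import numpy as np

def bgr_to_gray(bgr):
    """Approximate cv2.COLOR_BGR2GRAY for an HxWx3 uint8 image in BGR order."""
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Hypothetical samples: white paper, a black pencil mark, and a pure-red pixel.
pixels = np.array([[[255, 255, 255], [0, 0, 0], [0, 0, 255]]], dtype=np.uint8)
gray_values = bgr_to_gray(pixels)
```

White paper maps to the top of the range and a dark mark to the bottom, which is why a lower pixel sum indicates a more heavily filled region in the steps that follow.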

2. Traverse the coordinates of each option and moderately expand each option region to form the search window for the target.


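    for i in range(len(template_block_pos)):
        # Top-left corner of the option block given by the template
        temp_block_ltpt = template_block_pos[i]

        # Expand by 10 px on the left and right and by 5 px on the top and bottom
        ex_block_ltpt = (temp_block_ltpt[0] - 10, temp_block_ltpt[1] - 5)
        ex_block_size = template_block_size[0] + 20, template_block_size[1] + 10
        ex_block_rbpt = ex_block_ltpt[0] + ex_block_size[0], ex_block_ltpt[1] + ex_block_size[1]

        # Crop the search window; NumPy indexes rows (y) first, then columns (x)
        window_img = gray[ex_block_ltpt[1]: ex_block_rbpt[1], ex_block_ltpt[0] :ex_block_rbpt[0]]
        # BGR copy of the window, used only for drawing colored rectangles
        display_window_img  = cv2.cvtColor(window_img, cv2.COLOR_GRAY2BGR)
        #img_show(window_img)

Figure: search window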
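The expansion above subtracts a fixed margin from the template coordinates. For an option close to the page edge this can yield a negative start index, and NumPy interprets a negative slice start as counting from the end, so the crop would silently be empty or wrong. A minimal sketch of a clamped expansion follows; the helper name expand_window, the page size of 220 x 160 and the second sample position are assumptions for illustration only.

```python
def expand_window(left_top, size, pad_x, pad_y, img_w, img_h):
    """Return (x0, y0, x1, y1) of the padded block, clamped to the image bounds."""
    x0 = max(left_top[0] - pad_x, 0)
    y0 = max(left_top[1] - pad_y, 0)
    x1 = min(left_top[0] + size[0] + pad_x, img_w)
    y1 = min(left_top[1] + size[1] + pad_y, img_h)
    return x0, y0, x1, y1

# First template block (30, 11) with the article's margins of 10 and 5 pixels.
inner = expand_window((30, 11), (30, 18), 10, 5, img_w=220, img_h=160)
# Hypothetical block near the top-left corner: the window is clipped at 0.
edge = expand_window((4, 2), (30, 18), 10, 5, img_w=220, img_h=160)
```

If the window is clipped, it is smaller than the nominal expanded size, so any later loop bounds should be taken from the cropped array's shape rather than from the nominal size.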

3. The lower the pixel values within an area, the higher its overall fill rate. Based on this, we search for the window with the minimum pixel sum in the x direction and then in the y direction, using a sliding window whose size is the option-block size given by the template.

        block_width = template_block_size[0]
        block_height = template_block_size[1]
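The minimum-sum idea can be tried on a toy example before it is applied to a real window. The sketch below is an illustration under stated assumptions, not the article's code: min_sum_offset is a hypothetical helper, and the 6 x 10 window with a dark stripe is made up. It uses a prefix sum so that every window sum costs a single subtraction.

```python
import numpy as np

def min_sum_offset(profile, width):
    """Offset of the length-`width` window with the smallest sum in a 1-D profile."""
    profile = np.asarray(profile, dtype=np.int64)
    if width < 1 or width > profile.size:
        raise ValueError("window width must be between 1 and the profile length")
    # prefix[k] holds the sum of profile[:k], so any window sum is one subtraction.
    prefix = np.concatenate(([0], np.cumsum(profile)))
    window_sums = prefix[width:] - prefix[:-width]
    # argmin returns the first minimum, like a strict "<" comparison in a loop.
    return int(np.argmin(window_sums))

# Toy 6x10 window: white paper (255) with a dark mark in columns 4..6.
window = np.full((6, 10), 255, dtype=np.uint8)
window[:, 4:7] = 20
column_profile = window.sum(axis=0)   # one total per column
best_col = min_sum_offset(column_profile, width=3)
```

Here the darkest three-column window starts at column 4, which is exactly where the mark was placed. All len(profile) - width + 1 window positions are scored, including the last one.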

3.1. Search in the X direction for the range with the smallest pixel sum

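        # Smallest pixel sum found so far (the variable name is kept from the original code)
        max_sumpixels = sys.maxsize
        h_targer_start = 0
        # "+ 1" so that the right-most window position is also tested
        for col in range(0, ex_block_size[0] - block_width + 1):
            # Full-height strip of the window, block_width columns wide
            h_target_img = window_img[: , col : col + block_width]
            sum_pixls = np.sum(h_target_img)
            if sum_pixls < max_sumpixels:
                max_sumpixels = sum_pixls
                h_targer_start = col

        # Mark the selected column range in yellow (BGR)
        cv2.rectangle(display_window_img,(h_targer_start,0) ,(h_targer_start + block_width, ex_block_size[1]), (0,255,255),2)
        #display_img[ex_block_ltpt[1]: ex_block_rbpt[1],ex_block_ltpt[0]: ex_block_rbpt[0]] = display_window_img
        #img_show(display_window_img)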

Result of the X-direction search for a single option (yellow range):

Figure: target area in X_direction

Result of the X-direction search for all options:

Figure: target area in X_direction

3.2. Search in the Y direction for the range with the smallest pixel sum

        # Reset the running minimum for the vertical search
        max_sumpixels = sys.maxsize
        v_target_start = 0
        # Slide vertically inside the column range found by the X-direction search;
        # "+ 1" so that the bottom-most window position is also tested
        for row in range(0, ex_block_size[1] - block_height + 1):
            v_target_img = window_img[row : row + block_height, h_targer_start : h_targer_start + block_width]
            sum_pixls = np.sum(v_target_img)
            if sum_pixls < max_sumpixels:
                max_sumpixels = sum_pixls
                v_target_start = row

        # Mark the selected block in blue (BGR)
        cv2.rectangle(display_window_img,(h_targer_start,v_target_start) ,(h_targer_start + block_width, v_target_start + block_height), (255,0,0),1)
        #img_show(display_window_img)
        #display_img[ex_block_ltpt[1]: ex_block_rbpt[1],ex_block_ltpt[0]: ex_block_rbpt[0]] = display_window_img

Result of the Y-direction search for a single option (blue range):

Figure: target area in Y-direction

Result of the Y-direction search for all options:

Figure: target area in Y-direction
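The two searches above run one after the other: the X range is fixed first, and the Y search is restricted to it. As a point of comparison only, the sketch below scores every (x, y) position jointly with a summed-area table. This is a swapped-in alternative, not the method of this article or of the referenced paper; the helper name best_block_2d and the synthetic window are assumptions for illustration.

```python
import numpy as np

def best_block_2d(window, block_w, block_h):
    """Return (x, y) of the block_h x block_w region with the smallest pixel sum."""
    h, w = window.shape
    if block_h > h or block_w > w:
        raise ValueError("block is larger than the window")
    # Summed-area table padded with a leading zero row and zero column.
    sat = np.zeros((h + 1, w + 1), dtype=np.int64)
    sat[1:, 1:] = window.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    # sums[y, x] is the pixel sum of window[y:y+block_h, x:x+block_w].
    sums = (sat[block_h:, block_w:] - sat[:-block_h, block_w:]
            - sat[block_h:, :-block_w] + sat[:-block_h, :-block_w])
    y, x = np.unravel_index(np.argmin(sums), sums.shape)
    return int(x), int(y)

# Synthetic 28x50 search window with an 18x30 dark mark at x=12, y=7.
window = np.full((28, 50), 255, dtype=np.uint8)
window[7:25, 12:42] = 30
found = best_block_2d(window, block_w=30, block_h=18)
```

The joint search examines more candidate positions than the two one-dimensional passes, so the sequential approach remains the cheaper choice when the mark is roughly block-shaped.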

4. In this way, the coordinates and sizes provided by the template give us an approximate target area.

        rough_valid_filling_area = window_img[v_target_start : v_target_start + block_height , h_targer_start : h_targer_start + block_width]

However, because of small errors, a few white borders may remain around the target area. Removing them further improves the accuracy of the target area.

Compute the average pixel value of the target area, then shrink the boundary one pixel at a time on each side in turn. If the average of the shrunken area is lower than the previous average, the area has become darker overall and the shrink step is kept; otherwise, stop shrinking on that side.

        gray_average = np.sum(rough_valid_filling_area) / rough_valid_filling_area.size

        # Width of the blank border trimmed from each edge
        l_scaled_level = 0
        t_scaled_level = 0
        r_scaled_level = 0
        b_scaled_level = 0
        # Trim the blank margin on the left edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[0: target_range_height, 1: target_range_width]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                # i + 1 columns have been removed so far
                l_scaled_level = i + 1
            else:
                break

        # Trim the blank margin on the top edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[1: target_range_height, 0: target_range_width]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                t_scaled_level = i + 1
            else:
                break

        # Trim the blank margin on the right edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[0: target_range_height, 0: target_range_width - 1]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                r_scaled_level = i + 1
            else:
                break

        # Trim the blank margin on the bottom edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[0: target_range_height - 1, 0: target_range_width]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                b_scaled_level = i + 1
            else:
                break

        # Final rectangle in window coordinates
        final_start_x = h_targer_start + l_scaled_level
        final_start_y = v_target_start + t_scaled_level

        final_end_x = h_targer_start + block_width - r_scaled_level
        finale_end_y = v_target_start + block_height - b_scaled_level

Result of trimming the white border for a single option:

Figure: Reduce the white border

The whole page:

Figure: Reduce the white border
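The four trimming loops share one rule: drop a row or column from a side while doing so makes the remaining area darker on average. A compact sketch of that rule is shown below. It is an illustration under assumptions rather than the article's implementation: trim_blank_border is a hypothetical helper, max_steps=2 mirrors the two iterations used per side, and the small test block is synthetic.

```python
import numpy as np

def trim_blank_border(area, max_steps=2):
    """Return (left, top, right, bottom) pixel counts trimmed from a grayscale block."""
    area = np.asarray(area)
    trimmed = {"left": 0, "top": 0, "right": 0, "bottom": 0}
    # Each entry drops one row or column from the named side.
    shrink = {
        "left": lambda a: a[:, 1:],
        "top": lambda a: a[1:, :],
        "right": lambda a: a[:, :-1],
        "bottom": lambda a: a[:-1, :],
    }
    mean = area.mean()
    for side in ("left", "top", "right", "bottom"):
        for _ in range(max_steps):
            candidate = shrink[side](area)
            if candidate.size == 0 or candidate.mean() >= mean:
                break  # no longer getting darker: stop on this side
            area, mean = candidate, candidate.mean()
            trimmed[side] += 1
    return trimmed["left"], trimmed["top"], trimmed["right"], trimmed["bottom"]

# Dark 6x8 block with one white column on the left and one white row at the bottom.
block = np.full((6, 8), 40, dtype=np.uint8)
block[:, 0] = 255
block[-1, :] = 255
offsets = trim_blank_border(block)
```

The returned counts are the number of pixels removed from each side, so the left and top counts are added to the block's start coordinates and the right and bottom counts are subtracted from its end coordinates.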

5. Draw the final search result.

        # Shift the window-local rectangle back to page coordinates and draw it in red (BGR)
        cv2.rectangle(ori_img,(ex_block_ltpt[0] +final_start_x,ex_block_ltpt[1] + final_start_y) ,(ex_block_ltpt[0] + final_end_x, ex_block_ltpt[1]+ finale_end_y), (0,0,255),1)

Figure: final result
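The drawing call adds the search window's top-left corner to every coordinate, because the search results are expressed relative to the cropped window rather than to the page. A small sketch of that shift, with a hypothetical helper name to_page_rect and a made-up local rectangle, is:

```python
def to_page_rect(window_origin, local_rect):
    """Shift (x0, y0, x1, y1) from window coordinates to page coordinates."""
    ox, oy = window_origin
    x0, y0, x1, y1 = local_rect
    return ox + x0, oy + y0, ox + x1, oy + y1

# (20, 6) is the expanded top-left corner of the first template block;
# the local rectangle is an invented search result for illustration.
page_rect = to_page_rect((20, 6), (11, 6, 40, 23))
```

Keeping the shift in one place makes it harder to mix up window-local and page coordinates when the same rectangle is drawn on both images.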

Source code:

import cv2
import sys
import numpy as np

img_path = './img7/1.jpg'
template_block_pos = [(30,11), (77,11), (124, 11), (172, 11),
                      (30,41), (77,41), (124, 41), (172, 41),
                      (30,68), (77,68), (124, 68), (172, 68),
                      (30,97), (77,97), (124, 97), (172, 97),
                      (30,126), (77,126), (124, 126), (172, 126)]
template_block_size =(30,18)

def img_show(img):
    cv2.namedWindow("default", cv2.WINDOW_FREERATIO)
    cv2.imshow("default", img)
    cv2.waitKey(0)
    cv2.destroyWindow("default")

def main():
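    # IMREAD_COLOR always returns a 3-channel BGR image, which COLOR_BGR2GRAY expects
    ori_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    if ori_img is None:
        raise FileNotFoundError(img_path)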
    display_img = ori_img.copy()

    gray = cv2.cvtColor(ori_img, cv2.COLOR_BGR2GRAY)

    #img_show(gray)

    # Binarize with Otsu's automatic threshold (the result is not used by the search below)
    g_threshold, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    #img_show(binary)

    for i in range(len(template_block_pos)):
        temp_block_ltpt = template_block_pos[i]

        ex_block_ltpt = (temp_block_ltpt[0] - 10, temp_block_ltpt[1] - 5)
        ex_block_size = template_block_size[0] + 20, template_block_size[1] + 10
        ex_block_rbpt = ex_block_ltpt[0] + ex_block_size[0], ex_block_ltpt[1] + ex_block_size[1]

        window_img = gray[ex_block_ltpt[1]: ex_block_rbpt[1], ex_block_ltpt[0] :ex_block_rbpt[0]]
        display_window_img  = cv2.cvtColor(window_img, cv2.COLOR_GRAY2BGR)
        #img_show(window_img)

        block_width = template_block_size[0]
        block_height = template_block_size[1]

        max_sumpixels = sys.maxsize
        h_targer_start = 0
        # "+ 1" so that the right-most window position is also tested
        for col in range(0, ex_block_size[0] - block_width + 1):
            h_target_img = window_img[: , col : col + block_width]
            sum_pixls = np.sum(h_target_img)
            if sum_pixls < max_sumpixels:
                max_sumpixels = sum_pixls
                h_targer_start = col

        cv2.rectangle(display_window_img,(h_targer_start,0) ,(h_targer_start + block_width, ex_block_size[1]), (0,255,255),2)
        display_img[ex_block_ltpt[1]: ex_block_rbpt[1],ex_block_ltpt[0]: ex_block_rbpt[0]] = display_window_img            
        #img_show(display_window_img)

        max_sumpixels = sys.maxsize
        v_target_start = 0
        # "+ 1" so that the bottom-most window position is also tested
        for row in range(0, ex_block_size[1] - block_height + 1):
            v_target_img = window_img[row : row + block_height, h_targer_start : h_targer_start + block_width]
            sum_pixls = np.sum(v_target_img)
            if sum_pixls < max_sumpixels:
                max_sumpixels = sum_pixls
                v_target_start = row

        cv2.rectangle(display_window_img,(h_targer_start,v_target_start) ,(h_targer_start + block_width, v_target_start + block_height), (255,0,0),1)
        #img_show(display_window_img)
        display_img[ex_block_ltpt[1]: ex_block_rbpt[1],ex_block_ltpt[0]: ex_block_rbpt[0]] = display_window_img

        rough_valid_filling_area = window_img[v_target_start : v_target_start + block_height , h_targer_start : h_targer_start + block_width]
        #img_show(rough_valid_filling_area)

        gray_average = np.sum(rough_valid_filling_area) / rough_valid_filling_area.size

        # Width of the blank border trimmed from each edge
        l_scaled_level = 0
        t_scaled_level = 0
        r_scaled_level = 0
        b_scaled_level = 0
        # Trim the blank margin on the left edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[0: target_range_height, 1: target_range_width]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                # i + 1 columns have been removed so far
                l_scaled_level = i + 1
            else:
                break

        # Trim the blank margin on the top edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[1: target_range_height, 0: target_range_width]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                t_scaled_level = i + 1
            else:
                break

        # Trim the blank margin on the right edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[0: target_range_height, 0: target_range_width - 1]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                r_scaled_level = i + 1
            else:
                break

        # Trim the blank margin on the bottom edge
        for i in  range(0,2):
            target_range_height = rough_valid_filling_area.shape[0]
            target_range_width = rough_valid_filling_area.shape[1]
            l_scaled_img = rough_valid_filling_area[0: target_range_height - 1, 0: target_range_width]
            l_scaled_img_average = np.sum(l_scaled_img) / l_scaled_img.size
            if(l_scaled_img_average < gray_average):
                gray_average = l_scaled_img_average
                rough_valid_filling_area = l_scaled_img.copy()
                b_scaled_level = i + 1
            else:
                break
                
        final_start_x = h_targer_start + l_scaled_level
        final_start_y = v_target_start + t_scaled_level

        final_end_x = h_targer_start + block_width - r_scaled_level
        finale_end_y = v_target_start + block_height - b_scaled_level
        cv2.rectangle(display_window_img,(final_start_x,final_start_y) ,(final_end_x, finale_end_y), (0,0,255),1)
        display_img[ex_block_ltpt[1]: ex_block_rbpt[1],ex_block_ltpt[0]: ex_block_rbpt[0]] = display_window_img
        #img_show(display_window_img)
        cv2.rectangle(ori_img,(ex_block_ltpt[0] +final_start_x,ex_block_ltpt[1] + final_start_y) ,(ex_block_ltpt[0] + final_end_x, ex_block_ltpt[1]+ finale_end_y), (0,0,255),1)

        
    img_show(ori_img)

if __name__ == '__main__':
    main()

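Original statement: This article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, please contact cloudcommunity@tencent.com for removal.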
