I am working with Paddle and would like to know the output format of the bounding boxes (bbx). I can't find it on Paddle's GitHub. Here is my code.
from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR(use_angle_cls=False, lang='en', rec=False) # need to run only once to download and load model into memory
result = ocr.ocr(img, cls=False)
Output:
[[[[8.0, 12.0], [89.0, 12.0], [89.0, 25.0], [8.0, 25.0]],
('@kheengz_yfk', 0.9460259079933167)],
[[[6.0, 31.0], [227.0, 29.0], [227.0, 44.0], [6.0, 46.0]],
('EBIT is a week old today. and', 0.847086489200592)],
[[[4.0, 47.0], [225.0, 49.0], [225.0, 64.0], [4.0, 62.0]],
('the homebors came together...Seemore', 0.942597508430481)],
[[[7.0, 70.0], [183.0, 70.0], [183.0, 83.0], [7.0, 83.0]],
('Joriginal sound-kheengz_yfk(Cont', 0.8839073181152344)]]
I want to draw the bounding boxes on it manually.
My idea is that the first point is x0, y0 (top-left) and the last is x1, y1 (bottom-right).
rect = cv2.rectangle(img.copy(), (int(result[0][0][0][0]), int(result[0][0][0][1])), (int(result[0][0][-1][0]),int(result[0][0][-1][0]) ), (0, 255, 0), -1)
plt.imshow(rect)
But this is not correct. Any help with this would be appreciated. Thanks.
Test image.

Original PaddleOCR draw_ocr output:
from PIL import Image
image = Image.fromarray(img).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/usr/share/fonts/opentype/malayalam/Chilanka-Regular.otf')
plt.imshow(im_show)
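
The list comprehensions above hint at how the output is laid out: each entry seems to pair four corner points with a (text, score) tuple. In the sample output the corners run top-left, top-right, bottom-right, bottom-left, so the bottom-right corner would be index 2 rather than the last point. The sketch below only illustrates that reading on the first two sample entries; the helper name `unpack_line` is hypothetical and not part of PaddleOCR, and the corner order is an observation from this sample, not a documented guarantee.

```python
# First two entries copied from the sample output in the question.
sample_result = [
    [[[8.0, 12.0], [89.0, 12.0], [89.0, 25.0], [8.0, 25.0]],
     ('@kheengz_yfk', 0.9460259079933167)],
    [[[6.0, 31.0], [227.0, 29.0], [227.0, 44.0], [6.0, 46.0]],
     ('EBIT is a week old today. and', 0.847086489200592)],
]


def unpack_line(line):
    """Split one entry into named corners, text and score (hypothetical helper)."""
    points, (text, score) = line
    # Corner order as observed in the sample: TL, TR, BR, BL (an assumption).
    top_left, top_right, bottom_right, bottom_left = points
    return {
        "top_left": top_left,
        "top_right": top_right,
        "bottom_right": bottom_right,
        "bottom_left": bottom_left,
        "text": text,
        "score": score,
    }


for entry in sample_result:
    info = unpack_line(entry)
    print(info["text"], info["top_left"], info["bottom_right"])
```

Under this reading, the two points to pass to a rectangle-drawing call would be `points[0]` and `points[2]`, each with its own x and y value.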

Posted on 2022-07-28 17:47:53
You can easily convert each box to xmin, ymin, xmax, ymax format for convenient use.
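import numpy as np

# mat is the image as a NumPy array (img in the question)
for line in result:
    # line[0] holds the four (x, y) corner points; line[1] is (text, score)
    box = np.array(line[0]).astype(np.int32)
    xmin = min(box[:, 0])
    ymin = min(box[:, 1])
    xmax = max(box[:, 0])
    ymax = max(box[:, 1])
    crop = mat[ymin:ymax, xmin:xmax]
https://stackoverflow.com/questions/72893442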
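
The min/max conversion in the answer also covers slightly tilted boxes, where no single corner holds both extreme coordinates. A dependency-free sketch of the same idea, run on the second sample box from the question, might look like this; `quad_to_bbox` is a hypothetical helper name, and the commented-out `cv2.rectangle` call assumes `img` is the image array from the question.

```python
def quad_to_bbox(points):
    """Return (xmin, ymin, xmax, ymax) for a list of (x, y) corner points."""
    xs = [int(x) for x, _ in points]
    ys = [int(y) for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Second box from the sample output; it is slightly tilted.
quad = [[6.0, 31.0], [227.0, 29.0], [227.0, 44.0], [6.0, 46.0]]
xmin, ymin, xmax, ymax = quad_to_bbox(quad)
print(xmin, ymin, xmax, ymax)  # 6 29 227 46

# With OpenCV, the outline could then be drawn like this (not executed here):
# cv2.rectangle(img, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
```

A positive thickness such as 2 draws only the outline, whereas the -1 used in the question fills the rectangle and hides the text underneath.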