
Implementing image recognition on a browser with Python

Stack Overflow user
Asked 2018-09-21 20:06:03
Answers: 1 · Views: 1.9K · Followers: 0 · Votes: 1

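I want to build software that counts cards in blackjack, using image recognition to automate the process, but I don't know where to start. I think the problem can be split into the following steps:

1 - Grab images of the game from the browser (it is basically an Adobe Flash game)

2 - Process the images with some image recognition to identify all the cards

3 - Update a counter using the Hi-Lo strategy

4 - Display the result on screen

How can I do this with Python? Which libraries could help me? This is a completely new field for me. I will try to implement this based on your suggestions.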

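Edit 1:

Selenium WebDriver works fine. So far I have captured a screenshot of the main page with this simple code, but I can't get into the game because I have no money to play, lol: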

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.888casino.it/giochi-da-casino/')
browser.save_screenshot('screenie.png')
browser.quit()

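But basically, I need to replace browser.get() with something that hooks into the browser I already have open, rather than something that opens a new page. Then I need a for loop that takes a screenshot every second while I play, so that I can start processing those images.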
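
The "one screenshot every second" loop described above can be sketched independently of how the image is grabbed. This is a minimal sketch, not a definitive implementation: `run_periodically` and its `task` argument are hypothetical names, and `task` stands for whatever function captures and processes one screenshot. Scheduling against a monotonic clock keeps slow image processing from accumulating drift.

```python
import time


def run_periodically(task, interval=1.0, iterations=None,
                     clock=time.monotonic, sleep=time.sleep):
    """Call task() roughly every `interval` seconds.

    `task` is a hypothetical callable that grabs and processes one
    screenshot. `iterations=None` means run until interrupted.
    Returns the number of calls made.
    """
    calls = 0
    next_run = clock()
    while iterations is None or calls < iterations:
        task()
        calls += 1
        # Schedule the next run relative to the previous target time,
        # not relative to when the task happened to finish.
        next_run += interval
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)
        else:
            # The task took longer than the interval: restart the schedule.
            next_run = clock()
    return calls
```

With Selenium, `task` could be a small function that calls `browser.save_screenshot(...)` and then runs the detector on the saved file.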

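Edit 2:

I will try to use the TensorFlow API for the image processing, but I haven't found any pre-trained model for recognizing cards. So I have to create a model from scratch, and I found this tutorial, which can help me train my own object detection model. If you know of an existing trained model, please link it.

Edit 3:

Using TensorFlow, I was able to create my own object detection model, and now I need to use that model in a Python script. For now I am using this sample script, which opens an image and draws rectangles around the cards.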

import os
import cv2
import numpy as np
import tensorflow as tf
import sys

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")

# Import utilities
from utils import label_map_util
from utils import visualization_utils as vis_util

# Name of the directory containing the object detection module we're using
MODEL_NAME = 'inference_graph'
IMAGE_NAME = 'test1.jpg'

# Grab path to current working directory
CWD_PATH = os.getcwd()

# Path to frozen detection graph .pb file, which contains the model that is used
# for object detection.
PATH_TO_CKPT = os.path.join(CWD_PATH,MODEL_NAME,'frozen_inference_graph.pb')

# Path to label map file
PATH_TO_LABELS = os.path.join(CWD_PATH,'training','labelmap.pbtxt')

# Path to image
PATH_TO_IMAGE = os.path.join(CWD_PATH,IMAGE_NAME)

# Number of classes the object detector can identify
NUM_CLASSES = 13

# Load the label map.
# Label maps map indices to category names, so that when our convolution
# network predicts `5`, we know that this corresponds to `king`.
# Here we use internal utility functions, but anything that returns a
# dictionary mapping integers to appropriate string labels would be fine
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

# Load the Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

    sess = tf.Session(graph=detection_graph)

# Define input and output tensors (i.e. data) for the object detection classifier

# Input tensor is the image
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')

# Output tensors are the detection boxes, scores, and classes
# Each box represents a part of the image where a particular object was detected
detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')

# Each score represents level of confidence for each of the objects.
# The score is shown on the result image, together with the class label.
detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')

# Number of objects detected
num_detections = detection_graph.get_tensor_by_name('num_detections:0')

# Load image using OpenCV and
# expand image dimensions to have shape: [1, None, None, 3]
# i.e. a single-column array, where each item in the column has the pixel RGB value
image = cv2.imread(PATH_TO_IMAGE)
image_expanded = np.expand_dims(image, axis=0)

# Perform the actual detection by running the model with the image as input
(boxes, scores, classes, num) = sess.run(
    [detection_boxes, detection_scores, detection_classes, num_detections],
    feed_dict={image_tensor: image_expanded})

# Draw the results of the detection (aka 'visualize the results')

vis_util.visualize_boxes_and_labels_on_image_array(
    image,
    np.squeeze(boxes),
    np.squeeze(classes).astype(np.int32),
    np.squeeze(scores),
    category_index,
    use_normalized_coordinates=True,
    line_thickness=8,
    min_score_thresh=0.80)

# All the results have been drawn on image. Now display the image.
cv2.imshow('Object detector', image)

# Press any key to close the image
cv2.waitKey(0)

# Clean up
cv2.destroyAllWindows()

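Now I need to write my own script that recognizes the cards and updates a counter for each card, which must be displayed on screen. This is the trickiest part, because I don't know where to start. I have a couple of problems at this step. First, the script must be able to distinguish cards that have already left the deck from new cards, so that it doesn't mess up the counter on every screenshot. Second, the counter should change by -1 for high cards (10 through ace), by +1 for low cards (2 through 6), and by 0 for neutral cards (7, 8, 9), and it must be visible on screen.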
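
The two requirements above (Hi-Lo values, and counting only cards that are new since the previous screenshot) can be sketched as follows. This is a minimal sketch under assumptions: the label names are assumed to match the detector's label map (for example `king`), and `new_cards` and `update_count` are hypothetical helper names. It compares the labels seen in two consecutive frames, so a card that stays on the table is counted once.

```python
from collections import Counter

# Hi-Lo values as described above: 10 through ace count -1,
# 2 through 6 count +1, and 7, 8, 9 are neutral.
# The label names are an assumption about the model's label map.
HI_LO = {
    "ace": -1, "king": -1, "queen": -1, "jack": -1, "ten": -1,
    "nine": 0, "eight": 0, "seven": 0,
    "six": 1, "five": 1, "four": 1, "three": 1, "two": 1,
}


def new_cards(previous, current):
    """Return a Counter of labels present in `current` but not in `previous`."""
    # Counter subtraction drops zero and negative counts, so cards that
    # disappeared from the table do not show up in the result.
    return Counter(current) - Counter(previous)


def update_count(count, previous, current):
    """Return the running count after adding only the newly seen cards."""
    added = new_cards(previous, current)
    return count + sum(HI_LO[label] * n for label, n in added.items())
```

One limitation of comparing label counts only: if a card is removed and another card of the same rank appears in the same frame, the new card is not noticed. Tracking box positions as well would be needed to cover that case.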

Edit 4: I have built a first version of the software, but there are some problems: the counter does not update correctly. Here is the code:

import pyscreenshot as ImageGrab
from win32api import GetSystemMetrics
import os
import cv2
import numpy as np
import tensorflow as tf
import sys

import warnings
import h5py

def UpdateCounter(labels, c):
    for i in labels:
        if labels['ace'] > 0:
            c = c - 1
        if labels['king'] > 0:
            c = c - 1
        if labels['queen'] > 0:
            c = c - 1
        if labels['jack'] > 0:
            c = c - 1
        if labels['ten'] > 0:
            c = c - 1
        if labels['six'] > 0:
            c = c + 1
        if labels['five'] > 0:
            c = c + 1
        if labels['four'] > 0:
            c = c + 1
        if labels['three'] > 0:
            c = c + 1
        if labels['two'] > 0:
            c = c + 1
        return c

if __name__ == '__main__':


    sys.path.append("..")
    from utils import label_map_util
    from utils import visualization_utils as vis_util

    MODEL_NAME = 'inference_graph'
    IMAGE_NAME = 'test1.jpg'
    CWD_PATH = os.getcwd()
    PATH_TO_CKPT = os.path.join(CWD_PATH,MODEL_NAME,'frozen_inference_graph.pb')

    PATH_TO_LABELS = os.path.join(CWD_PATH,'training','labelmap.pbtxt')
    PATH_TO_IMAGE = os.path.join(CWD_PATH,IMAGE_NAME)
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
    NUM_CLASSES = 13

    label_map = label_map_util.load_labelmap(PATH_TO_LABELS)

    categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)

    category_index = label_map_util.create_category_index(categories)

    detection_graph = tf.Graph()

    with detection_graph.as_default():

        od_graph_def = tf.GraphDef()

        with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')
            sess = tf.Session(graph=detection_graph)

    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')

    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')

    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')

    num_detections = detection_graph.get_tensor_by_name('num_detections:0')

    c = 0


    while True:
        labels = {"ace" : 0, "king": 0, "queen": 0, "jack": 0, "ten": 0, "nine": 0, "eight": 0,"seven": 0, "six": 0, "five": 0, "four":0, "three": 0, "two": 0}

        with warnings.catch_warnings():
            warnings.filterwarnings("ignore",category=FutureWarning)
            screenshot=ImageGrab.grab(bbox=(42,42, GetSystemMetrics(0),GetSystemMetrics(1)))
            screenshot.save(IMAGE_NAME)


        image = cv2.imread(PATH_TO_IMAGE)
        image_expanded = np.expand_dims(image, axis=0)

        (boxes, scores, classes, num) = sess.run(
            [detection_boxes, detection_scores, detection_classes, num_detections],
            feed_dict={image_tensor: image_expanded})

        data = [category_index.get(value) for index,value in enumerate(classes[0]) if scores[0,index] > 0.9]



        for ch in data:
            # Count each detected label that appears in the labels dict
            if ch['name'] in labels:
                labels[ch['name']] += 1

        print(UpdateCounter(labels, c))

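Could you tell me how to fix this? I only want the counter to be displayed when a new card is recognized, and I also need to fix the bad matches the program produces.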
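
Two things in the script above look likely to cause the wrong count: the value returned by UpdateCounter is printed but never assigned back to c, and the function returns on the first pass of its loop after checking only whether each label count is above zero. For the bad matches, one possible approach (an assumption, not something the original post confirms) is to trust a set of detections only after it has appeared unchanged in several consecutive frames. The sketch below uses a hypothetical `StableDetections` class whose `push` method returns True only when the trusted set changes, which is also the moment to redisplay the counter.

```python
from collections import Counter, deque


class StableDetections:
    """Trust a set of labels only after `needed` identical consecutive frames."""

    def __init__(self, needed=3):
        # Most recent per-frame label counts, oldest dropped automatically
        self.recent = deque(maxlen=needed)
        # Last label counts that passed the stability check
        self.stable = Counter()

    def push(self, labels):
        """Add one frame's labels; return True if the stable set changed."""
        self.recent.append(Counter(labels))
        full = len(self.recent) == self.recent.maxlen
        if full and all(frame == self.recent[0] for frame in self.recent):
            if self.recent[0] != self.stable:
                self.stable = self.recent[0]
                return True
        return False
```

In the main loop, the list of label names built from `data` would be passed to `push` once per screenshot, and the running count would be updated and printed only when it returns True. A detection that flickers in for a single frame never reaches `stable`, at the cost of a delay of `needed` frames before a real card is counted.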


1 Answer

Stack Overflow user

Answered 2018-09-23 00:03:21

I believe you can do this with Selenium, as you mentioned.

It would look something like this:

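from selenium import webdriver
import time

browser = webdriver.Chrome()
browser.get('https://www.888casino.it/giochi-da-casino/')

try:
    while True:
        browser.save_screenshot('screenie.png')
        # do the image processing...
        time.sleep(1)
except KeyboardInterrupt:
    # Ctrl+C stops the capture loop
    pass
finally:
    # Always close the browser, even if the loop is interrupted
    browser.quit()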

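For the image processing itself, you will face the problem of locating the elements you need in the image (in your case, the cards) and then processing each element individually. So you have a two-step task here.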

There is a TensorFlow Object Detection API that may come in handy: https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API
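
To process each detected card individually, the detector's boxes first have to be turned into pixel regions. The TensorFlow Object Detection API reports boxes as normalized `[ymin, xmin, ymax, xmax]` values between 0 and 1. The following is a minimal sketch of that conversion; `to_pixel_box` and `confident_boxes` are hypothetical helper names, and the 0.8 threshold is only an example value.

```python
def to_pixel_box(box, width, height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel
    coordinates (left, top, right, bottom)."""
    # Clamp to [0, 1] in case the detector returns slightly out-of-range values
    ymin, xmin, ymax, xmax = (min(1.0, max(0.0, float(v))) for v in box)
    return (int(xmin * width), int(ymin * height),
            int(xmax * width), int(ymax * height))


def confident_boxes(boxes, scores, width, height, min_score=0.8):
    """Return pixel boxes for detections whose score is at least `min_score`."""
    return [to_pixel_box(box, width, height)
            for box, score in zip(boxes, scores)
            if score >= min_score]
```

With an image loaded by OpenCV as a NumPy array, each card can then be cropped with `image[top:bottom, left:right]` and handled on its own.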

Good luck!

Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/52443426
