I followed the code from this site:
https://blog.luisfred.com.br/reconhecimento-de-escrita-manual-com-redes-neurais-convolucionais/

Here is the code the site walks through:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
import numpy as np
from matplotlib import pyplot as plt
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
K.set_image_dim_ordering('th')
import cv2
# %matplotlib inline  # If you are using Jupyter, this is useful for plotting figures inside cells
# Split the data into training and test subsets.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Since we are working in grayscale we can set the channel depth to 1.
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
# Normalize the data according to the grayscale range, so the floating point
# values lie in [0, 1] instead of [0, 255].
X_train = X_train / 255
X_test = X_test / 255
# Convert y_train and y_test from class vectors to binary class matrices (one-hot vectors).
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
# Number of digit classes in MNIST: here 10, corresponding to (0,1,2,3,4,5,6,7,8,9).
num_classes = y_test.shape[1]
def deeper_cnn_model():
    model = Sequential()
    # Conv2D is our input layer: 30 feature maps of size 5x5 with a ReLU activation.
    model.add(Conv2D(30, (5, 5), input_shape=(1, 28, 28), activation='relu'))
    # MaxPooling2D is our second layer, with a 2x2 pooling window.
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # Another convolutional layer, with 15 feature maps of size 3x3 and ReLU activation.
    model.add(Conv2D(15, (3, 3), activation='relu'))
    # Another subsampling step with 2x2 pooling.
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # Dropout with a 20% probability (you can try other values).
    model.add(Dropout(0.2))
    # Flatten the output of the convolutional layers into a single long feature
    # vector, so it can be used as input to the densely connected layer that follows.
    model.add(Flatten())
    # Fully connected layer with 128 neurons.
    model.add(Dense(128, activation='relu'))
    # Followed by a fully connected layer with 64 neurons,
    model.add(Dense(64, activation='relu'))
    # and another fully connected layer with 32 neurons.
    model.add(Dense(32, activation='relu'))
    # The output layer has one neuron per class to be predicted.
    # Notice that we are using a softmax activation function.
    model.add(Dense(num_classes, activation='softmax', name='preds'))
    # Configure the entire training process of the neural network.
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
model = deeper_cnn_model()
model.summary()
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200)
scores = model.evaluate(X_test, y_test, verbose=0)
print("\nacc: %.2f%%" % (scores[1] * 100))
### Check hand-drawn digits after the training is done
img_pred = cv2.imread('five.JPG', 0)
plt.imshow(img_pred, cmap='gray')
# Force the image to the input dimensions used for the training data (28x28).
if img_pred.shape != (28, 28):
    img_pred = cv2.resize(img_pred, (28, 28))
img_pred = img_pred.reshape(28, 28, -1)
# Here we also set depth = 1 plus the number of rows and columns, matching the 28x28 image.
img_pred = img_pred.reshape(1, 1, 28, 28)
pred = model.predict_classes(img_pred)
pred_proba = model.predict_proba(img_pred)
pred_proba = "%.2f%%" % (pred_proba[0][pred] * 100)
print(pred[0], "with probability of", pred_proba)

Finally, I tried to predict the digit 5 that I drew and imported (I also tried other hand-drawn digits, with equally poor results):
img_pred = cv2.imread('five.JPG', 0)
plt.imshow(img_pred, cmap='gray')
# Force the image to the input dimensions used for the training data (28x28).
if img_pred.shape != (28, 28):
    img_pred = cv2.resize(img_pred, (28, 28))
img_pred = img_pred.reshape(28, 28, -1)
# Here we also set depth = 1 plus the number of rows and columns, matching the 28x28 image.
img_pred = img_pred.reshape(1, 1, 28, 28)
pred = model.predict_classes(img_pred)
pred_proba = model.predict_proba(img_pred)
pred_proba = "%.2f%%" % (pred_proba[0][pred] * 100)
print(pred[0], "with probability of", pred_proba)

Here is what five.jpg looks like:
But when I feed in my own digit, the model's prediction is wrong. Any idea why this happens? I admit I am new to ML and just starting to experiment with it. My thought is that maybe the centering of the image, or the normalization of the image, is off? Any help is greatly appreciated!
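On the centering suspicion: MNIST digits are centered by pixel center of mass in the 28x28 field, so one experiment is to re-center a hand-drawn digit the same way. A minimal numpy sketch (`center_by_mass` is a hypothetical helper, not from the tutorial; it assumes the MNIST convention of a zero background with bright ink):

```python
import numpy as np

def center_by_mass(img):
    """Shift a grayscale digit so its pixel center of mass lands on
    the image center, as MNIST digits are centered.
    Assumes background is 0 and the ink is bright."""
    total = img.sum()
    if total == 0:
        return img
    rows, cols = np.indices(img.shape)
    cy = (rows * img).sum() / total
    cx = (cols * img).sum() / total
    shift_y = int(round(img.shape[0] / 2 - cy))
    shift_x = int(round(img.shape[1] / 2 - cx))
    # np.roll keeps all the ink, just translated
    return np.roll(np.roll(img, shift_y, axis=0), shift_x, axis=1)

# A digit drawn in the top-left corner moves toward the center
img = np.zeros((28, 28))
img[2:8, 2:8] = 255.0
centered = center_by_mass(img)
```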
Edit1:
The MNIST test digits look like this:
Posted on 2018-04-09 22:05:49
It looks like you have two issues, and as you suspected, both are related to the preprocessing of the data.

First, your image is inverted relative to the training data:
After img_pred = cv2.imread('five.JPG', 0), the background pixels are close to white, with values around 215-238. In the training data in X_train, the background pixels are all zero, and the digits are white or near-white (210-255 at the upper end). Try plotting your image next to a few selections from X_train and you will see that they are inverted relative to each other.
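A quick polarity check and fix can be done in numpy alone; a sketch with a toy "bright background, dark digit" image standing in for the loaded photo:

```python
import numpy as np

# Toy stand-in for the loaded photo: bright background, dark strokes
img_pred = np.full((28, 28), 230, dtype=np.uint8)
img_pred[10:18, 12:16] = 20

# If the mean is high, the background dominates and is bright, so the
# image is inverted relative to MNIST (zero background, bright digit)
if img_pred.mean() > 127:
    img_pred = 255 - img_pred
```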
The other issue is that the default interpolation in cv2.resize() does not preserve the scaling of the data. After resizing, the minimum value jumps to 60 rather than 0. Compare the values of img_pred.min() and img_pred.max() before and after the resizing step.
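The jump in the minimum can be reproduced without cv2: when a stroke is thinner than the resampling window, every output pixel averages ink with background. A 1-D numpy analogue of area interpolation (cv2.resize defaults to bilinear, but the mixing effect is the same):

```python
import numpy as np

# Bright background (230) with a thin dark stroke (20), as in a photo
row = np.full(280, 230.0)
row[133:137] = 20.0

# Downsample by averaging blocks of 10; no output pixel is pure stroke,
# so the minimum jumps well above the original 20
small = row.reshape(28, 10).mean(axis=1)
print(row.min(), small.min())
```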
You can invert and rescale the data so it looks more like the MNIST input data with a function like this:
def mnist_bytescale(image):
    # Use float for rescaling
    img_temp = image.astype(np.float32)
    # Re-zero the data
    img_temp -= img_temp.min()
    # Re-scale and invert
    img_temp /= (img_temp.max() - img_temp.min())
    img_temp *= 255
    return 255 - img_temp.astype('uint8')

This flips your data and linearly scales it from 0 to 255, like the data the network is training on. However, if you plot mnist_bytescale(img_pred), you will notice that the background level in most pixels is still not 0, because the background level of the original image is not constant (probably due to JPEG compression). If your network still has trouble with this flipped and scaled data, you can try using np.clip to zero out the background level and see if that helps.
https://stackoverflow.com/questions/49741671