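According to the abstract of the AlexNet paper, the authors claim 60 million parameters:
The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
When I implement this model with Keras, I get about 25 million params.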
import tensorflow as tf

model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(96, 11, strides=4, activation="relu", input_shape=[227,227,3]),
tf.keras.layers.MaxPooling2D(pool_size=(3,3), strides=(2,2)),
tf.keras.layers.Conv2D(256, 5, activation="relu", padding="SAME"),
tf.keras.layers.MaxPooling2D(pool_size=(3,3), strides=(2,2)),
tf.keras.layers.Conv2D(384, 3, activation="relu", padding="SAME"),
tf.keras.layers.Conv2D(384, 3, activation="relu", padding="SAME"),
tf.keras.layers.Conv2D(256, 3, activation="relu", padding="SAME"),
tf.keras.layers.Dense(4096, activation="relu"),
tf.keras.layers.Dense(4096, activation="relu"),
tf.keras.layers.Dense(1000, activation="softmax"),
])
Note that I removed the normalization and set the input to 227*227 instead of 224*224. See this question for details.
Here is the Keras summary:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 55, 55, 96) 34944
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 27, 27, 96) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 27, 27, 256) 614656
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 256) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 13, 13, 384) 885120
_________________________________________________________________
conv2d_3 (Conv2D) (None, 13, 13, 384) 1327488
_________________________________________________________________
conv2d_4 (Conv2D) (None, 13, 13, 256) 884992
_________________________________________________________________
dense (Dense) (None, 13, 13, 4096) 1052672
_________________________________________________________________
dense_1 (Dense) (None, 13, 13, 4096) 16781312
_________________________________________________________________
dense_2 (Dense) (None, 13, 13, 1000) 4097000
=================================================================
Total params: 25,678,184
Trainable params: 25,678,184
Non-trainable params: 0
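_________________________________________________________________
I am still far from 60 million. So how did they get 60 million params?
For reference, here is the model's architecture as described in Sec. 3.5 of the paper:
The first convolutional layer filters the 224x224x3 input image with 96 kernels of size 11x11x3 with a stride of 4 pixels (this is the distance between the receptive field centers of neighboring neurons in a kernel map). The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5x5x48. The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling or normalization layers. The third convolutional layer has 384 kernels of size 3x3x256 connected to the (normalized, pooled) outputs of the second convolutional layer. The fourth convolutional layer has 384 kernels of size 3x3x192, and the fifth convolutional layer has 256 kernels of size 3x3x192. The fully-connected layers have 4096 neurons each.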
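As a rough cross-check, the kernel sizes quoted above can be turned into a parameter count by hand. This is only a sketch: the helper names `conv_params` and `dense_params` are made up for illustration, and it assumes (the quoted passage does not say so) that the fifth convolutional output is max-pooled to 6x6x256 before the first fully connected layer. The kernel depths of 48 and 192 are taken as quoted.

```python
# Parameter count derived from the kernel sizes quoted from Sec. 3.5.
# Assumption (not in the quoted text): the fifth conv output is max-pooled
# to 6x6x256 before it reaches the first fully connected layer.

def conv_params(kh, kw, in_ch, filters):
    """Kernel weights plus one bias per filter (hypothetical helper)."""
    return kh * kw * in_ch * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit (hypothetical helper)."""
    return inputs * units + units

paper_layers = {
    "conv1": conv_params(11, 11, 3, 96),
    "conv2": conv_params(5, 5, 48, 256),
    "conv3": conv_params(3, 3, 256, 384),
    "conv4": conv_params(3, 3, 192, 384),
    "conv5": conv_params(3, 3, 192, 256),
    "fc6": dense_params(6 * 6 * 256, 4096),
    "fc7": dense_params(4096, 4096),
    "fc8": dense_params(4096, 1000),
}
paper_total = sum(paper_layers.values())

for name, count in paper_layers.items():
    print(f"{name}: {count:,}")
print(f"total: {paper_total:,}")
```

Under these assumptions the total comes to roughly 61 million, and the three fully connected layers account for most of it.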
Posted on 2020-10-26 09:36:29
I forgot to flatten between the last Conv2D layer and the first fully connected layer.
import tensorflow as tf

model = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(96, 11, strides=4, activation="relu", input_shape=[227,227,3]),
tf.keras.layers.MaxPooling2D(pool_size=(3,3), strides=(2,2)),
tf.keras.layers.Conv2D(256, 5, activation="relu", padding="SAME"),
tf.keras.layers.MaxPooling2D(pool_size=(3,3), strides=(2,2)),
tf.keras.layers.Conv2D(384, 3, activation="relu", padding="SAME"),
tf.keras.layers.Conv2D(384, 3, activation="relu", padding="SAME"),
tf.keras.layers.Conv2D(256, 3, activation="relu", padding="SAME"),
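tf.keras.layers.MaxPooling2D(pool_size=(3,3), strides=(2,2)), # third max-pool: 13x13x256 -> 6x6x256, matching the 37,752,832 dense params in the summary
tf.keras.layers.Flatten(), # <-- This layer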
tf.keras.layers.Dense(4096, activation="relu"),
tf.keras.layers.Dense(4096, activation="relu"),
tf.keras.layers.Dense(1000, activation="softmax"),
])
Once it is added, I get 62 million params:
Model: "alex_net"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) multiple 34944
_________________________________________________________________
conv2d_1 (Conv2D) multiple 614656
_________________________________________________________________
conv2d_2 (Conv2D) multiple 885120
_________________________________________________________________
conv2d_3 (Conv2D) multiple 1327488
_________________________________________________________________
conv2d_4 (Conv2D) multiple 884992
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple 0
_________________________________________________________________
flatten (Flatten) multiple 0
_________________________________________________________________
dense (Dense) multiple 37752832
_________________________________________________________________
dense_1 (Dense) multiple 16781312
_________________________________________________________________
dense_2 (Dense) multiple 4097000
=================================================================
Total params: 62,378,344
Trainable params: 62,378,344
Non-trainable params: 0
_________________________________________________________________
Even though this was my own mistake, I am leaving this here for understanding purposes.
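The gap between the two summaries can be reproduced with plain arithmetic. The sketch below is not Keras code. `dense_params` is a hypothetical helper, and the per-layer convolution counts are copied from the summaries. It assumes the first Dense layer sees a 6x6x256 feature map (9216 values) once flattened, which is what the 37,752,832 figure implies. Without Flatten, Keras applies Dense to the last axis only, so it sees just the 256 channels.

```python
# Hypothetical helper for illustration; not part of Keras.
def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

# Conv2D parameter counts copied from the Keras summaries.
conv_total = 34944 + 614656 + 885120 + 1327488 + 884992

# The last two Dense layers are identical in both models.
tail = dense_params(4096, 4096) + dense_params(4096, 1000)

# Without Flatten, Dense acts on the last axis only (256 channels).
without_flatten = conv_total + dense_params(256, 4096) + tail

# With Flatten after a 6x6x256 feature map, Dense sees 9216 inputs.
with_flatten = conv_total + dense_params(6 * 6 * 256, 4096) + tail

print(f"without Flatten: {without_flatten:,}")
print(f"with Flatten:    {with_flatten:,}")
```

The two printed totals match the 25,678,184 and 62,378,344 reported in the summaries, so the whole difference sits in the first Dense layer.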
https://datascience.stackexchange.com/questions/84492