
Yes, the spatial dimensions (height and width) are reduced: the input is 16x16, H1 is 8x8 and H2 is 4x4.
Also see the first paragraph in the architecture section:
<img src="https://i.stack.imgur.com/Rct3v.jpg" alt="enter image description here" />
<a href="http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf" rel="nofollow noreferrer">Source</a>
In modern terms you would say that they use a stride of 2, which reduces the spatial dimensions accordingly.
EDIT (based on your comment)
The formula for the spatial output dimension $O$ of a (square shaped) convolutional layer is the following:
$$O = \frac{I - K + 2P}{S} + 1$$
where $I$ is the input size, $K$ the kernel size, $P$ the padding and $S$ the stride. Now you might think that in your example $O = \frac{16 - 5 + 2 \cdot 2}{2} + 1 = 8.5$ (assuming $P=2$).
But take a closer look at how it actually plays out when the 5x5 kernel of layer H1 scans the 16x16 input image with a stride of 2:
<a href="https://i.stack.imgur.com/dHSWz.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/dHSWz.png" alt="enter image description here" /></a>
As you can see from the light grey area, the padding that is actually required and used is not 2 on all sides. Instead, along both the width and the height, it is 2 on one side and 1 on the other, i.e. on average $(2+1)/2=1.5$.
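To make the asymmetric padding concrete, here is a minimal Python sketch that lists which input indices each kernel position covers along one axis. The helper name `window_spans` is made up for illustration, and it assumes the first window is centred on the first pixel (i.e. 2 padded cells on the leading side), which matches the 2-and-1 split described above but is not taken from the paper itself.

```python
def window_spans(kernel, stride, pad_before, n_outputs):
    """Return the (first, last) input index covered by each kernel position.

    Indices refer to the unpadded input, so negative values and values
    past the last input index fall into the padding region.
    """
    spans = []
    for i in range(n_outputs):
        first = i * stride - pad_before
        last = first + kernel - 1
        spans.append((first, last))
    return spans


input_size = 16
# Assumption: 5x5 kernel, stride 2, 2 padded cells before the first pixel,
# and 8 output positions along the axis (the H1 size).
spans = window_spans(kernel=5, stride=2, pad_before=2, n_outputs=8)

# Padding actually touched on each side of the axis.
pad_before_used = max(0, -spans[0][0])
pad_after_used = max(0, spans[-1][1] - (input_size - 1))

print(spans[0], spans[-1])              # (-2, 2) (12, 16)
print(pad_before_used, pad_after_used)  # 2 1
```

The last window ends at index 16, just one cell past the final input index 15, so only a single padded cell is needed on the trailing side.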
If you plug the average padding of 1.5 into the equation for the output size, you get $O = \frac{16 - 5 + 2 \cdot 1.5}{2} + 1 = 8$. Accordingly, the convolutional layer H1 has spatial dimensions of 8x8.
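The same arithmetic can be sketched in Python with the padding on each side given separately, so that $2P$ becomes the sum of the two sides. The function name `conv_output_size` is hypothetical, and the sketch assumes that H2 uses the same 5x5 kernel, stride 2 and 2-and-1 padding as H1. It uses integer (floor) division, which is how common deep learning frameworks resolve fractional results.

```python
def conv_output_size(input_size, kernel, stride, pad_before, pad_after):
    """Output length along one axis of a convolution.

    Same formula as above, with 2P replaced by pad_before + pad_after
    and a floor division in case the stride does not divide evenly.
    """
    return (input_size - kernel + pad_before + pad_after) // stride + 1


# Input 16x16 -> H1, with padding of 2 on one side and 1 on the other.
h1 = conv_output_size(16, kernel=5, stride=2, pad_before=2, pad_after=1)

# Assumption: H2 applies the same kernel, stride and padding to H1.
h2 = conv_output_size(h1, kernel=5, stride=2, pad_before=2, pad_after=1)

# Symmetric padding of 2 with floor division: the 8.5 rounds down to 8.
h1_symmetric = conv_output_size(16, kernel=5, stride=2, pad_before=2, pad_after=2)

print(h1, h2, h1_symmetric)  # 8 4 8
```

Note that symmetric padding of 2 combined with rounding down also gives 8, because the extra padded cell on the trailing side is never reached by any kernel position.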

I am currently reading the paper by LeCun et al. on handwritten zip code recognition. The figure below visualizes the CNN architecture, but I do not really understand how the connection between layer H1 and the input layer makes sense. If there are 12 kernels of size 5x5, shouldn't layer H1 be 12x144? Or is some downsampling taking place here too?

<a href="https://i.stack.imgur.com/hroZG.jpg" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/hroZG.jpg" alt="CNN Architecture by LeCun et al."></a>

How to explain the connection between the input layer and H1 of this CNN Architecture?

