Yes, there have been many attempts, but perhaps the most notable is the approach described in the paper by <a href="https://arxiv.org/abs/1412.2306" rel="noreferrer">Andrej Karpathy and Li Fei-Fei</a>, where they connect a CNN and an RNN in series (CNN over image regions + bidirectional RNN + multimodal RNN) and use this to label a scene with a whole sentence. However, this is more than just object detection, as it leverages a dataset of scenes and their descriptions to generate natural-language descriptions of new, unseen images.

Another example is <a href="http://ieeexplore.ieee.org/document/7298958/" rel="noreferrer">Ming Liang and Xiaolin Hu's approach</a>, where they combine a CNN with an RNN and use this architecture for better object detection. As the authors explain in their paper (linked above), the RNN is used to improve the CNN:

<blockquote>
 A prominent difference is that CNN is typically a feed-forward architecture while in the visual system recurrent connections are abundant. Inspired by this fact, we propose a recurrent CNN (RCNN) for object recognition by incorporating recurrent connections into each convolutional layer.
</blockquote>
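
To make the quoted idea a bit more concrete, here is a minimal sketch of a recurrent convolutional layer. It is a simplification under stated assumptions, not the authors' implementation: it is 1D and single-channel, uses plain Python lists, and leaves out normalization. The function names (`conv1d_same`, `recurrent_conv_layer`) and the choice to start with a feed-forward-only pass at step 0 are my own for illustration. The feed-forward input `u` stays fixed while the layer's own output is fed back through a second kernel for a few steps.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that keeps the input length (odd kernel size)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def relu(values):
    return [max(0.0, v) for v in values]


def recurrent_conv_layer(u, w_forward, w_recurrent, bias=0.0, steps=3):
    """Unroll x(t) = relu(w_forward * u + w_recurrent * x(t-1) + bias).

    Step 0 is assumed to be feed-forward only: x(0) = relu(w_forward * u + bias).
    """
    feed_forward = conv1d_same(u, w_forward)  # computed once, reused at every step
    x = relu([f + bias for f in feed_forward])
    for _ in range(steps):
        recurrent = conv1d_same(x, w_recurrent)  # feedback from the layer's own output
        x = relu([f + r + bias for f, r in zip(feed_forward, recurrent)])
    return x


if __name__ == "__main__":
    impulse = [0, 0, 1, 0, 0]
    for t in range(3):
        print(t, recurrent_conv_layer(impulse, [0, 1, 0], [1, 0, 1], steps=t))
```

With an impulse as input, each extra step spreads the response one position further, so every unit ends up depending on a wider region of the input even though the kernels stay the same size.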

Actually, I do not think using only an RNN for object detection is a good approach, because, unlike a CNN, an RNN has no concept of a "receptive field", which I think is a key point in vision-related tasks.
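
For readers unfamiliar with the term, the receptive field of a unit is the patch of input pixels that can influence it. As a rough illustration (a sketch assuming plain stacked convolutions without dilation; the helper name `receptive_field` is hypothetical), it can be computed layer by layer from kernel sizes and strides:

```python
def receptive_field(layers):
    """Receptive field size (in input pixels) of one unit after a stack of conv layers.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between neighbouring units of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


if __name__ == "__main__":
    print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
    print(receptive_field([(7, 2), (3, 2)]))          # 11
```

So three stacked 3x3 convolutions with stride 1 cover a 7x7 input patch per output unit. A plain RNN over a flattened sequence has no comparable built-in notion of spatial locality.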

Recurrent neural networks (RNNs) are a state-of-the-art approach for sequential data, and long short-term memory (LSTM) networks are an extension of RNNs. This approach can be used for object detection in videos or other moving images. You can try <a href="https://github.com/tensorflow/models/tree/master/research/lstm_object_detection" rel="nofollow noreferrer">https://github.com/tensorflow/models/tree/master/research/lstm_object_detection</a>. It is the TensorFlow implementation of the mobile video object detection method proposed in the following paper: Mobile Video Object Detection with Temporally-Aware Feature Maps (CVPR 2018). Link to the paper: <a href="http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_Mobile_Video_Object_CVPR_2018_paper.pdf" rel="nofollow noreferrer">http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_Mobile_Video_Object_CVPR_2018_paper.pdf</a>
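
To illustrate the general idea of carrying state across frames, here is a minimal pure-Python sketch of an LSTM run over per-frame feature vectors. It is not the code from the linked repository or the paper's architecture: it uses a plain vector LSTM with random, untrained weights, and all names (`lstm_step`, `init_params`, `run_over_frames`) are made up for this example. In a real detector the per-frame features would come from a CNN and the LSTM output would feed the detection head.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_size, hidden_size, seed=0):
    """Random (untrained) weights for the input, forget, output and candidate gates."""
    rng = random.Random(seed)

    def matrix():
        return [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]

    return {name: (matrix(), [0.0] * hidden_size) for name in ("i", "f", "o", "g")}


def lstm_step(x, h, c, params):
    """One LSTM step: x is the current frame's feature vector, (h, c) the carried state."""
    z = x + h  # list concatenation: [x; h]
    pre = {}
    for name, (weights, biases) in params.items():
        pre[name] = [sum(w * v for w, v in zip(row, z)) + b
                     for row, b in zip(weights, biases)]
    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    o = [sigmoid(v) for v in pre["o"]]    # output gate
    g = [math.tanh(v) for v in pre["g"]]  # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_over_frames(frame_features, hidden_size, seed=0):
    """Feed per-frame feature vectors through the LSTM, keeping state between frames."""
    params = init_params(len(frame_features[0]), hidden_size, seed)
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    outputs = []
    for x in frame_features:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs


if __name__ == "__main__":
    frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # stand-ins for CNN features
    for t, out in enumerate(run_over_frames(frames, hidden_size=3)):
        print(t, [round(v, 3) for v in out])
```

Note that the first two frames have identical features but produce different outputs, because the second output also depends on the state carried over from the first frame. That temporal context is what a per-frame detector lacks.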

Yes, there is a lot of literature on object detection using RNNs; it often involves object detection and tracking in videos, or action detection.

Some papers:

<ul>
<li><a href="http://openaccess.thecvf.com/content_ICCV_2017/papers/Lu__Online_Video_ICCV_2017_paper.pdf" rel="nofollow noreferrer">"Online Video Object Detection Using Association LSTM", 2017, Lu et al.</a></li>
<li><a href="http://arxiv.org/abs/1705.06368" rel="nofollow noreferrer">"Re3 : Real-Time Recurrent Regression Networks for Visual Tracking of Generic Objects", 2017, Gordon et al.</a></li>
<li><a href="http://arxiv.org/abs/1604.03635" rel="nofollow noreferrer">"Online Multi-Target Tracking Using Recurrent Neural Networks", 2016, Milan et al.</a></li>
<li><a href="https://ieeexplore.ieee.org/document/8121994" rel="nofollow noreferrer">"Action Recognition in Video Sequences using Deep Bi-Directional LSTM With CNN Features", 2018, Ullah et al.</a></li>
</ul>

After reading about the state of the art in object detection using CNNs (R-CNN, Faster R-CNN, YOLO, SSD, ...), I was wondering whether there is a method that uses RNNs, or that combines CNNs and RNNs, for object detection?
Thank you

Is there any RNN method used for object detection?
