
Q: What is the difference between the "BB regression algorithms used in R-CNN variants" and the "BB in YOLO" localization techniques?

Stack Overflow user

Asked 2019-04-05 00:16:06

Answers: 1, Views: 404, Followers: 0, Votes: 1

Question:

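What is the difference between the bounding boxes (BB) produced by "BB regression algorithms in region-based object detectors" and "bounding boxes in single-shot detectors"? Can they be used interchangeably, and if not, why?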

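While studying the variants of the R-CNN and YOLO object detection algorithms, I came across two major techniques for object detection: region-based (R-CNN) and sliding-window-based (YOLO).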

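Both families come in different variants (from complex to simple), but in the end they simply localize objects in the image using bounding boxes! I only want to focus on localization here (assuming classification is happening!), since that is most relevant to the question. Below is a brief explanation of my understanding:

Region-based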

- Here, we let the neural network predict continuous variables (the BB coordinates) and refer to that as regression.
- The regression that is defined (which is not linear at all) is just a CNN or one of its variants (all layers are differentiable). The outputs are four values (x, y, h, w), where (x, y) specifies the position of the left corner and (h, w) the height and width of the BB.
- To train this NN, a smooth L1 loss is used to learn the precise BB by penalizing outputs that are very different from the labeled (x, y, h, w) in the training set.
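The smooth L1 loss mentioned in the last bullet can be sketched in a few lines. This is a minimal illustration, not the code of any particular detector: the function name `smooth_l1`, the threshold `beta`, and the example boxes are assumptions made for the example.

```python
def smooth_l1(pred, target, beta=1.0):
    """Sum of smooth L1 losses over the four box values.

    beta is an assumed threshold where the loss switches from
    quadratic to linear.
    """
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta  # quadratic for small errors
        else:
            total += diff - 0.5 * beta         # linear for large errors
    return total


# Hypothetical predicted and labeled boxes in (x, y, h, w) form.
predicted = (0.50, 0.40, 0.30, 0.20)
labeled = (0.40, 0.40, 0.30, 2.20)
loss = smooth_l1(predicted, labeled)
print(loss)
```

The small error on x is penalized quadratically, while the large error on w only grows linearly, which keeps outlier boxes from dominating the gradient.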

Sliding-window based (convolutional implementation!)

- First, we divide the image into, say, 19\*19 grid cells.
- An object is assigned to a grid cell by taking the midpoint of the object and assigning the object to whichever grid cell contains that midpoint. So each object, even if it spans multiple grid cells, is assigned to only one of the 19 by 19 grid cells.
- Now, you take the two coordinates of this grid cell and calculate the precise BB (bx, by, bh, bw) for that object using some method such as 
- (bx, by, bh, bw) are relative to the grid cell, where bx & by are the center point and bh & bw are the height and width of the precise BB, i.e. the size of the bounding box is specified as a fraction of the size of the grid cell, so bh & bw can be > 1.
- There are multiple ways of calculating the precise BB specified in the paper.
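The grid-cell assignment and the cell-relative encoding described in these bullets can be sketched as follows. The sketch follows the description above rather than any specific YOLO version; the function name `encode_box`, the pixel-based inputs, and the 608 by 608 image size are assumptions made for illustration.

```python
def encode_box(cx, cy, h, w, img_h, img_w, grid=19):
    """Assign a box to the cell containing its midpoint and express it
    relative to that cell. Inputs are in pixels; (cx, cy) is the center."""
    cell_h = img_h / grid
    cell_w = img_w / grid
    # The cell that owns the object is the one containing the midpoint.
    row = min(int(cy // cell_h), grid - 1)
    col = min(int(cx // cell_w), grid - 1)
    # Center offset inside the owning cell, in [0, 1).
    bx = cx / cell_w - col
    by = cy / cell_h - row
    # Size measured in cell units, so it can exceed 1 for large objects.
    bh = h / cell_h
    bw = w / cell_w
    return row, col, (bx, by, bh, bw)


# Hypothetical object centered at (304, 152) in a 608x608 image.
row, col, box = encode_box(cx=304, cy=152, h=200, w=96, img_h=608, img_w=608)
print(row, col, box)
```

Each cell here is 32 pixels wide, so an object 200 pixels tall gets a height well above 1, which matches the remark that bh and bw can be > 1.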

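Both algorithms:

Output precise bounding boxes!
Work in a supervised learning setting: they use a labeled dataset where the labels are the bounding boxes for each image, stored in JSON/XML file format (marked manually by some annotator using a tool like labelImg).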
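As an illustration of what such a label file can look like, here is a minimal sketch that reads boxes from an XML annotation. It assumes a Pascal VOC style layout (`object`, `name`, `bndbox` with `xmin`/`ymin`/`xmax`/`ymax`); the sample content and the helper name `read_boxes` are made up for the example, and a real dataset may use a different schema.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in an assumed Pascal VOC style layout.
SAMPLE = """
<annotation>
  <filename>example.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        corners = tuple(
            int(bb.find(key).text) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.find("name").text, corners))
    return boxes


boxes = read_boxes(SAMPLE)
print(boxes)
```

Both detector families would start from corner-style labels like these and convert them into their own regression targets.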

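I am trying to understand these two localization techniques at a more abstract level (while also having an in-depth idea of both!) to get more clarity on:

In what sense are they different?
Why were two approaches created, i.e. what are the failure/success points of one over the other?
Can they be used interchangeably, and if not, why?

Please feel free to correct me if I am wrong somewhere; feedback is highly appreciated! Citing a particular section of a research paper would be even more valuable!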

conv-neural-network

object-detection

yolo

faster-rcnn

computer-vision

1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-04-07 10:59:31

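The essential difference is that two-stage Faster R-CNN-like detectors are more accurate, while single-stage YOLO/SSD-like detectors are faster.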

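In two-stage architectures, the first stage is usually region proposal, while the second stage performs classification and more accurate localization. You can think of the first stage as similar to a single-stage architecture, with the difference that the region proposal only separates "object" from "background", while a single-stage detector distinguishes between all object classes. More explicitly, in the first stage, also in a sliding-window-like fashion, an RPN says whether an object is present and, if so, roughly gives the region (bounding box) in which it lies. The second stage uses this region for classification and bounding-box regression (for better localization) by first pooling the relevant features from the proposed region and then passing them through a Fast R-CNN-like architecture (which does the classification + regression).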
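To make the second-stage "bounding-box regression for better localization" concrete, here is a minimal sketch of how regression outputs can refine a proposed region. It assumes the center/size offset parameterization commonly used in the R-CNN family (shift the center by a fraction of the box size, scale the size exponentially). The function name `apply_deltas`, the (cx, cy, w, h) ordering, and the numbers are illustrative assumptions, not taken from the answer.

```python
import math


def apply_deltas(proposal, deltas):
    """Refine a proposal (cx, cy, w, h) with regression outputs (tx, ty, tw, th).

    Assumed parameterization: center offsets are fractions of the proposal
    size, and width/height are scaled by exp(tw) and exp(th).
    """
    cx, cy, w, h = proposal
    tx, ty, tw, th = deltas
    return (
        cx + tx * w,       # shift center x by a fraction of the width
        cy + ty * h,       # shift center y by a fraction of the height
        w * math.exp(tw),  # rescale width
        h * math.exp(th),  # rescale height
    )


# Hypothetical rough proposal from the first stage and second-stage outputs.
proposal = (100.0, 80.0, 40.0, 20.0)
refined = apply_deltas(proposal, (0.1, -0.5, 0.0, math.log(2.0)))
print(refined)
```

Under this assumption the regression target is expressed relative to a proposal, whereas the grid-based scheme in the question expresses it relative to a fixed grid cell; that is one concrete sense in which the two box outputs differ.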

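Regarding your question about interchanging them: why would you want to do so? Usually you choose an architecture according to your most pressing needs (e.g. latency/power/accuracy), and you would not want to swap between them unless there is some sophisticated idea that would help you somehow.

Votes: 2

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link: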

https://stackoverflow.com/questions/55526607
