IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 3, NO. 4, OCTOBER 2018
DynaSLAM: Tracking, Mapping, and Inpainting in Dynamic Scenes
Abstract—The assumption of scene rigidity is typical in SLAM algorithms. Such a strong assumption limits the use of most visual SLAM systems in populated real-world environments, which are the target of several relevant applications like service robotics or autonomous vehicles. In this letter we present DynaSLAM, a visual SLAM system that, building on ORB-SLAM2, adds the capabilities of dynamic object detection and background inpainting. DynaSLAM is robust in dynamic scenarios for monocular, stereo, and RGB-D configurations. We are capable of detecting the moving objects either by multiview geometry, deep learning, or both. Having a static map of the scene allows inpainting the frame background that has been occluded by such dynamic objects. We evaluate our system in public monocular, stereo, and RGB-D datasets. We study the impact of several accuracy/speed trade-offs to assess the limits of the proposed methodology. DynaSLAM is more accurate than standard visual SLAM baselines in highly dynamic scenarios. It also estimates a map of the static parts of the scene, which is a must for long-term applications in real-world environments.
Index Terms—SLAM, visual-based navigation, localization.
I. INTRODUCTION
Simultaneous Localization and Mapping (SLAM) is a prerequisite for many robotic applications, for example collision-free navigation. SLAM techniques jointly estimate a map of an unknown environment and the robot pose within that map, using only the data streams of the robot's on-board sensors. The map allows the robot to continually localize within the same environment without accumulating drift. This is in contrast to odometry approaches, which integrate the incremental motion estimated within a local window and are unable to correct the drift when revisiting places.
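The contrast between odometry and map-based localization can be illustrated with a toy example. The sketch below is not taken from the paper: the one-dimensional robot, the constant per-step bias, and the idealised landmark observation are all assumptions chosen only to show that integrated increments accumulate error, while a re-observed map landmark bounds it.

```python
def dead_reckon(increments, start=0.0):
    """Integrate incremental motion estimates; errors are never corrected."""
    position = start
    trajectory = [position]
    for delta in increments:
        position += delta
        trajectory.append(position)
    return trajectory


# Hypothetical 1D run: 50 steps forward and 50 steps back, so the robot
# ends where it started. Every measured increment carries a +0.01 bias.
BIAS = 0.01
true_increments = [1.0] * 50 + [-1.0] * 50
measured_increments = [delta + BIAS for delta in true_increments]

odometry_estimate = dead_reckon(measured_increments)[-1]
drift = abs(odometry_estimate - 0.0)  # the true final position is 0.0

# With a map, revisiting the start lets the robot observe a mapped landmark
# and replace the drifted estimate (the observation is idealised as exact).
LANDMARK_POSITION = 0.0
measured_offset_to_landmark = 0.0  # the robot is standing at the landmark
map_corrected_estimate = LANDMARK_POSITION - measured_offset_to_landmark
```

In this toy setting the odometry error grows linearly with the number of steps, whereas the map-based estimate does not depend on how long the robot has been moving.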
Visual SLAM, where the main sensor is a camera, has received considerable attention and research effort in recent years. The minimalistic solution of a monocular camera has practical advantages with respect to size, power and cost, but also poses several challenges, such as the unobservability of the scale or state initialization. By using more complex setups, like stereo or RGB-D cameras, these issues are solved and the robustness of visual SLAM systems can be greatly improved.
The research community has addressed SLAM from many different angles. However, the vast majority of the approaches and datasets assume a static environment. As a consequence, they can only manage small fractions of dynamic content by classifying them as outliers to such a static model. Although the static assumption holds for some robotic applications, it limits the applicability of visual SLAM in many relevant cases, such as intelligent autonomous systems operating in populated real-world environments over long periods of time.
Visual SLAM can be classified into feature-based methods [2], [3], which rely on matching salient points and can only estimate a sparse reconstruction; and direct methods [4]–[6], which are able in principle to estimate a completely dense reconstruction by directly minimizing the photometric error with total-variation (TV) regularization. Some direct methods focus on the high-gradient areas, estimating semi-dense maps [7], [8].
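As an illustration of what minimizing the photometric error with TV regularization means, a generic dense energy can be written as follows. This is a sketch of the common form of such objectives, not the exact formulation of [4]–[6], and all symbols are assumptions introduced here: D is the depth map being estimated, I_r and I_k are a reference image and a second image, T_{kr} is the relative camera pose, \pi is the camera projection function, \Omega is the set of pixels, and \lambda weights the regularizer.

```latex
E(D) = \sum_{\mathbf{u} \in \Omega}
         \bigl| I_r(\mathbf{u})
              - I_k\bigl(\pi\bigl(T_{kr}\,\pi^{-1}(\mathbf{u}, D(\mathbf{u}))\bigr)\bigr) \bigr|
     + \lambda \sum_{\mathbf{u} \in \Omega} \lVert \nabla D(\mathbf{u}) \rVert
```

The first sum is the photometric error: each pixel is back-projected with its depth, transformed into the other view, and its intensity compared. The second sum is the total variation of the depth map, which favors piecewise-smooth reconstructions.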
None of the above methods, considered the state of the art, addresses the very common problem of dynamic objects in the scene, e.g., people walking, bicycles or cars. Detecting and dealing with dynamic objects in visual SLAM poses several challenges for both mapping and tracking, including:
1) How to detect such dynamic objects in the images to:
a) Prevent the tracking algorithm from using matches that belong to dynamic objects.
b) Prevent the mapping algorithm from including moving objects as part of the 3D map.
2) How to complete the part of the 3D map that is temporarily occluded by a moving object.
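Challenge 1a above can be sketched as a filtering step applied before tracking. The code below is a minimal illustration, not DynaSLAM's implementation: the function name, the (x, y) keypoint tuples, and the boolean per-pixel mask are assumptions. How the mask is obtained (multi-view geometry, deep learning, or both, as the abstract describes) is outside the scope of the sketch.

```python
def filter_static_keypoints(keypoints, dynamic_mask):
    """Return the keypoints that do not fall on pixels flagged as dynamic.

    keypoints: iterable of (x, y) pixel coordinates.
    dynamic_mask: list of rows of booleans, True where the pixel is
        assumed to belong to a dynamic object.
    """
    height = len(dynamic_mask)
    width = len(dynamic_mask[0]) if height else 0
    static = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        if not (0 <= col < width and 0 <= row < height):
            continue  # keypoint lies outside the image; drop it
        if not dynamic_mask[row][col]:
            static.append((x, y))
    return static


# Hypothetical 4x6 image in which columns 2 and 3 are covered by a moving object.
mask = [[col in (2, 3) for col in range(6)] for _ in range(4)]
keypoints = [(0.0, 0.0), (2.2, 1.0), (5.0, 3.0), (3.4, 2.0)]
static_keypoints = filter_static_keypoints(keypoints, mask)
```

Only the surviving keypoints would then be passed to matching and pose estimation. The same mask could be used to keep dynamic pixels out of the map (challenge 1b).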
Many applications would greatly benefit from progress along these lines, among them augmented reality, autonomous vehicles, and medical imaging. All of them could, for instance, safely reuse maps from previous runs. Detecting and dealing with dynamic objects is a prerequisite for estimating stable maps that are useful in long-term applications. If the dynamic content is not detected, it becomes part of the 3D map, making the map harder to use for tracking or relocalization.
In this work we propose an on-line algorithm to deal