Deformable Surface 3D Reconstruction from Monocular Images, available as a paperback from €63.99.
This book presents a hardware architecture for the Simultaneous Localization And Mapping (SLAM) problem applied to embedded robots. The architecture is composed of highly specialized modules for robot localization and feature-based map building from images obtained directly from CMOS cameras in real time. The system is completely embedded on a Field-Programmable Gate Array (FPGA) device, where several hardware-oriented optimizations are exploited. The main modules of the architecture are the Extended Kalman Filter (EKF) and the feature detection system based on the SIFT (Scale Invariant Feature Transform) algorithm. Additionally, the book presents basic concepts about mapping and state-of-the-art algorithms for SLAM with monocular and stereo vision.
Recovery of dense geometry and camera motion from a set of monocular images is a well-known problem that can be solved quite reliably in well-conditioned environments. Typical algorithms dealing with this problem assume static lighting and the presence of sufficient scene texture. There are, however, many situations where these prerequisites are not met and common algorithms fail. One example is medical video-endoscopy, where surfaces do not exhibit much texture and lighting conditions change due to the moving light source mounted on the camera. We propose to address the problem with a purely intensity-based approach that also accounts for changes in lighting conditions. In this thesis, we investigate the applicability of sliding-window, intensity-based bundle-adjustment methods to this problem.
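A common way to make intensity-based alignment robust to such lighting changes is a per-frame affine gain/bias model. The sketch below illustrates that general idea only; the function names and the closed-form least-squares fit are assumptions for illustration, not the thesis' actual formulation:

```python
import numpy as np

def photometric_residual(i_ref, i_cur, a, b):
    """Intensity residual under an affine lighting model: i_cur ~ a * i_ref + b."""
    return i_cur - (a * i_ref + b)

def fit_lighting(i_ref, i_cur):
    """Least-squares estimate of the per-frame gain a and bias b."""
    A = np.stack([i_ref, np.ones_like(i_ref)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, i_cur, rcond=None)
    return a, b
```

In a full intensity-based bundle adjustment, this residual would be evaluated at warped pixel locations and minimized jointly over poses, geometry, and the lighting parameters.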
Networked 3D virtual environments allow multiple users to interact with each other over the Internet. Users can share some sense of telepresence by remotely animating an avatar that represents them. However, avatar control may be tedious and still render user gestures poorly. This work aims at animating a user's avatar from real-time 3D motion capture by monocular computer vision, thus allowing virtual telepresence to anyone using a personal computer with a webcam. The approach followed consists of registering a 3D articulated upper-body model to a video sequence. The first contribution of this work is a method of allocating computing iterations under a real-time constraint that achieves optimal robustness and accuracy. The major issue for robust 3D tracking from monocular images is the 3D/2D ambiguity that results from the lack of depth information. As a second contribution, this work enhances particle filtering for 3D/2D registration under limited computation constraints with a number of heuristics, whose contribution is demonstrated experimentally. A parameterization of the arm pose based on its end-effector is proposed to better model uncertainty in the depth direction.
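The particle filter mentioned above maintains a set of pose hypotheses and repeatedly predicts, weights, and resamples them. The following is a generic single-step sketch; the unstructured state vector and Gaussian diffusion are illustrative simplifications of articulated upper-body tracking:

```python
import numpy as np

def particle_filter_step(particles, motion_noise, likelihood, rng):
    """One predict/weight/resample cycle of a particle filter."""
    # Predict: diffuse each pose hypothesis with random motion noise
    particles = particles + rng.normal(0.0, motion_noise, particles.shape)
    # Weight: score every hypothesis against the current image observation
    weights = np.array([likelihood(p) for p in particles])
    weights /= weights.sum()
    # Resample: draw a new set of hypotheses proportionally to the weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]
```

Under a fixed computation budget, the number of particles and iterations per frame is exactly the quantity the work's allocation method trades off against accuracy.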
Scene interpretation is a fundamental task in both computer vision and robotics. We deal with two important aspects of scene interpretation: scene reconstruction and scene recognition. Scene reconstruction is the task of determining the 3D positions of world points and recovering camera poses from images. It has several applications, such as virtual building editing, video augmentation, and planning and navigation in robotics. Among the several approaches to modeling a scene, we adopt piecewise planar modeling because of its advantages. We propose a convex-optimization-based approach for piecewise planar reconstruction. Scene recognition in robotics, specifically terrain scene recognition, is one of the fundamental tasks of autonomous navigation. Navigable terrains are examples of planar scenes. Terrain recognition has applications in various domains, such as advanced driver assistance systems and remote sensing. Various sensing modalities, such as ladars, lasers, accelerometers, stereo cameras, or combinations of them, are used in the literature. We propose an algorithm that relies purely on a single monocular camera.
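As a toy illustration of the planar-modeling idea, fitting a single plane to 3D points can be done with an SVD of the centered point cloud; this is a generic least-squares sketch, not the convex-optimization method proposed in the work:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through Nx3 points: returns (unit normal n, offset d)
    such that n . x = d for points on the plane."""
    centroid = points.mean(axis=0)
    # The plane normal is the direction of least variance of the centered cloud
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, float(normal @ centroid)
```

A piecewise planar model would segment the point cloud into regions and fit one such plane per region, subject to consistency constraints.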
High Quality Content by WIKIPEDIA articles! Stereopsis (from stereo meaning solidity, and opsis meaning vision or sight) is the process in visual perception leading to the sensation of depth from the two slightly different projections of the world onto the retinas of the two eyes. The differences in the two retinal images are called horizontal disparity, retinal disparity, or binocular disparity. The differences arise from the eyes' different positions in the head. Stereopsis is commonly referred to as depth perception. This is inaccurate, as depth perception relies on many more monocular cues than stereoptical ones, and individuals with only one functional eye still have full depth perception except in artificial cases (such as stereoscopic images) where only binocular cues are present.
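For a rectified camera pair, the artificial counterpart of the two eyes, horizontal disparity converts to depth by simple triangulation: Z = f·B/d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity in pixels. A one-line sketch (the numbers in the test are illustrative):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulated depth from horizontal disparity: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px
```

The inverse relation between depth and disparity is why stereo depth resolution degrades quickly with distance, both for cameras and for the eyes.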
This accessible textbook presents an introduction to computer vision algorithms for industrially relevant applications of X-ray testing. Features: introduces the mathematical background for monocular and multiple-view geometry; describes the main techniques for image processing used in X-ray testing; presents a range of different representations for X-ray images, explaining how these enable new features to be extracted from the original image; examines a range of known X-ray image classifiers and classification strategies; discusses some basic concepts for the simulation of X-ray images and presents simple geometric and imaging models that can be used in the simulation; reviews a variety of applications for X-ray testing, from industrial inspection and baggage screening to the quality control of natural products; provides supporting material at an associated website, including a database of X-ray images and a Matlab toolbox for use with the book's many examples.
This monograph introduces novel responses to the different problems that arise when multiple robots need to execute a task in cooperation, each robot in the team having a monocular camera as its primary input sensor. Its central proposition is that a consistent perception of the world is crucial for the good development of any multi-robot application. The text focuses on the high-level problem of cooperative perception by a multi-robot system: the idea that, depending on what each robot sees and its current situation, it will need to communicate with its fellows whenever possible, sharing what it has found and keeping updated by them in turn. However, in any realistic scenario, distributed solutions to this problem are not trivial and need to be addressed from as many angles as possible. Distributed Consensus with Visual Perception in Multi-Robot Systems covers a variety of related topics, such as distributed consensus algorithms, data association and robustness problems, convergence speed, and cooperative mapping. The book first puts forward algorithmic solutions to these problems and then supports them with empirical validations working with real images. It provides the reader with a deeper understanding of the problems associated with the perception of the world by a team of cooperating robots with onboard cameras. Academic researchers and graduate students working with multi-robot systems, or investigating problems of distributed control, computer vision, and cooperative perception, will find this book of assistance in their studies.
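The distributed consensus algorithms covered by the book can be illustrated by their simplest instance, linear average consensus, in which each robot repeatedly nudges its estimate toward those of its communication neighbours. A minimal sketch (the scalar estimates and the step size eps are illustrative, not the book's algorithms):

```python
import numpy as np

def consensus_step(estimates, adjacency, eps=0.2):
    """One synchronous round of linear average consensus over a robot network."""
    new = estimates.copy()
    for i in range(len(estimates)):
        for j in range(len(estimates)):
            if adjacency[i][j]:
                # Move robot i toward neighbour j's current estimate
                new[i] += eps * (estimates[j] - estimates[i])
    return new
```

On a connected graph with a small enough eps, repeated rounds drive all estimates to the network-wide average, which is the basic mechanism behind consensus-based cooperative mapping.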
This book proposes a complete pipeline for monocular (single-camera) 3D mapping of terrestrial and underwater environments. The aim is to provide a solution to large-scale scene modeling that is both accurate and efficient. To this end, we have developed a novel Structure from Motion algorithm that increases mapping accuracy by registering camera views directly with the maps. The camera registration uses a dual approach that adapts to the type of environment being mapped. In order to further increase the accuracy of the resulting maps, a new method is presented that allows detection of images corresponding to the same scene region (crossovers). Crossovers are then used in conjunction with global alignment methods to greatly reduce estimation errors, especially when mapping large areas. Our method is based on the visual Bag of Words (BoW) paradigm, offering a simpler and more efficient solution by eliminating the training stage generally required by state-of-the-art BoW algorithms. Also, towards developing methods for efficient mapping of large areas (especially with the costs of map storage, transmission, and rendering in mind), an online 3D model simplification algorithm is proposed. This new algorithm has the advantage of selecting only those vertices that are geometrically representative of the scene.
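Crossover detection with a BoW representation typically reduces to comparing visual-word histograms between images, with a high score flagging a candidate revisit of the same scene region. A common score is cosine similarity; the sketch below is a generic illustration, not the book's training-free variant:

```python
import numpy as np

def bow_similarity(hist_a, hist_b):
    """Cosine similarity between two visual-word histograms.
    Values near 1 flag candidate crossovers (same scene region)."""
    a = hist_a / np.linalg.norm(hist_a)
    b = hist_b / np.linalg.norm(hist_b)
    return float(a @ b)
```

Candidate pairs scoring above a threshold would then be verified geometrically before being handed to the global alignment stage.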