This manuscript addresses obstacle avoidance for semi-autonomous and autonomous terrestrial platforms in dynamic, unknown environments. Based on monocular vision, it proposes a set of tools that continuously monitors the way forward, providing appropriate road information in real time. Exploiting the temporal coherence between consecutive frames, a new Dynamic Power Management methodology is proposed and applied to robotic visual perception, including a new environment-observer method that optimizes the energy consumed by the vision system. A remarkable characteristic of these methodologies is their independence from both the image-acquisition system and the robot itself. The real-time perception system has been evaluated on several test benches and on real data obtained from two intelligent platforms. In semi-autonomous tasks, tests were conducted at speeds above 100 km/h; autonomous displacements were also carried out successfully.
Recovery of dense geometry and camera motion from a set of monocular images is a well-known problem that can be solved quite reliably in well-conditioned environments. Typical algorithms assume static lighting and the presence of sufficient scene texture. There are, however, many situations where these prerequisites are not met and common algorithms fail. One example is medical video-endoscopy, where surfaces exhibit little texture and lighting conditions change because the light source is mounted on the moving camera. We propose to address the problem with a purely intensity-based approach that also accounts for changes in lighting conditions. In this thesis, we investigate the applicability of sliding-window intensity-based bundle-adjustment methods to this problem.
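The core of a purely intensity-based formulation can be illustrated with a minimal sketch: a per-pixel photometric residual under an affine gain/bias lighting model, with the gain and bias recovered in closed form by least squares. The function names, the flattened 1-D intensity vectors, and the noise-free synthetic data are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def photometric_residual(i_ref, i_cur, gain, bias):
    """Per-pixel intensity residual with an affine lighting model.

    r = i_cur - (gain * i_ref + bias): the kind of residual a purely
    intensity-based bundle adjustment minimises when lighting varies.
    """
    return i_cur - (gain * i_ref + bias)

def fit_lighting(i_ref, i_cur):
    """Least-squares estimate of per-frame gain and bias.

    Solves min_{a,b} || i_cur - (a * i_ref + b) ||^2 in closed form.
    """
    A = np.stack([i_ref, np.ones_like(i_ref)], axis=1)
    (gain, bias), *_ = np.linalg.lstsq(A, i_cur, rcond=None)
    return gain, bias

# Synthetic example: the "current" frame is a brighter, offset copy
# of the reference frame (gain 1.5, bias 0.1).
rng = np.random.default_rng(0)
i_ref = rng.uniform(0.0, 1.0, size=100)   # reference intensities
i_cur = 1.5 * i_ref + 0.1                 # simulated lighting change
gain, bias = fit_lighting(i_ref, i_cur)
print(round(gain, 3), round(bias, 3))     # → 1.5 0.1
```

In a full pipeline this lighting compensation would be estimated jointly with poses and depths inside the sliding-window optimization, rather than in isolation as above.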
Networked 3D virtual environments allow multiple users to interact with each other over the Internet. Users can share some sense of telepresence by remotely animating an avatar that represents them. However, avatar control may be tedious and still render user gestures poorly. This work aims at animating a user's avatar from real-time 3D motion capture by monocular computer vision, thus allowing virtual telepresence to anyone using a personal computer with a webcam. The approach followed consists of registering a 3D articulated upper-body model to a video sequence. The first contribution of this work is a method of allocating computing iterations under a real-time constraint that achieves optimal robustness and accuracy. The major issue for robust 3D tracking from monocular images is the 3D/2D ambiguity that results from the lack of depth information. As a second contribution, this work enhances particle filtering for 3D/2D registration under limited computation constraints with a number of heuristics, whose contribution is demonstrated experimentally. A parameterization of each arm's pose based on its end-effector is proposed to better model uncertainty in the depth direction.
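The particle-filtering machinery referred to above can be sketched in miniature. The following toy filter tracks a single scalar pose parameter with the standard predict/weight/resample (SIR) cycle; the 1-D state, the noise levels, and the Gaussian observation model are illustrative assumptions, not the work's actual heuristics:

```python
import numpy as np

def particle_filter_step(particles, weights, observation,
                         obs_noise, proc_noise, rng):
    """One predict-weight-resample cycle of a SIR particle filter.

    `particles` are hypotheses for a scalar pose parameter (think of
    an arm-depth coordinate); the 1-D state is illustrative only.
    """
    # Predict: diffuse particles with process noise
    particles = particles + rng.normal(0.0, proc_noise, size=particles.shape)
    # Weight: Gaussian likelihood of the observation under each particle
    weights = np.exp(-0.5 * ((observation - particles) / obs_noise) ** 2)
    weights /= weights.sum()
    # Resample: redraw particles in proportion to their weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

rng = np.random.default_rng(1)
particles = rng.uniform(-5.0, 5.0, size=500)   # broad initial belief
weights = np.full(500, 1.0 / 500)
true_pose = 2.0
for _ in range(20):
    obs = true_pose + rng.normal(0.0, 0.2)     # noisy measurement
    particles, weights = particle_filter_step(
        particles, weights, obs, 0.2, 0.05, rng)
estimate = particles.mean()
print(round(estimate, 1))
```

The real registration problem is of course much higher-dimensional (a full articulated upper-body pose), which is precisely why the heuristics and the iteration-allocation scheme of the thesis matter under a computation budget.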
Vision-based inference for obstacle avoidance by mobile robots is one of the most active areas at the intersection of Computer Vision and Robotics. Computer Vision, and more specifically vision for intelligent machines, enables mobile robots to perceive the external world with 'wisdom'; vision-based obstacle avoidance has therefore become a major research area in Robotics. Estimating the motion path and predicting the motion behavior of a dynamic object with only a single camera (monocular vision) is a real challenge. This can be done by analyzing a sequence of image frames extracted from a live video stream, but the analysis must be extremely fast, since only a decision reached within a reasonably short response time can safeguard the robot and ensure the safety of others in its environment. We therefore postulate a fuzzy-mathematical model: an Artificial Intelligence approach that, compared with conventional mathematical modeling, offers significant gains in simplicity (reduced complexity) and efficiency (minimized computational overhead and resource consumption).
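To make the fuzzy-mathematical idea concrete, here is a minimal fuzzy-inference sketch mapping obstacle distance to a steering command with triangular membership functions and weighted-average defuzzification. The membership sets ("near", "far"), the rule outputs, and all thresholds are illustrative assumptions, not the abstract's actual rule base:

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b on support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def steer_command(obstacle_dist_m):
    """Weighted-average (centroid-style) defuzzification.

    Rules: "near" obstacle -> hard turn (1.0); "far" -> straight (0.0).
    """
    near = tri(obstacle_dist_m, 0.0, 0.5, 2.0)
    far = tri(obstacle_dist_m, 1.0, 3.0, 10.0)   # overlapping supports
    total = near + far
    return (near * 1.0 + far * 0.0) / total if total > 0 else 0.0

print(round(steer_command(0.5), 2))  # → 1.0  (obstacle very close: hard turn)
print(round(steer_command(3.0), 2))  # → 0.0  (obstacle far: go straight)
```

The appeal for real-time robotics is visible even in this toy: evaluating a handful of piecewise-linear memberships is far cheaper than solving a conventional dynamic model at frame rate.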
Scene interpretation is a fundamental task in both computer vision and robotics. We deal with two important aspects of scene interpretation: scene reconstruction and scene recognition. Scene reconstruction determines the 3D positions of world points and retrieves camera poses from images; it has applications such as virtual building editing, video augmentation, and planning and navigation in robotics. Among the approaches to modeling a scene, we adopt piecewise planar modeling because of its several advantages, and we propose a convex-optimization-based approach for piecewise planar reconstruction. Scene recognition in robotics, specifically terrain scene recognition, is one of the fundamental tasks of autonomous navigation; navigable terrains are examples of planar scenes. It has applications in domains such as advanced driver assistance systems and remote sensing. The literature uses various sensing modalities, such as ladars, lasers, accelerometers, stereo cameras, or combinations of them; we propose an algorithm based purely on a single monocular camera.
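A basic building block of piecewise planar modeling is fitting a plane to a cluster of 3D points. The sketch below does this by SVD (total least squares); it is a generic primitive, not the thesis's convex formulation, and the sample points are synthetic:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3D points via SVD.

    Returns (normal, d) such that normal . p + d = 0 for points p on
    the plane; the normal is the direction of least variance.
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                      # last right singular vector
    return normal, -normal @ centroid

# Points lying on the plane z = 2: the recovered normal should be
# ±[0, 0, 1] and the offset ∓2.
pts = np.array([[0, 0, 2], [1, 0, 2], [0, 1, 2], [1, 1, 2]], float)
normal, d = fit_plane(pts)
print(np.round(np.abs(normal), 3), round(abs(d), 3))
```

A piecewise planar reconstruction would segment the point set into such planes and optimize their parameters jointly, which is where the convex formulation of the thesis comes in.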
Developing on-board driver assistance systems (DAS) requires understanding various events involving the motion of vehicles in the vicinity of the host vehicle. Determining the position of other vehicles on the road provides key information for driver assistance systems, so robust and reliable vehicle detection and tracking are the basic steps of these systems. Since monocular vision systems are particularly attractive for their reduced cost and maintenance and for the high-fidelity information they give about the driving environment, the problem can be addressed with computer vision techniques. This work focuses on detecting and tracking vehicles viewed by a camera mounted inside a vehicle, in daylight conditions. The approach presented in the book uses vehicle shadow cues and vehicle edge information to obtain cost-effective and fast estimation. After extracting vehicles, the algorithm tracks them effectively using a Kalman-filter-based tracking algorithm. Several sequences from real traffic situations have been tested, obtaining highly accurate multiple-vehicle detections.
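The Kalman-filter tracking step can be sketched with a standard constant-velocity model on a single coordinate of a tracked vehicle. The state-transition and noise matrices below are generic textbook choices, not the book's tuned parameters:

```python
import numpy as np

# Constant-velocity Kalman filter on one coordinate of a tracked
# vehicle; dt, Q, and R are illustrative assumptions.
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition (pos, vel)
H = np.array([[1.0, 0.0]])              # the detector measures position only
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[0.25]])                  # measurement noise covariance

def kalman_step(x, P, z):
    """One predict/update cycle: returns the new state and covariance."""
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                        # innovation
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P

rng = np.random.default_rng(2)
x = np.array([0.0, 0.0])                 # initial (position, velocity)
P = np.eye(2)
true_pos, true_vel = 0.0, 1.0
for _ in range(50):
    true_pos += true_vel * dt
    z = np.array([true_pos + rng.normal(0.0, 0.5)])   # noisy detection
    x, P = kalman_step(x, P, z)
print(round(x[0], 1), round(x[1], 1))
```

In the full system, one such filter (in image or road coordinates) would be maintained per detected vehicle, with detections from the shadow/edge stage serving as the measurements `z`.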
Monocular vision is vision in which each eye is used separately. By using the eyes in this way, as opposed to binocular vision, the field of view is increased, while depth perception is limited. The eyes are usually positioned on opposite sides of the animal's head, giving it the ability to see two objects at once. The word monocular comes from the Greek root mono, for one, and the Latin root oculus, for eye. Most birds and lizards (except chameleons) have monocular vision; owls and other birds of prey are notable exceptions. Many prey animals also have monocular vision, which helps them spot predators.
High Quality Content by WIKIPEDIA articles! Stereopsis (from stereo meaning solidity, and opsis meaning vision or sight) is the process in visual perception leading to the sensation of depth from the two slightly different projections of the world onto the retinas of the two eyes. The differences in the two retinal images are called horizontal disparity, retinal disparity, or binocular disparity. The differences arise from the eyes' different positions in the head. Stereopsis is commonly referred to as depth perception. This is inaccurate, as depth perception relies on many more monocular cues than stereoptical ones, and individuals with only one functional eye still have full depth perception except in artificial cases (such as stereoscopic images) where only binocular cues are present.
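The geometry behind binocular disparity can be written down directly: under the standard pinhole-stereo relation, depth is Z = f·B/d, where f is the focal length in pixels, B the inter-ocular baseline, and d the horizontal disparity in pixels. A minimal sketch (the focal length and baseline values are illustrative assumptions):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulated depth from horizontal (binocular) disparity.

    Z = f * B / d: larger disparity means a closer point, which is
    why stereopsis only helps at relatively short range.
    """
    return focal_px * baseline_m / disparity_px

# A point with 10 px disparity, seen by eyes 6.5 cm apart through an
# (assumed) 700 px focal length, lies about 4.55 m away.
print(round(depth_from_disparity(700.0, 0.065, 10.0), 2))  # → 4.55
```

The inverse relationship also explains the text's caveat: for distant points the disparity shrinks toward zero, and monocular depth cues take over.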
The visual system is the part of the central nervous system which enables organisms to see. It interprets the information from visible light to build a representation of the world surrounding the body. The visual system accomplishes a number of complex tasks, including the reception of light and the formation of monocular representations, the construction of a binocular perception from a pair of two dimensional projections, the identification and categorization of visual objects, assessing distances to and between objects, and guiding body movements to visual objects. The psychological manifestation of visual information is known as visual perception.