2000, Pattern Analysis and …
Pose estimation is a key challenge across multiple disciplines, relying fundamentally on the relationship between 3D reference points and their corresponding 2D projections. Traditional methods, like the Gauss-Newton approach, are limited by their dependence on initial conditions and can struggle with convergence. This paper introduces a novel method for pose estimation that addresses these limitations by enabling fast, globally convergent solutions, demonstrating superior performance in terms of translation and rotation accuracy through extensive experimental comparisons against established methods.
IEEE Transactions on Cybernetics, 2014
This paper deals with pose estimation using an iterative scheme. We show that using adequate visual information, pose estimation can be performed iteratively with only three independent unknowns, which are the translation parameters. Specifically, an invariant to rotational motion is used to estimate the camera position. In addition, an adequate transformation is applied to the proposed invariant to decrease the non-linearities between the variations in image space and 3D space. Once the camera position is estimated, we show that the rotation can be estimated efficiently using two different direct methods. The proposed approach is compared against two other methods from the literature. The results show that using our method, pose tracking in image sequences and the convergence rate for randomly generated poses are improved.
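The decoupling idea above, estimating only the three translation parameters iteratively, can be illustrated with a toy sketch. This is not the paper's rotation invariant: it simply assumes the rotation is already known (identity here) and runs a Gauss-Newton iteration on the translation alone, minimizing pinhole reprojection error. All point values and the `project` model are hypothetical.

```python
# Toy illustration (not the paper's method): Gauss-Newton over the three
# translation unknowns only, with rotation assumed known (identity).
POINTS = [(0.0, 0.0, 5.0), (1.0, 0.0, 6.0), (0.0, 1.0, 7.0),
          (1.0, 1.0, 8.0), (-1.0, 0.5, 6.5)]

def project(p, t):
    # Pinhole projection of world point p seen from camera translated by t.
    x, y, z = p[0] - t[0], p[1] - t[1], p[2] - t[2]
    return (x / z, y / z)

def residuals(t, obs):
    r = []
    for p, (u, v) in zip(POINTS, obs):
        pu, pv = project(p, t)
        r += [pu - u, pv - v]
    return r

def solve3(A, b):
    # Cramer's rule for a 3x3 linear system.
    def det(M):
        return (M[0][0]*(M[1][1]*M[2][2] - M[1][2]*M[2][1])
              - M[0][1]*(M[1][0]*M[2][2] - M[1][2]*M[2][0])
              + M[0][2]*(M[1][0]*M[2][1] - M[1][1]*M[2][0]))
    D = det(A)
    xs = []
    for j in range(3):
        M = [row[:] for row in A]
        for i in range(3):
            M[i][j] = b[i]
        xs.append(det(M) / D)
    return xs

def gauss_newton(obs, t0, iters=20, h=1e-6):
    t = list(t0)
    for _ in range(iters):
        r = residuals(t, obs)
        # Numeric Jacobian (2N x 3) by forward differences.
        J = [[0.0] * 3 for _ in range(len(r))]
        for j in range(3):
            tp = list(t); tp[j] += h
            rp = residuals(tp, obs)
            for i in range(len(r)):
                J[i][j] = (rp[i] - r[i]) / h
        # Normal equations: (J^T J) d = -J^T r
        A = [[sum(J[i][a]*J[i][b] for i in range(len(r))) for b in range(3)]
             for a in range(3)]
        g = [sum(J[i][a]*r[i] for i in range(len(r))) for a in range(3)]
        d = solve3(A, [-x for x in g])
        t = [t[j] + d[j] for j in range(3)]
    return t

true_t = (0.3, -0.2, 0.5)
obs = [project(p, true_t) for p in POINTS]
est = gauss_newton(obs, (0.0, 0.0, 0.0))
```

With noiseless observations the iteration recovers the true translation; the point of the decoupled formulation is that only this small three-parameter problem needs to be solved iteratively.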
1993 (4th) International Conference on Computer Vision
In this paper we present a method for robustly and accurately estimating the rotation and translation between a camera and a 3-D object from point and line correspondences. First we devise an error function and second we show how to minimize this error function. The quadratic nature of this function is made possible by representing rotation and translation with a dual number quaternion. We provide a detailed account of the computational aspects of a trust-region optimization method. This method compares favourably with Newton's method, which has extensively been used to solve the problem at hand, and with Faugeras-Toscani's linear method [3] for calibrating a camera. Finally we present some experimental results which demonstrate the robustness of our method with respect to image noise and matching errors.
Pattern Recognition Letters, 2004
The goal of this paper is twofold: firstly, we propose a novel interpretation of collinearity in the process of camera pose estimation from given correspondences between a 3D model and its 2D projective image. In contrast with the existing interpretations of collinearity, the focus of expansion (FOE) theory is a special case of our novel interpretation, and besides the projection of the camera position on the image plane, every image point can become an FOE. Secondly, we propose a novel method, based on the collinearity equation, for camera pose estimation from given point correspondences between a 3D model and its projective image. A comparative study based on both synthetic data and real images has shown that the novel algorithm is promising.
Machine Vision for Inspection and Measurement, 1989
Solutions for four different pose estimation problems are presented. Closed-form least-squares solutions are given to the overconstrained 2D-2D and 3D-3D pose estimation problems. A globally convergent iterative technique is given for the 2D perspective projection-3D pose estimation problem. A simplified linear solution and a robust solution to the 2D perspective projection-2D perspective projection pose estimation problem are also given. Simulation experiments consisting of millions of trials with varying numbers of pairs of corresponding points and varying signal-to-noise ratio (SNR), with either Gaussian or uniform noise, provide data suggesting that accurate inference of rotation and translation from noisy data may require corresponding point data sets having hundreds of corresponding point pairs when the SNR is less than 40 dB. The experimental results also show that the robust technique can suppress the effect of blunder data that come from outliers or mismatched points.
Proceedings in applied mathematics & mechanics, 2015
Imagine that hundreds of video streams, taken by mobile phones during a rock concert, are uploaded to a server. One attractive application of such a dataset is to allow a user to create his own video with a deliberately chosen but virtual camera trajectory. In this paper we present algorithms for the main sub-tasks (spatial calibration, image interpolation) related to this problem. Calibration: Spatial calibration of individual video streams is one of the most basic tasks related to creating such a video. At its core, this requires estimating the pairwise relative geometry of images taken by different cameras. This is also known as the relative pose problem [1], and is fundamental to many computer vision algorithms. In practice, efficiency and robustness are of highest relevance for big data applications such as the ones addressed in the EU-FET_SME project SceneNet. In this paper, we present an improved algorithm that exploits additional data from inertial sensors, such as accelerometers, magnetometers or gyroscopes, which by now are available in most mobile phones. Experimental results on synthetic and real data demonstrate the accuracy and efficiency of our algorithm. Interpolation: Given the calibrated cameras, we present a second algorithm that generates novel synthetic images along a predefined camera trajectory. Each frame is produced from two "neighboring" video streams that are selected from the database. The interpolation algorithm is based on the point cloud reconstructed in the spatial calibration phase and iteratively projects triangular patches from the existing images into the new view. We present convincing images synthesized with the proposed algorithm.
2018
This paper presents a method for pose estimation of a rigid body using unit dual quaternions where pose measurements from point clouds are filtered with a multiplicative extended Kalman filter (MEKF). The point clouds come from a 3D camera fixed to the moving rigid body, and then consecutive point clouds are aligned with the Iterative Closest Point (ICP) algorithm to obtain pose measurements. The unit constraint of the dual quaternion is ensured in the filtering process with the Dual Quaternion MEKF (DQ-MEKF), where the measurement updates are performed using the dual quaternion product so that the result is a unit dual quaternion. In addition, we use the Cayley transform for the discrete time propagation of the DQ-MEKF estimate, eliminating the need for normalization and projection of the resulting unit dual quaternion. The ICP algorithm is initialized with the time propagated state of the filter to give faster and more accurate pose measurements. To further improve the accuracy of the initialization, angular velocity measurements from a gyroscope fixed to the camera are included in the filter. The proposed method has been tested in experiments using a Kinect v2 3D camera mounted rigidly on a KUKA KR6 robotic arm, while a customized ICP algorithm was successfully implemented on a Graphical Processing Unit (GPU) system.
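The multiplicative-update idea behind the DQ-MEKF can be sketched in a simplified form with ordinary (non-dual) quaternions: because the product of unit quaternions is itself a unit quaternion, updating the state by quaternion multiplication keeps the estimate on the unit sphere without an explicit normalization step. The helper functions below are illustrative, not the paper's filter.

```python
import math

def qmul(a, b):
    # Hamilton product of quaternions (w, x, y, z).
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def qnorm(q):
    return math.sqrt(sum(c*c for c in q))

def axis_angle(axis, angle):
    # Unit quaternion for a rotation of `angle` about `axis`.
    n = math.sqrt(sum(c*c for c in axis))
    s = math.sin(angle / 2)
    return (math.cos(angle / 2), *(s * c / n for c in axis))

q = axis_angle((0.0, 0.0, 1.0), 0.3)    # current attitude estimate
dq = axis_angle((1.0, 2.0, 0.5), 0.01)  # small multiplicative correction
q_new = qmul(q, dq)                     # still unit, up to rounding error
```

The DQ-MEKF applies the same principle to unit dual quaternions, which additionally encode translation, so that both the rotational and translational parts of the pose stay on the constraint manifold through the measurement updates.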
2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013
This paper deals with pose estimation using an iterative scheme. We show that using adequate visual information, pose estimation can be performed in a decoupled way. More precisely, we show that pose estimation can be achieved iteratively with only three independent unknowns, which are the translation parameters. Specifically, an invariant to rotational motion is used to estimate the camera position. Once the camera position is estimated, we show that the rotation can be estimated efficiently using a direct method. The proposed approach is compared against two other methods from the literature. The results show that using our method, pose tracking in image sequences and the convergence rate for randomly generated poses are improved.
Real-Time Imaging, 1999
The problem of object pose from 2D to 3D correspondences has received a lot of attention in both the photogrammetry and computer vision literature. Various approaches to the object pose (or external camera parameters) problem fall into two distinct categories: closed-form solutions and non-linear solutions. Closed-form solutions may be applied only to a limited number of correspondences. Whenever the number of correspondences is larger than four, closed-form solutions are not efficient and iterative non-linear solutions are necessary. The latter approaches have two drawbacks: (i) they need a good initial estimate of the true solution, and (ii) they are time-consuming. Therefore, such approaches cannot be used in tasks that require high-speed performance (visual servoing, object tracking, …). To our knowledge, the method proposed by DeMenthon and Davis [9] is among the first attempts to use linear techniques, associated with the weak perspective camera model, in order to obtain the pose that is associated with the perspective camera model. The method starts by computing the object pose using a weak perspective model and after a few iterations converges towards a pose estimated under perspective.
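The weak perspective approximation at the heart of the DeMenthon-Davis scheme can be shown in a few lines. Under weak perspective, every point is projected with a common reference depth Z0 (here the centroid depth) rather than its own depth; the approximation error shrinks as depth variation becomes small relative to Z0, which is why the iteration converges quickly for objects whose extent is small compared to their distance. The point coordinates and focal length below are hypothetical.

```python
# Weak perspective vs. true perspective projection (illustrative numbers).
f = 1.0
points = [(0.2, -0.1, 4.5), (-0.3, 0.4, 5.5), (0.1, 0.2, 5.0)]

Z0 = sum(p[2] for p in points) / len(points)   # common reference depth

perspective = [(f * x / z, f * y / z) for (x, y, z) in points]
weak = [(f * x / Z0, f * y / Z0) for (x, y, z) in points]

# Per-point approximation error: zero where z == Z0, growing with |z - Z0|.
errors = [abs(u1 - u2) + abs(v1 - v2)
          for (u1, v1), (u2, v2) in zip(perspective, weak)]
```

The iterative scheme exploits exactly this: it solves a linear pose problem under the weak perspective model, then updates the per-point depths and repeats until the linear solution agrees with the perspective model.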
Journal of the Optical Society of America A, 1988
Finding the relationship between two coordinate systems using pairs of measurements of the coordinates of a number of points in both systems is a classic photogrammetric task. It finds applications in stereophotogrammetry and in robotics. We present here a closed-form solution to the least-squares problem for three or more points. Currently, various empirical, graphical and numerical iterative methods are in use. Derivation of a closed-form solution can be simplified by using unit quaternions to represent rotation, as was shown in an earlier paper [1]. Since orthonormal matrices are more widely used to represent rotation, we now present a solution using 3 × 3 matrices. Our method requires the computation of the square-root of a symmetric matrix. We compare the new result with an alternate method where orthonormality is not directly enforced. In this other method a best fit linear transformation is found and then the nearest orthonormal matrix chosen for the rotation. We note that the best translational offset is the difference between the centroid of the coordinates in one system and the rotated and scaled centroid of the coordinates in the other system. The best scale is equal to the ratio of the root-mean-square deviations of the coordinates in the two systems from their respective centroids. These exact results are to be preferred to approximate methods based on measurements of a few selected points.
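A planar (2D) analogue of this closed-form absolute-orientation solution fits in a short sketch: in the plane, rotation and scale combine into a single complex multiplier, and, as in the 3D case described above, the best translation is the difference between one centroid and the rotated, scaled other centroid, while the best scale is a ratio of RMS deviations from the centroids. The example data are hypothetical.

```python
import math

def align_2d(src, dst):
    """Closed-form least-squares fit of s*R*src + t ≈ dst for 2D point lists."""
    n = len(src)
    a = [complex(x, y) for x, y in src]
    b = [complex(x, y) for x, y in dst]
    ca = sum(a) / n                       # centroid of source points
    cb = sum(b) / n                       # centroid of destination points
    a0 = [p - ca for p in a]
    b0 = [q - cb for q in b]
    # Optimal rotation: phase of the cross-correlation of centered points.
    m = sum(q * p.conjugate() for p, q in zip(a0, b0))
    rot = m / abs(m)                      # unit complex number = rotation
    # Scale: ratio of RMS deviations from the centroids.
    scale = math.sqrt(sum(abs(q)**2 for q in b0) / sum(abs(p)**2 for p in a0))
    # Translation: centroid difference after rotating and scaling.
    t = cb - scale * rot * ca
    return scale, rot, t

# Hypothetical example: rotate by 90 degrees, scale by 2, shift by (3, 4).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
R = complex(0.0, 1.0)                     # 90-degree rotation
dst = []
for x, y in src:
    q = 2 * R * complex(x, y) + complex(3.0, 4.0)
    dst.append((q.real, q.imag))

scale, rot, t = align_2d(src, dst)
```

With noiseless data the recovered scale, rotation and translation match the ones used to generate `dst` exactly; the full 3D solution replaces the complex multiplier with an orthonormal rotation matrix.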
2014 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), 2014
In this paper we study pose estimation for general non-central cameras, using planar targets. The method proposed uses non-minimal data. Using the homography matrix to represent the transformation between the world and camera coordinate systems, we describe a non-iterative algorithm for pose estimation. To improve the accuracy of the solutions, data-set normalization is used. In addition, we propose a parameter optimization to refine the pose estimate. We evaluate the proposed solutions against the state-of-the-art method (for general targets) in terms of both robustness to noise and computation time. From the experiments, we show that the proposed method plus normalization is more accurate against noise and less sensitive to variations of the imaging device. We also show that the numerical results obtained with this method improve with the increasing number of data points. In terms of processing speed, the versions of the algorithm presented are significantly faster than the state-of-the-art algorithm. To further evaluate our method, we performed an experiment of a simple augmented reality application in which we show that our method can be easily applied.
2010
The paper proposes a 3D reconstruction technique suitable for robot vision applications, based on the correspondence of image features established at a training step and the same image features extracted at the execution step. The result is a Euclidean transform to be used by the robot head for reorientation at the execution step. A closed-form solution is proposed for the (R, t) transform to initialize the non-linear optimization procedure. For applications where the object to be processed does not rotate around the X or Y axis, this can be the final solution.
1993
The authors present a method for robustly and accurately estimating the rotation and translation between a camera and a 3-D object from point and line correspondences. First they devise an error function and then show how to minimize this error function. The quadratic nature of this function is made possible by representing rotation and translation with a dual number quaternion. A detailed account is provided of the computational aspects of a trust-region optimization method. This method compares favourably with Newton's method, which has extensively been used to solve the problem, and with Faugeras-Toscani's linear method (1986) for calibrating a camera. Some experimental results are presented which demonstrate the robustness of the method with respect to image noise and matching errors.
Image Analysis and Processing, 1997
The problem of real-time pose estimation between a 3D scene and a camera is a fundamental task in most 3D computer vision and robotics applications such as object tracking, visual servoing, and virtual reality. In this paper we present a fast method for estimating the 3D pose using 2D to 3D point and line correspondences. This method is inspired by DeMenthon's method (1995), which consists of determining the pose from point correspondences. In that method the pose is iteratively improved with a weak perspective camera model; at convergence the computed pose corresponds to the perspective camera model. Our method is based on the iterative use of a paraperspective camera model, which is a first-order approximation of perspective. Experiments involving synthetic data as well as real range data indicate the feasibility and robustness of this method.
2013 IEEE International Conference on Computer Vision, 2013
We present a linear method for global camera pose registration from pairwise relative poses encoded in essential matrices. Our method minimizes an approximate geometric error to enforce the triangular relationship in camera triplets. This formulation does not suffer from the typical 'unbalanced scale' problem in linear methods relying on pairwise translation direction constraints, i.e. an algebraic error; nor from the system degeneracy caused by collinear motion. In the case of three cameras, our method provides a good linear approximation of the trifocal tensor. It can be directly scaled up to register multiple cameras. The results obtained are accurate for point triangulation and can serve as a good initialization for final bundle adjustment. We evaluate the algorithm performance with different types of data and demonstrate its effectiveness. Our system achieves good accuracy and robustness, and outperforms some well-known systems in efficiency.
This paper introduces two novel solutions to the generalized-camera exterior orientation problem, which has a vast number of potential applications in robotics: (i) a minimal solution requiring only three point correspondences, and (ii) gPnP, an efficient, non-iterative n-point solution with linear complexity in the number of points. Existing minimal solutions require exhaustive algebraic derivations. In contrast, our novel minimal solution is solved in a straightforward manner using the Gröbner basis method. Existing n-point solutions are mostly based on iterative optimization schemes. Our n-point solution is non-iterative and outperforms existing algorithms in terms of computational efficiency. We present an evaluation against state-of-the-art single-camera algorithms and a comparison of different multi-camera setups, demonstrating the superior noise resilience achieved when using multi-camera configurations and the efficiency of our algorithms. As a further contribution, we illustrate a possible robotic use-case of our non-perspective orientation computation algorithms by presenting visual odometry results on real data with a non-overlapping multi-camera configuration, including a comparison to a loosely coupled alternative.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Estimation of camera pose from an image of n points or lines with known correspondence is a thoroughly studied problem in computer vision. Most solutions are iterative and depend on nonlinear optimization of some geometric constraint, either on the world coordinates or on the projections to the image plane. For real-time applications, we are interested in linear or closed-form solutions free of initialization. We present a general framework which allows for a novel set of linear solutions to the pose estimation problem for both n points and n lines. We then analyze the sensitivity of our solutions to image noise and show that the sensitivity analysis can be used as a conservative predictor of error for our algorithms. We present a number of simulations which compare our results to two other recent linear algorithms, as well as to iterative approaches. We conclude with tests on real imagery in an augmented reality setup.
Keywords: Perspective-n-point problem (PnP); absolute position and orientation; camera pose estimation; vision-based navigation; computer vision
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010
The vast majority of methods that successfully recover 3D structure from 2D images hinge on a preliminary identification of corresponding feature points. When the images capture close views, e.g., in a video sequence, corresponding points can be found by using local pattern matching methods. However, to better constrain the 3D inference problem, the views must be far apart, leading to challenging point matching problems. In the recent past, researchers have dealt with the combinatorial explosion that arises when searching among the N! possible ways of matching N points. In this paper we avoid this search by making use of prior knowledge that is available in many situations: the orientation of the camera. This knowledge enables us to derive O(N^2) algorithms to compute point correspondences. We prove that our approach computes the correct solution when dealing with noiseless data, and derive a heuristic that is robust to measurement noise and to uncertainty in the prior knowledge. Although we model the camera using orthography, our experiments illustrate that our method is able to deal with violations of this model, including the perspective effects of general real images.