Autonomous Mobile Robots



Autonomous Mobile Robots Sensing, Control, Decision Making and Applications



CONTROL ENGINEERING
A Series of Reference Books and Textbooks

Editor
FRANK L. LEWIS, PH.D.
Professor, Applied Control Engineering
University of Manchester Institute of Science and Technology
Manchester, United Kingdom

1. Nonlinear Control of Electric Machinery, Darren M. Dawson, Jun Hu, and Timothy C. Burg
2. Computational Intelligence in Control Engineering, Robert E. King
3. Quantitative Feedback Theory: Fundamentals and Applications, Constantine H. Houpis and Steven J. Rasmussen
4. Self-Learning Control of Finite Markov Chains, A. S. Poznyak, K. Najim, and E. Gómez-Ramírez
5. Robust Control and Filtering for Time-Delay Systems, Magdi S. Mahmoud
6. Classical Feedback Control: With MATLAB®, Boris J. Lurie and Paul J. Enright
7. Optimal Control of Singularly Perturbed Linear Systems and Applications: High-Accuracy Techniques, Zoran Gajić and Myo-Taeg Lim
8. Engineering System Dynamics: A Unified Graph-Centered Approach, Forbes T. Brown
9. Advanced Process Identification and Control, Enso Ikonen and Kaddour Najim
10. Modern Control Engineering, P. N. Paraskevopoulos
11. Sliding Mode Control in Engineering, edited by Wilfrid Perruquetti and Jean-Pierre Barbot
12. Actuator Saturation Control, edited by Vikram Kapila and Karolos M. Grigoriadis
13. Nonlinear Control Systems, Zoran Vukić, Ljubomir Kuljača, Dali Donlagič, and Sejid Tesnjak
14. Linear Control System Analysis & Design: Fifth Edition, John D’Azzo, Constantine H. Houpis and Stuart Sheldon
15. Robot Manipulator Control: Theory & Practice, Second Edition, Frank L. Lewis, Darren M. Dawson, and Chaouki Abdallah
16. Robust Control System Design: Advanced State Space Techniques, Second Edition, Chia-Chi Tsui
17. Differentially Flat Systems, Hebertt Sira-Ramirez and Sunil Kumar Agrawal


18. Chaos in Automatic Control, edited by Wilfrid Perruquetti and Jean-Pierre Barbot
19. Fuzzy Controller Design: Theory and Applications, Zdenko Kovacic and Stjepan Bogdan
20. Quantitative Feedback Theory: Fundamentals and Applications, Second Edition, Constantine H. Houpis, Steven J. Rasmussen, and Mario Garcia-Sanz
21. Neural Network Control of Nonlinear Discrete-Time Systems, Jagannathan Sarangapani
22. Autonomous Mobile Robots: Sensing, Control, Decision Making and Applications, edited by Shuzhi Sam Ge and Frank L. Lewis



Autonomous Mobile Robots Sensing, Control, Decision Making and Applications

Shuzhi Sam Ge The National University of Singapore

Frank L. Lewis Automation and Robotics Research Institute The University of Texas at Arlington

Boca Raton London New York

CRC is an imprint of the Taylor & Francis Group, an informa business


MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

Published in 2006 by CRC Press Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 © 2006 by Taylor & Francis Group, LLC CRC Press is an imprint of Taylor & Francis Group No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1 International Standard Book Number-10: 0-8493-3748-8 (Hardcover) International Standard Book Number-13: 978-0-8493-3748-2 (Hardcover) This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data Catalog record is available from the Library of Congress

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Taylor & Francis Group is the Academic Division of Informa plc.


Preface

The creation of a truly autonomous and intelligent system — one that can sense, learn from, and interact with its environment, one that can integrate seamlessly into the day-to-day lives of humans — has ever been the motivating factor behind the huge body of work on artificial intelligence, control theory and robotics, autonomous (land, sea, and air) vehicles, and numerous other disciplines. The technology involved is highly complex and multidisciplinary, posing immense challenges for researchers at both the module and system integration levels. Despite the innumerable hurdles, the research community has, as a whole, made great progress in recent years. This is evidenced by technological leaps and innovations in the areas of sensing and sensor fusion, modeling and control, map building and path planning, artificial intelligence and decision making, and system architecture design, spurred on by advances in related areas of communications, machine processing, networking, and information technology.

Autonomous systems are gradually becoming a part of our way of life, whether we consciously perceive it or not. The increased use of intelligent robotic systems in current indoor and outdoor applications bears testimony to the efforts made by researchers on all fronts. Mobile systems have greater autonomy than before, and new applications abound — ranging from factory transport systems, airport transport systems, road/vehicular systems, to military applications, automated patrol systems, homeland security surveillance, and rescue operations. While most conventional autonomous systems are self-contained in the sense that all their sensors, actuators, and computers are on board, it is envisioned that more and more will evolve to become open networked systems with distributed processing power, sensors (e.g., GPS, cameras, microphones, and landmarks), and actuators.

It is generally agreed that an autonomous system consists primarily of the following four distinct yet interconnected modules:

(i) Sensors and Sensor Fusion
(ii) Modeling and Control
(iii) Map Building and Path Planning
(iv) Decision Making and Autonomy

These modules are integrated and influenced by the system architecture design for different applications.


This edited book tries for the first time to provide a comprehensive treatment of autonomous mobile systems, ranging from related fundamental technical issues to practical system integration and applications. The chapters are written by some of the leading researchers and practitioners working in this field today. Readers will be presented with a complete picture of autonomous mobile systems at the systems level, and will also gain a better understanding of the technological and theoretical aspects involved within each module that composes the overall system. Five distinct parts of the book, each consisting of several chapters, emphasize the different aspects of autonomous mobile systems, starting from sensors and control, and gradually moving up the cognitive ladder to planning and decision making, finally ending with the integration of the four modules in application case studies of autonomous systems.

The first part of the book is dedicated to sensors and sensor fusion. The four chapters treat in detail the operation and uses of various sensors that are crucial for the operation of autonomous systems. Sensors provide robots with the capability to perceive the world, and effective utilization is of utmost importance. The chapters also consider various state-of-the-art techniques for the fusion and utilization of various sensing information for feature detection and position estimation. Vision sensors, RADAR, GPS and INS, and landmarks are discussed in detail in Chapters 1 to 4 respectively.

Modeling and control issues concerning nonholonomic systems are discussed in the second part of the book. Real-world systems seldom present themselves in the form amenable to analysis as holonomic systems, and the importance of nonholonomic modeling and control is evident. The four chapters of this part, Chapters 5 to 8, thus present novel contributions to the control of these highly complicated systems, focusing on discontinuous control, unified neural fuzzy control, adaptive control with actuator dynamics, and the control of car-like vehicles for vehicle tracking maneuvers, respectively.

The third part of the book covers the map building and path planning aspects of autonomous systems. This builds on technologies in sensing and control to further improve the intelligence and autonomy of mobile robots. Chapter 9 discusses the specifics of building an accurate map of the environment, using either single or multiple robots, with which localization and motion planning can take place. Probabilistic motion planning as a robust and efficient planning scheme is examined in Chapter 10. Action coordination and formation control of multiple robots are investigated in Chapter 11.

Decision making and autonomy, the highest levels in the hierarchy of abstraction, are examined in detail in the fourth part of the book. The three chapters in this part treat in detail the issues of representing knowledge, high level planning, and coordination mechanisms that together define the cognitive capabilities of autonomous systems. These issues are crucial for the development of intelligent mobile systems that are able to reason and manipulate available information. Specifically, Chapters 12 to 14 present topics pertaining to knowledge representation and decision making, algorithms for planning under uncertainties, and the behavior-based coordination of multiple robots.

In the final part of the book, we present a collection of chapters that deal with the system integration and engineering aspects of large-scale autonomous systems. These are usually considered as necessary steps in making new technologies operational and are relatively neglected in the academic community. However, there is no doubt that system integration plays a vital role in the successful development and deployment of autonomous mobile systems. Chapters 15 and 16 examine the issues involved in the design of autonomous commercial robots and automotive systems, respectively. Chapter 17 presents a hierarchical system architecture that encompasses and links the various (higher and lower level) components to form an intelligent, complex system.

We sincerely hope that this book will provide the reader with a cohesive picture of the diverse, yet intimately related, issues involved in bringing about truly intelligent autonomous robots. Although the treatment of the topics is by no means exhaustive, we hope to give the readers a broad-enough view of the various aspects involved in the development of autonomous systems. The authors have, however, provided a splendid list of references at the end of each chapter, and interested readers are encouraged to refer to these references for more information. This book represents the amalgamation of the truly excellent work and effort of all the contributing authors, and could not have come to fruition without their contributions. Finally, we are also immensely grateful to Marsha Pronin, Michael Slaughter, and all others at CRC Press (Taylor & Francis Group) for their efforts in making this project a success.


Editors

Shuzhi Sam Ge, IEEE Fellow, is a full professor with the Electrical and Computer Engineering Department at the National University of Singapore. He earned the B.Sc. degree from the Beijing University of Aeronautics and Astronautics (BUAA) in 1986, and the Ph.D. degree and the Diploma of Imperial College (DIC) from the Imperial College of Science, Technology and Medicine in 1993. His current research interests are in the control of nonlinear systems, hybrid systems, neural/fuzzy systems, robotics, sensor fusion, and real-time implementation. He has authored and co-authored over 200 international journal and conference papers, 3 monographs and co-invented 3 patents. He was the recipient of a number of prestigious research awards, and has been serving as the editor and associate editor of a number of flagship international journals. He is also serving as a technical consultant for the local industry.

Frank L. Lewis, IEEE Fellow, PE Texas, is a distinguished scholar professor and Moncrief-O’Donnell chair at the University of Texas at Arlington. He earned the B.Sc. degree in physics and electrical engineering and the M.S.E.E. at Rice University, the M.S. in Aeronautical Engineering from the University of West Florida, and the Ph.D. at the Georgia Institute of Technology. He works in feedback control and intelligent systems. He is the author of 4 U.S. patents, 160 journal papers, 240 conference papers, and 9 books. He received the Fulbright Research Award, the NSF Research Initiation Grant, and the ASEE Terman Award. He was selected as Engineer of the Year in 1994 by the Fort Worth IEEE Section and is listed in the Fort Worth Business Press Top 200 Leaders in Manufacturing. He was appointed to the NAE Committee on Space Station in 1995. He is an elected guest consulting professor at both Shanghai Jiao Tong University and South China University of Technology.


Contributors

Martin Adams, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
James S. Albus, National Institute of Standards and Technology, Gaithersburg, Maryland
Alessandro Astolfi, Electrical and Electronics Engineering Department, Imperial College London, London, UK
Stephen Balakirsky, Intelligent Systems Division, National Institute of Standards and Technology, Gaithersburg, Maryland
Anthony Barbera, National Institute of Standards and Technology, Gaithersburg, Maryland
José A. Castellanos, Instituto de Investigación en Ingeniería de Aragón, Universidad de Zaragoza, Zaragoza, Spain
Luiz Chaimowicz, Computer Science Department, Federal University of Minas Gerais, Brazil

Jingrong Cheng, Department of Electrical Engineering, University of California, Riverside, California
Peng Cheng, Department of Computer Science, University of Illinois, Urbana-Champaign, Illinois
Sesh Commuri, School of Electrical & Computer Engineering, University of Oklahoma, Norman, Oklahoma
Jay A. Farrell, Department of Electrical Engineering, University of California, Riverside, California
Rafael Fierro, MARHES Laboratory, School of Electrical & Computer Engineering, Oklahoma State University, Norman, Oklahoma
Shuzhi Sam Ge, Department of Electrical and Computer Engineering, National University of Singapore, Singapore
Héctor H. González-Baños, Honda Research Institute USA, Inc., Mountain View, California


Fan Hong, Department of Electrical and Computer Engineering, National University of Singapore, Singapore
Tong Heng Lee, Department of Electrical and Computer Engineering, National University of Singapore, Singapore
David Hsu, Department of Computer Science, National University of Singapore, Singapore
Frank L. Lewis, Automation and Robotics Research Institute, University of Texas, Arlington, Texas
Huosheng Hu, Department of Computer Science, University of Essex, Colchester, UK
Chris Jones, Computer Science Department, University of Southern California, Los Angeles, California
Ebi Jose, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Vijay Kumar, Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, Pennsylvania
Yu Lu, Department of Electrical Engineering, University of California, Riverside, California
Maja J. Matarić, Computer Science Department, University of Southern California, Los Angeles, California
Elena Messina, Intelligent Systems Division, National Institute of Standards and Technology, Gaithersburg, Maryland
Mario E. Munich, Evolution Robotics Inc., Pasadena, California
Jean-Claude Latombe, Department of Computer Science, Stanford University, Palo Alto, California
José Neira, Instituto de Investigación en Ingeniería de Aragón, Universidad de Zaragoza, Zaragoza, Spain
Steven M. LaValle, Department of Computer Science, University of Illinois, Urbana-Champaign, Illinois
Jason M. O’Kane, Department of Computer Science, University of Illinois, Urbana-Champaign, Illinois


James P. Ostrowski, Evolution Robotics Inc., Pasadena, California
Michel R. Parent, IMARA Group, INRIA-Rocquencourt, Le Chesnay, France
Stéphane R. Petti, Aisin AW Europe, Braine-L’Alleud, Belgium
Minhtuan Pham, School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore
Paolo Pirjanian, Evolution Robotics Inc., Pasadena, California
Julian Ryde, Department of Computer Science, University of Essex, Colchester, UK
Andrew Shacklock, Singapore Institute of Manufacturing Technology, Singapore
Juan D. Tardós, Instituto de Investigación en Ingeniería de Aragón, Universidad de Zaragoza, Zaragoza, Spain
Elmer R. Thomas, Department of Electrical Engineering, University of California, Riverside, California
Benjamín Tovar, Department of Computer Science, University of Illinois, Urbana-Champaign, Illinois
Danwei Wang, School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore
Han Wang, School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore

Jiali Shen, Department of Computer Science, University of Essex, Colchester, UK
Zhuping Wang, Department of Electrical and Computer Engineering, National University of Singapore, Singapore
Chun-Yi Su, Department of Mechanical Engineering, Concordia University, Montreal, Quebec, Canada
Jian Xu, Singapore Institute of Manufacturing Technology, Singapore


Abstract

As technology advances, it has been envisioned that in the very near future, robotic systems will become part and parcel of our everyday lives. Even at the current stage of development, semi-autonomous or fully automated robots are already indispensable in a staggering number of applications. To bring forth a generation of truly autonomous and intelligent robotic systems that will meld effortlessly into human society involves research and development on several levels, from robot perception, to control, to abstract reasoning.

This book tries for the first time to provide a comprehensive treatment of autonomous mobile systems, ranging from fundamental technical issues to practical system integration and applications. The chapters are written by some of the leading researchers and practitioners working in this field today. Readers will be presented with a coherent picture of autonomous mobile systems at the systems level, and will also gain a better understanding of the technological and theoretical aspects involved within each module that composes the overall system. Five distinct parts of the book, each consisting of several chapters, emphasize the different aspects of autonomous mobile systems, starting from sensors and control, and gradually moving up the cognitive ladder to planning and decision making, finally ending with the integration of the four modules in application case studies of autonomous systems.

This book is primarily intended for researchers, engineers, and graduate students involved in all aspects of autonomous mobile robot systems design and development. Undergraduate students may also find the book useful, as a complementary reading, in providing a general outlook of the various issues and levels involved in autonomous robotic system design.


Contents

I    Sensors and Sensor Fusion .......................................... 1

Chapter 1   Visual Guidance for Autonomous Vehicles: Capability and Challenges ...... 5
            Andrew Shacklock, Jian Xu, and Han Wang
Chapter 2   Millimeter Wave RADAR Power-Range Spectra Interpretation for Multiple Feature Detection ...... 41
            Martin Adams and Ebi Jose
Chapter 3   Data Fusion via Kalman Filter: GPS and INS ...... 99
            Jingrong Cheng, Yu Lu, Elmer R. Thomas, and Jay A. Farrell
Chapter 4   Landmarks and Triangulation in Navigation ...... 149
            Huosheng Hu, Julian Ryde, and Jiali Shen

II   Modeling and Control .............................................. 187

Chapter 5   Stabilization of Nonholonomic Systems ...... 191
            Alessandro Astolfi
Chapter 6   Adaptive Neural-Fuzzy Control of Nonholonomic Mobile Robots ...... 229
            Fan Hong, Shuzhi Sam Ge, Frank L. Lewis, and Tong Heng Lee
Chapter 7   Adaptive Control of Mobile Robots Including Actuator Dynamics ...... 267
            Zhuping Wang, Chun-Yi Su, and Shuzhi Sam Ge
Chapter 8   Unified Control Design for Autonomous Car-Like Vehicle Tracking Maneuvers ...... 295
            Danwei Wang and Minhtuan Pham

III  Map Building and Path Planning .................................... 331

Chapter 9   Map Building and SLAM Algorithms ...... 335
            José A. Castellanos, José Neira, and Juan D. Tardós
Chapter 10  Motion Planning: Recent Developments ...... 373
            Héctor H. González-Baños, David Hsu, and Jean-Claude Latombe
Chapter 11  Multi-Robot Cooperation ...... 417
            Rafael Fierro, Luiz Chaimowicz, and Vijay Kumar

IV   Decision Making and Autonomy ...................................... 461

Chapter 12  Knowledge Representation and Decision Making for Mobile Robots ...... 465
            Elena Messina and Stephen Balakirsky
Chapter 13  Algorithms for Planning under Uncertainty in Prediction and Sensing ...... 501
            Jason M. O’Kane, Benjamín Tovar, Peng Cheng, and Steven M. LaValle
Chapter 14  Behavior-Based Coordination in Multi-Robot Systems ...... 549
            Chris Jones and Maja J. Matarić

V    System Integration and Applications ............................... 571

Chapter 15  Integration for Complex Consumer Robotic Systems: Case Studies and Analysis ...... 573
            Mario E. Munich, James P. Ostrowski, and Paolo Pirjanian


Chapter 16  Automotive Systems/Robotic Vehicles ...... 613
            Michel R. Parent and Stéphane R. Petti
Chapter 17  Intelligent Systems ...... 655
            Sesh Commuri, James S. Albus, and Anthony Barbera


I Sensors and Sensor Fusion

Mobile robots participate in meaningful and intelligent interactions with other entities — inanimate objects, human users, or other robots — through sensing and perception. Sensing capabilities are tightly linked to the ability to perceive, without which sensor data will only be a collection of meaningless figures. Sensors are crucial to the operation of autonomous mobile robots in unknown and dynamic environments where it is impossible to have complete a priori information that can be given to the robots before operation.

In biological systems, visual sensing offers a rich source of information to individuals, which in turn use such information for navigation, deliberation, and planning. The same may be said of autonomous mobile robotic systems, where vision has become a standard sensory tool on robots. This is especially so with the advancement of image processing techniques, which facilitates the extraction of even more useful information from images captured from mounted still or moving cameras. The first chapter of this part, therefore, focuses on the use of visual sensors for guidance and navigation of unmanned vehicles. This chapter starts with an analysis of the various requirements that the use of unmanned vehicles poses to the visual guidance equipment. This is followed by an analysis of the characteristics and limitations of visual perception hardware, providing readers with an understanding of the physical constraints that must be considered in the design of guidance systems. Various techniques currently in use for road and vehicle following, and for obstacle detection are then reviewed. With the wealth of information afforded by various visual sensors, sensor fusion techniques play an important role in exploiting the available information to further improve the perceptual capabilities of systems. This issue is discussed, with examples on the fusion of image data with LADAR information. The chapter concludes with a discussion on the open problems and challenges in the area of visual perception.

Where visual sensing is insufficient, other sensors serve as additional sources of information, and are equally important in improving the navigational and perceptual capabilities of autonomous robots. The use of millimeter wave RADAR for performing feature detection and navigation is treated in detail in the second chapter of this part. Millimeter wave RADAR is capable of providing high-fidelity range information when vision sensors fail under poor visibility conditions, and is, therefore, a useful tool for robots to use in perceiving their environment. The chapter first deals with the analysis and characterization of noise affecting the measurements of millimeter wave RADAR. A method is then proposed for the accurate prediction of range spectra. This is followed by the description of a robust algorithm, based on target presence probability, to improve feature detection in highly cluttered environments.

Aside from providing robots with a view of the environment they are immersed in, certain sensors also give robots the ability to analyze and evaluate their own state, namely, their position. Augmentation of such information with that garnered from environmental perception further provides robots with a clearer picture of the condition of their environment and their own role within it. While visual perception may be used for localization, the use of internal and external sensors, like the Inertial Navigation System (INS) and the Global Positioning System (GPS), allows refinement of estimated values. The third chapter of this part treats, in detail, the use of both INS and GPS for position estimation. This chapter first provides a comprehensive review of the Extended Kalman Filter (EKF), as well as the basics of GPS and INS. Detailed treatment of the use of the EKF in fusing measurements from GPS and INS is then provided, followed by a discussion of various approaches that have been proposed for the fusion of GPS and INS.

In addition to internal and external explicit measurements, landmarks in the environment may also be utilized by the robots to get a sense of where they are. This may be done through triangulation techniques, which are described in the final chapter of this part. Recognition of landmarks may be performed by the visual sensors, and localization is achieved through the association of landmarks with those in internal maps, thereby providing position estimates. The chapter provides descriptions and experimental results of several different techniques for landmark-based position estimation. Different landmarks are used, ranging from laser beacons to visually distinct landmarks, to moveable landmarks mounted on robots for multi-robot localization.

This part of the book aims to provide readers with an understanding of the theoretical and practical issues involved in the use of sensors, and the important role sensors play in determining (and limiting) the degree of autonomy mobile robots possess. These sensors allow robots to obtain a basic set of observations upon which controllers and higher level decision-making mechanisms can act, thus forming an indispensable link in the chain of modules that together constitutes an intelligent, autonomous robotic system.


1  Visual Guidance for Autonomous Vehicles: Capability and Challenges

Andrew Shacklock, Jian Xu, and Han Wang

CONTENTS
1.1  Introduction ........................................................ 6
     1.1.1  Context ...................................................... 6
     1.1.2  Classes of UGV ............................................... 7
1.2  Visual Sensing Technology .......................................... 8
     1.2.1  Visual Sensors ............................................... 8
            1.2.1.1  Passive imaging ..................................... 9
            1.2.1.2  Active sensors ...................................... 10
     1.2.2  Modeling of Image Formation and Calibration ................. 12
            1.2.2.1  The ideal pinhole model ............................. 12
            1.2.2.2  Calibration ......................................... 13
1.3  Visual Guidance Systems ............................................ 15
     1.3.1  Architecture ................................................. 15
     1.3.2  World Model Representation ................................... 15
     1.3.3  Physical Limitations ......................................... 17
     1.3.4  Road and Vehicle Following ................................... 19
            1.3.4.1  State-of-the-art .................................... 19
            1.3.4.2  A road camera model ................................. 21
     1.3.5  Obstacle Detection ........................................... 23
            1.3.5.1  Obstacle detection using range data ................. 23
            1.3.5.2  Stereo vision ....................................... 24
            1.3.5.3  Application examples ................................ 26
     1.3.6  Sensor Fusion ................................................ 28
1.4  Challenges and Solutions ........................................... 33
     1.4.1  Terrain Classification ....................................... 33
     1.4.2  Localization and 3D Model Building from Vision ............... 34
1.5  Conclusion ......................................................... 36
Acknowledgments ......................................................... 37
References .............................................................. 37
Biographies ............................................................. 40


1.1 INTRODUCTION

1.1.1 Context

Current efforts in the research and development of visual guidance technology for autonomous vehicles fit into two major categories: unmanned ground vehicles (UGVs) and intelligent transport systems (ITSs). UGVs are primarily concerned with off-road navigation and terrain mapping whereas ITS (or automated highway systems) research is a much broader area concerned with safer and more efficient transport in structured or urban settings. The focus of this chapter is on visual guidance, and therefore we will not dwell on the definitions of autonomous vehicles other than to examine how they set the following roles of vision systems:

• Detection and following of a road
• Detection of obstacles
• Detection and tracking of other vehicles
• Detection and identification of landmarks

These four tasks are relevant to both UGV and ITS applications, although the environments are quite different. Our experience is in the development and testing of UGVs and so we concentrate on these specific problems in this chapter. We refer to achievements in structured settings, such as road-following, as the underlying principles are similar, and also because they are a good starting point when facing complexity of autonomy in open terrain.

This introductory section continues with an examination of the expectations of UGVs as laid out by the Committee on Army Unmanned Ground Vehicle Technology in its 2002 road map [1]. Next, in Section 1.2, we give an overview of the key technologies for visual guidance: two-dimensional (2D) passive imaging and active scanning. The aim is to highlight the differences between various options with regard to our task-specific requirements. Section 1.3 constitutes the main content of this chapter; here we present a visual guidance system (VGS) and its modules for guidance and obstacle detection. Descriptions concentrate on pragmatic approaches adopted in light of the highly complex and uncertain tasks which stretch the physical limitations of sensory systems. Examples are given from stereo vision and image–ladar integration. The chapter ends by returning to the road map in Section 1.4 and examining the potential role of visual sensors in meeting the key challenges for autonomy in unstructured settings: terrain classification and localization/mapping.

1.1.2 Classes of UGV

The motivation or driving force behind UGV research is military application. This fact is made clear by examining the sources of funding behind prominent research projects. The DARPA Grand Challenge is an immediate example at hand [2]. An examination of military requirements is a good starting point, in an attempt to understand what a UGV is and how computer vision can play a part in it, because the requirements are well defined. Another reason is that, as we shall see, the scope and classification of UGVs from the U.S. military is still quite broad and, therefore, encompasses many of the issues related to autonomous vehicle technology. A third reason is that the requirements for survivability in hostile environments are explicit, and therefore developers are forced to face the toughest problems that will drive and test the efficacy of visual perception research. These set the much needed benchmarks against which we can assess performance and identify the most pressing problems.

The definitions of various UGVs and reviews of state-of-the-art are available in the aforementioned road map [1]. This document is a valuable source for anyone involved in autonomous vehicle research and development because the future requirements and capability gaps are clearly set out. The report categorizes four classes of vehicles with increasing autonomy and perception requirements:

Teleoperated Ground Vehicle (TGV). Sensors enable an operator to visualize location and movement. No machine cognition is needed, but experience has shown that remote driving is a difficult task and augmentation of views with some of the functionality of automatic vision would help the operator. Fong [3] is a good source for the reader interested in vehicle teleoperation and collaborative control.

Semi-Autonomous Preceder–Follower (SAP/F). These devices are envisaged for logistics and equipment carrying. They require advanced navigation capability to minimize operator interaction, for example, the ability to select a traversable path in A-to-B mobility.

Platform-Centric AGV (PC-AGV). This is a system that has the autonomy to complete a task. In addition to simple mobility, the system must include extra terrain reasoning for survivability and self-defense.

Network-Centric AGV (NC-AGV). This refers to systems that operate as nodes in tactical warfare. Their perception needs are similar to that of PC-AGVs but with better cognition so that, for example, potential attackers can be distinguished.


TABLE 1.1
Classes of UGV

Class                    kph   Capability gaps                              Perception tasks                               TRL 6
Searcher (TGV)           --    All-weather sensors                          Not applicable                                 2006
Donkey (SAP/F)           40    Localization and mapping algorithms          Detect static obstacles, traversable paths     2009
Wingman (PC-AGV)         100   Long-range sensors and sensors for           Terrain assessment to detect potential cover   2015
                               classifying vegetation
Hunter-killer (NC-AGV)   120   Multiple sensors and fusion                  Identification of enemy forces,                2025
                                                                             situation awareness


The road map identifies perception as the priority area for development and defines increasing levels of “technology readiness.” Some of the requirements and capability gaps for the four classes are summarized and presented in Table 1.1. Technology readiness level 6 (TRL 6) is defined as the point when a technology component has been demonstrated in a relevant environment. These roles range from the rather dumb donkey-type device used to carry equipment to autonomous lethal systems making tactical decisions in open country. It must be remembered, as exemplified in the inaugural Grand Challenge, that the technology readiness levels of most research is a long way from meeting the most simple of these requirements. The Challenge is equivalent to a simple A-to-B mobility task for the SAP/F class of UGVs. On a more positive note, the complexity of the Grand Challenge should not be understated, and many past research programs, such as Demo III, have demonstrated impressive capability. Such challenges, with clearly defined objectives, are essential for making progress as they bring critical problems to the fore and provide a common benchmark for evaluating technology.

1.2 VISUAL SENSING TECHNOLOGY

1.2.1 Visual Sensors

We first distinguish between passive and active sensor systems: A passive sensor system relies upon ambient radiation, whereas an active sensor system illuminates the scene with radiation (often laser beams) and determines how this is reflected by the surroundings. Active sensors offer a clear advantage in outdoor applications; they are less sensitive to changes in ambient conditions. However, some applications preclude their use; they can be detected by the enemy


in military scenarios, or there may be too many conflicting sources in a civilian setting. At this point we also highlight a distinction between the terms “active vision” and “active sensors.” Active vision refers to techniques in which (passive) cameras are moved so that they can fixate on particular features [4]. These have applications in robot localization, terrain mapping, and driving in cluttered environments.

1.2.1.1 Passive imaging

From the application and performance standpoint, our primary concern is procuring hardware that will acquire good quality data for input to guidance algorithms; so we now highlight some important considerations when specifying a camera for passive imaging in outdoor environments.

The image sensor (CCD or CMOS). CMOS technology offers certain advantages over the more familiar CCDs in that it allows direct access to individual blocks of pixels much as would be done in reading computer memory. This enables instantaneous viewing of regions of interest (ROI) without the integration time, clocking, and shift registers of standard CCD sensors. A key advantage of CMOS is that additional circuitry can be built into the silicon which leads to improved functionality and performance: direct digital output, reduced blooming, increased dynamic range, and so on. Dynamic range becomes important when viewing outdoor scenes with varying illumination: for example, mixed scenes of open ground and shadow.

Color or monochrome. Monochrome (B&W) cameras are widely used in lane-following systems but color systems are often needed in off-road (or country track) environments where there is poor contrast in detecting traversable terrain. Once we have captured a color image there are different methods of representing the RGB components: for example, the RGB values can be converted into hue, saturation, and intensity (HSI) [5]. The hue component of a surface is effectively invariant to illumination levels which can be important when segmenting images with areas of shadow [6,7].

Infrared (IR). Figure 1.1 shows some views from our semi-urban scene test circuit captured with an IR camera. The hot road surface is quite distinct as are metallic features such as manhole covers and lampposts. Trees similarly contrast well against the sky but in open country after rainfall, different types of vegetation and ground surfaces exhibit poor contrast. The camera works on a different transducer principle from the photosensors in CCD or CMOS chips. Radiation from hot bodies is projected onto elements in an array that heat up, and this temperature change is converted into an electrical signal. At present, compared to visible light cameras, the resolution is reduced (e.g., 320 × 240 pixels) and the response is naturally slower. There are other problems to contend with, such as calibration and drift of the sensor. IR cameras are expensive


FIGURE 1.1 A selection of images captured with an IR camera. The temperature of surfaces gives an alternative and complementary method of scene classification compared to standard imaging. Note the severe lens distortion.

and there are restrictions on their purchase. However, it is now possible to install commercial night-vision systems on road vehicles: General Motors offers a thermal imaging system with head-up display (HUD) as an option on the Cadillac DeVille. The obvious application for IR cameras is in night driving but they are useful in daylight too, as they offer an alternative (or complementary) way of segmenting scenes based on temperature levels.

Catadioptric cameras. In recent years we have witnessed the increasing use of catadioptric cameras (catadioptric: combining reflection and refraction; that is, a mirror and lens). These devices, also referred to as omnidirectional, are able to view a complete hemisphere with the use of a parabolic mirror [8]. Practically, they work well in structured environments due to the way straight lines project to circles. Bosse [9] uses them indoors and outdoors and tracks the location of vanishing points in a structure from motion (SFM) scheme.

1.2.1.2 Active sensors

A brief glimpse through robotics conference proceedings is enough to demonstrate just how popular and useful laser scanning devices, such as the ubiquitous SICK, are in mobile robotics. These devices are known as LADAR and are


available in 2D and 3D versions but the principles are essentially similar: a laser beam is scanned within a certain region; if it reflects back to the sensor off an obstacle, the time-of-flight (TOF) is measured. 2D scanning. The majority of devices used on mobile robots scan (pan) through 180◦ in about 13 msec at an angular resolution of 1◦ . Higher resolution is obtained by slowing the scan, so at 0.25◦ resolution, the scan will take about 52 msec. The sensor thus measures both range and bearing {r, θ} of obstacles in the half plane in front of it. On a moving vehicle the device can be inclined at an angle to the direction of travel so that the plane sweeps out a volume as the vehicle moves. It is common to use two devices: one pointing ahead to detect obstacles at a distance (max. range ∼80 m); and one inclined downward to gather 3D data from the road, kerb, and nearby obstacles. Such devices are popular because they work in most conditions and the information is easy to process. The data is relatively sparse over a wide area and so is suitable for applications such as localization and mapping (Section 1.4.2). A complication, in off-road applications, is caused by pitching of the vehicle on rough terrain: this creates spurious data points as the sensor plane intersects the ground plane. Outdoor feature extraction is still regarded as a very difficult task with 2D ladar as the scan data does not have sufficient resolution, field-of-view (FOV), and data rates [10]. 3D scanning. To measure 3D data, the beam must be steered though an additional axis (tilt) to capture spherical coordinates {r, θ, φ: range, pan, tilt}. There are many variations on how this can be achieved as an optoelectromechanical system: rotating prisms, polygonal mirrors, or galvonometric scanners are common. Another consideration is the order of scan; one option is to scan vertically and after each scan to increment the pan angle to the next vertical column. As commercial 3D systems are very expensive, many researchers augment commercial 2D devices with an extra axis, either by deflecting the beam with an external mirror or by rotating the complete sensor housing [11]. It is clear that whatever be the scanning method, it will take a protracted length of time to acquire a dense 3D point cloud. High-resolution scans used in construction and surveying can take between 20 and 90 min to complete a single frame, compared to the 10 Hz required for a real-time navigation system [12]. There is an inevitable compromise to be made between resolution and frame rate with scanning devices. The next generation of ladars will incorporate flash technology, in which a complete frame is acquired simultaneously on a focal plane array (FPA). This requires that individual sensing elements on the array incorporate timing circuitry. The current limitation of FLASH/FPA is the number of pixels in the array, which means that the FOV is still small, but this can be improved by panning and tilting of the sensor between subframes, and then creating a composite image.
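As an illustration of how such planar scans are typically used, the sketch below converts a single scan of range-bearing pairs {r, θ} into Cartesian points and then tilts the scan plane downward by a fixed mounting angle, so that successive scans from a moving vehicle sweep out a strip of ground ahead. This is a minimal sketch only, not code from the system described in this chapter; the mounting angle, sensor height, resolution, and frame conventions are illustrative assumptions.

```python
import numpy as np

def scan_to_vehicle_frame(ranges_m, start_deg=-90.0, step_deg=1.0,
                          tilt_deg=15.0, height_m=1.5):
    """Convert one 2D LADAR scan {r, theta} into 3D points in the vehicle frame.

    Illustrative assumptions: the scanner pans through 180 deg in its own plane,
    is pitched down by `tilt_deg`, and is mounted `height_m` above the ground.
    """
    angles = np.deg2rad(start_deg + step_deg * np.arange(len(ranges_m)))
    # Points in the sensor plane: x along the central beam, y to the left.
    x_s = ranges_m * np.cos(angles)
    y_s = ranges_m * np.sin(angles)
    pts_sensor = np.vstack((x_s, y_s, np.zeros_like(x_s)))
    # Pitch the scan plane down about the vehicle y-axis, then raise it to sensor height.
    t = np.deg2rad(tilt_deg)
    R_pitch = np.array([[ np.cos(t), 0.0, np.sin(t)],
                        [ 0.0,       1.0, 0.0      ],
                        [-np.sin(t), 0.0, np.cos(t)]])
    pts_vehicle = R_pitch @ pts_sensor + np.array([[0.0], [0.0], [height_m]])
    return pts_vehicle.T  # one (x, y, z) row per beam

# Example: a synthetic scan in which every beam returns 20 m.
fake_scan = np.full(181, 20.0)          # 181 beams at 1 deg resolution
points = scan_to_vehicle_frame(fake_scan)
print(points.shape)                      # (181, 3)
```

Registering consecutive scans with the vehicle's own motion estimate then yields the 3D strip map described in the text.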


In summary, ladar offers considerable advantages over passive imaging but there remain many technical difficulties to be overcome before they can meet the tough requirements for vehicle guidance.

The advantages are:

• Unambiguous 3D measurement over wide FOV and distances
• Undiminished night-time performance and tolerance to adverse weather conditions

The limitations are:

• Relatively high cost, bulky, and heavy systems
• Limited spatial resolution and low frame rates
• Acquisition of phantom points or multiple points at edges or permeable surfaces
• Active systems may be unacceptable in certain applications

The important characteristics to consider, when selecting a ladar for a guidance application, are: angular resolution, range accuracy, frame rate, and cost. An excellent review of ladar technology and next generation requirements is provided by Stone at NIST [12].
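To make the resolution trade-off concrete, the lateral spacing between neighbouring returns grows roughly linearly with range (spacing ≈ r·Δθ with Δθ in radians). The short sketch below, using the angular resolutions quoted earlier for typical 2D devices, is only a back-of-the-envelope illustration of why small or distant obstacles can be missed.

```python
import math

def beam_spacing(range_m, resolution_deg):
    """Approximate lateral distance between adjacent ladar returns at a given range."""
    return range_m * math.radians(resolution_deg)

for res in (1.0, 0.25):
    for r in (10.0, 40.0, 80.0):
        print(f"{res:4.2f} deg at {r:5.1f} m -> {beam_spacing(r, res):.2f} m between beams")
# At 80 m a 1 deg scan leaves about 1.4 m between returns, and a 0.25 deg scan about 0.35 m,
# so a person-sized obstacle may be struck by only a handful of beams per scan.
```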

1.2.2 Modeling of Image Formation and Calibration

1.2.2.1 The ideal pinhole model

It is worthwhile to introduce the concept of projection and geometry and some notation as this is used extensively in visual sensing techniques such as stereo and structure from motion. Detail is kept to a minimum and the reader is referred to standard texts on computer vision for more information [13–15]. The standard pinhole camera model is adopted, while keeping in mind the underlying assumptions and that it is an ideal model. A point in 3D space {X̃ ∈ R3} projects to a point on the 2D image plane {x̃ ∈ R2} according to the following equation:

x = PX,   P ∈ R3×4      (1.1)

This equation is linear because we use homogeneous coordinates by augmenting the position vectors with a scalar (X = [X̃T 1]T ∈ R4) and likewise the image point (x = [x y w]T ∈ R3: x̃ = x/w). A powerful and more natural way of treating image formation is to consider the ray model as an example of projective space. P is the projection matrix and encodes the position of the camera and its intrinsic parameters. We can rewrite (1.1) as:

x = K[R  T]X̃,   K ∈ R3×3,  R ∈ SO(3),  T ∈ R3      (1.2)

Internal (or intrinsic) parameters. These are contained in the calibration matrix K, which can be parameterized by: focal length (f ), aspect ratio (α), skew (s), and the location of the offset of the principal point in the image {u0 , v0 }. 

        [ f    s    u0 ]
    K = [ 0    αf   v0 ]      (1.3)
        [ 0    0    1  ]

External (or extrinsic) parameters. These are the orientation and position of the camera with respect to the reference system: R and T in Equation 1.2. 1.2.2.2 Calibration We can satisfy many vision tasks working with image coordinates alone and a projective representation of the scene. If we want to use our cameras as measurement devices, or if we want to incorporate realistic dynamics in motion models, or to fuse data in a common coordinate system, we need to upgrade from a projective to Euclidean space: that is, calibrate and determine the parameters. Another important reason for calibration is that the wide-angle lenses, commonly used in vehicle guidance, are subject to marked lens distortion (see Figure 1.1); without correction, this violates the assumptions of the ideal pinhole model. A radial distortion factor is calculated from the coefficients {ki } and the radial distance (r) of a pixel from the center {xp , yp }. δ(r) = 1 + k1 r 2 + k2 r 4 : r = ((˜xd − xp )2 + (˜yd − yp )2 )0.5
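As a worked illustration of Equation 1.1 through Equation 1.3, the sketch below builds K from assumed intrinsic values, forms P = K[R T], and projects a homogeneous 3D point into pixel coordinates. All of the numbers (focal length, principal point, camera pose) are invented for the example; they are not parameters of the system described in this chapter.

```python
import numpy as np

# Assumed intrinsic parameters (pixels): focal length, aspect ratio, skew, principal point.
f, alpha, s, u0, v0 = 800.0, 1.0, 0.0, 320.0, 240.0
K = np.array([[f,   s,         u0],
              [0.0, alpha * f, v0],
              [0.0, 0.0,       1.0]])

# Assumed extrinsic parameters: a small rotation and translation standing in for the mounting.
t = np.deg2rad(10.0)
R = np.array([[1.0, 0.0,        0.0       ],
              [0.0, np.cos(t), -np.sin(t)],
              [0.0, np.sin(t),  np.cos(t)]])
T = np.array([[0.0], [1.5], [0.0]])

P = K @ np.hstack((R, T))             # 3x4 projection matrix, as in Equation 1.2

X = np.array([2.0, 0.0, 20.0, 1.0])   # homogeneous world point: 20 m ahead, 2 m to the side
x = P @ X                             # homogeneous image point [x, y, w]
u, v = x[0] / x[2], x[1] / x[2]       # divide by w to obtain pixel coordinates
print(round(u, 1), round(v, 1))
```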

(1.4)

The undistorted coordinates are then {˜x = (˜xd − xp )δ + xp , y˜ = (˜yd − yp )δ + yp }

(1.5)

Camera calibration is needed in a very diverse range of applications and so there is wealth of reference material available [16,17]. For our purposes, we distinguish between two types or stages of calibration: linear and nonlinear. 1. Linear techniques use a least-squares type method (e.g., SVD) to compute a transformation matrix between 3D points and their 2D projections. Since the linear techniques do not include any lens distortion model, they are quick and simple to calculate.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c001” — 2006/3/31 — 16:42 — page 13 — #13

14

Autonomous Mobile Robots

2. Nonlinear optimization techniques account for lens distortion in the camera model through iterative minimization of a determined function. The minimizing function is usually the distance between the image points and modeled projections. In guidance applications, it is common to adopt a two-step technique: use a linear optimization to compute some of the parameters and, as a second step, use nonlinear iteration to refine, and compute the rest. Since the result from the linear optimization is used for the nonlinear iteration, the iteration number is reduced and the convergence of the optimization is guaranteed [18–20]. Salvi [17] showed that two-step techniques yield the best result in terms of calibration accuracy. Calibration should not be a daunting prospect because many software tools are freely available [21,22]. Much of the literature originated from photogrammetry where the requirements are much higher than those in autonomous navigation. It must be remembered that the effects of some parameters, such as image skew or the deviation of the principal point, are insignificant in comparison to other uncertainties and image noise in field robotics applications. Generally speaking, lens distortion modeling using a radial model is sufficient to guarantee high accuracy, while more complicated models may not offer much improvement. A pragmatic approach is to carry out much of the calibration off-line in a controlled setting and to fix (or constrain) certain parameters. During use, only a limited set of the camera parameters need be adjusted in a calibration routine. Caution must be employed when calibrating systems in situ because the information from the calibration routine must be sufficient for the degrees of freedom of the model. If not, some parameters will be confounded or wander in response to noise and, later, will give unpredictable results. A common problem encountered in field applications is attempting a complete calibration off essentially planar data without sufficient and general motion of the camera between images. An in situ calibration adjustment was adopted for the calibration of the IR camera used to take the images of Figure 1.1. The lens distortion effects were severe but were suitably approximated and corrected by a two-coefficient radial distortion model, in which the coefficients (k1 , k2 ) were measured off-line. The skew was set to zero; the principal point and aspect ratio were fixed in the calibration matrix. The focal length varied with focus adjustment but a default value (focused at infinity) was measured. Of the extrinsic parameters, only the tilt of the camera was an unknown in its application: the other five were set by the rigid mounting fixtures. Once mounted on the vehicle, the tilt was estimated from the image of the horizon. This gave an estimate of the camera calibration which was then improved given extra data. For example, four known points are sufficient to calculate the homographic mapping from ground plane to the image. However, a customized calibration routine was used that enforced the

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c001” — 2006/3/31 — 16:42 — page 14 — #14

Visual Guidance for Autonomous Vehicles

15

constraints and the physical degrees of freedom of the camera, yet was stable enough to work from data on the ground plane alone. As a final note on calibration: any routine should also provide quantified estimates of the uncertainty of the parameters determined.
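As an illustration of this pragmatic strategy, the following sketch (ours, not the chapter's routine) constrains OpenCV's calibrateCamera to a two-coefficient radial model with fixed principal point and aspectt ratio and zero tangential distortion (OpenCV's model has no skew term); the initial focal length is an assumed off-line value.

```python
# Sketch of constrained in-situ calibration: fix the parameters that are
# insignificant in field robotics and refine only a reduced set.
import numpy as np
import cv2

def constrained_calibration(object_points, image_points, image_size):
    """object_points/image_points: lists of per-view 3D-2D correspondences."""
    f0 = 800.0  # assumed default focal length in pixels (measured off-line)
    K0 = np.array([[f0, 0.0, image_size[0] / 2],
                   [0.0, f0, image_size[1] / 2],
                   [0.0, 0.0, 1.0]])
    flags = (cv2.CALIB_USE_INTRINSIC_GUESS |
             cv2.CALIB_FIX_PRINCIPAL_POINT |   # principal point fixed at centre
             cv2.CALIB_FIX_ASPECT_RATIO |      # square pixels
             cv2.CALIB_ZERO_TANGENT_DIST |     # radial-only distortion model
             cv2.CALIB_FIX_K3)                 # two-coefficient model (k1, k2)
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        object_points, image_points, image_size, K0, None, flags=flags)
    # The RMS reprojection error serves as a crude uncertainty indicator.
    return rms, K, dist.ravel()[:2]
```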

1.3 VISUAL GUIDANCE SYSTEMS

1.3.1 Architecture

The modules of a working visual guidance system (VGS) are presented in Figure 1.2. So far, we have described the key sensors and sensor models. Before delving into task-specific processes, we need to clarify the role of the VGS within the autonomous vehicle system architecture. Essentially, its role is to capture raw sensory data and convert it into model representations of the environment and the vehicle's state relative to it.

1.3.2 World Model Representation

A world model is a hierarchical representation that combines a variety of sensed inputs and a priori information [23]. The resolution and scope at each level are designed to minimize computational resource requirements and to support planning functions for that level of the control hierarchy. The sensory processing system that populates the world model fuses inputs from multiple sensors and extracts feature information, such as terrain elevation, cover, road edges, and obstacles. Feature information from digital maps, such as road networks, elevation, and hydrology, can also be incorporated into this rich world model. The various features are maintained in different layers that are registered together to provide maximum flexibility in the generation of vehicle plans depending on mission requirements. The world model includes occupancy grids and symbolic object representations at each level of the hierarchy. Information at different hierarchical levels has different spatial and temporal resolution. The details of a world model are as follows:

Low-resolution obstacle map and elevation map. The obstacle map consists of a 2D array of cells [24]. Each cell of the map represents one of the following situations: traversable, obstacle (positive or negative), undefined (such as blind spots), potential hazard, and so forth. In addition, high-level terrain classification results can also be incorporated in the map (long grass or small bushes, steps, and slopes). The elevation map contains averaged terrain heights.

Mid-resolution terrain feature map. The features used are of two types: smooth regions and sharp discontinuities [25].

A priori information. This includes multiple resolution satellite maps and other known information about the terrain.

FIGURE 1.2 Architecture of the VGS. (Block diagram: sensors — a color camera, stereo camera, and laser range finder — feed modeling and calibration (color calibration, stereo calibration, vehicle-to-world coordinates); task-specific processing (road, lane, vehicle, and terrain models; color segmentation, landmark detection, target tracking, terrain classification; obstacle detection, 3D target tracking, terrain analysis); sensor fusion (obstacle map fusion, terrain cluster fusion, road map and obstacle map fusion); and world mapping (obstacle map, elevation map, road map, lead vehicle orientation and speed, feature map).)

Model update mechanism. As the vehicle moves, new sensed data can either replace the historical data, or a map-updating algorithm can be activated; a sketch of both policies follows. We will see real examples of occupancy grids in Section 1.3.5.3 and Section 1.3.6 (Figure 1.8 and Figure 1.9).
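A minimal sketch (ours, not the authors') of the two cell-update policies just mentioned; the grid dimensions match the 75 × 75, 0.2 m obstacle map used later in Section 1.3.5.3, and the blending weight is an assumption.

```python
# Two update policies for an occupancy-grid cell: hard replacement versus a
# simple recursive (forgetting-factor) update that retains history.
import numpy as np

grid = np.zeros((75, 75))         # e.g. 15 m x 15 m at 0.2 m per cell

def replace_update(grid, i, j, new_value):
    grid[i, j] = new_value        # the newest observation wins

def recursive_update(grid, i, j, new_value, alpha=0.3):
    # alpha is an assumed blending weight; older data decays geometrically.
    grid[i, j] = (1 - alpha) * grid[i, j] + alpha * new_value
```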

1.3.3 Physical Limitations

We now examine the performance criteria for visual perception hardware with regard to the classes of UGVs. Before we even consider algorithms, the physical realities of the sensing tasks are quite daunting. The implications must be understood, and we will demonstrate them with a simple analysis.

A wide FOV is desirable so that there is a view of the road in front of the vehicle at close range. The combination of lens focal length (f) and image sensor dimensions (H, V) determines the FOV and resolution. For example, a 1/2 in. sensor has image dimensions (H = 6.4 mm, V = 4.8 mm). The angle of view (horizontally) is approximated by

$$\theta_H = 2\arctan\left(\frac{H}{2f}\right) \qquad (1.6)$$

and it is easily calculated that a focal length of 5 mm equates to an angle of view of approximately 65° with a sensor of this size. It is also useful to quote a value for the angular resolution, for example, the number of pixels per degree. With an output of 640 × 480 pixels, the resolution for this example is approximately 10 pixels per degree (or 1.75 mrad/pixel); a short script reproducing this arithmetic follows Table 1.2.

Now consider the scenario of a UGV progressing along a straight, flat road while having to avoid obstacles of width 0.5 m or greater. We calculate the pixel size of the obstacle, at various distances ahead, for a wide FOV and a narrow FOV, and also calculate the time it will take the vehicle to reach the obstacle. This is summarized in Table 1.2.

TABLE 1.2
Comparison of Obstacle Image Size for Two Fields-of-View and Various Distances to the Object

Distance, d (m) | Obstacle size (pixel)  | Time to cover distance (sec)
                | FOV 60°  | FOV 10°     | 120 kph | 60 kph | 20 kph
8               | 35       | 113         | 0.24    | 0.48   | 1.44
20              | 14       | 45          | 0.6     | 1.2    | 3.6
50              | 5.6      | 18          | 1.5     | 3      | 9
300             | 0.9      | 3           | 9       | 18     | 54
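The arithmetic behind Equation (1.6) and Table 1.2 is easy to reproduce; the following sketch is ours, and small differences from the printed values reflect unstated sensor assumptions.

```python
# Reproduce the field-of-view and lookahead bookkeeping of Table 1.2.
import math

def fov_deg(sensor_width_mm, focal_mm):
    """Horizontal angle of view, Equation (1.6)."""
    return 2 * math.degrees(math.atan(sensor_width_mm / (2 * focal_mm)))

def obstacle_pixels(obstacle_m, distance_m, fov_degrees, image_width_px=640):
    """Approximate image width of an obstacle at a given distance."""
    angle = math.degrees(math.atan(obstacle_m / distance_m))
    return angle * image_width_px / fov_degrees

def time_to_reach(distance_m, speed_kph):
    return distance_m / (speed_kph / 3.6)

theta = fov_deg(6.4, 5.0)             # ~65 degrees for a 1/2 in. sensor, f = 5 mm
print(theta, 640 / theta)             # angle of view and pixels per degree (~10)
print(obstacle_pixels(0.5, 50, 60))   # ~6 px: a 0.5 m obstacle at 50 m, wide FOV
print(time_to_reach(50, 60))          # ~3 sec warning at 60 kph
```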

FIGURE 1.3 (Diagram: sensor at height h viewing a trench of width w at distance d.) The ability of a sensor to image a negative obstacle is affected by the sensor's height, resolution, and the size of the obstacle. It is very difficult to detect holes until the vehicle is within 10 m.

It can be observed from Table 1.2 that:

1. The higher the driving speed, the further the camera lookahead distance should be to give sufficient time for evasive action. For example, if the system computation time is 0.2 sec and the mechanical latency is 0.5 sec, a rough guideline is that at least 50 m warning is required when driving at 60 kph.
2. At longer lookahead distances, there are fewer obstacle pixels in the image; we would like to see at least ten pixels to be confident of detecting the obstacle. A narrower FOV is required so that the obstacle can be seen.

A more difficult problem is posed by the concept of a negative obstacle: a hole, trench, or water hazard. It is clear from simple geometry and Figure 1.3 that detection of trenches from imaging or range sensing is difficult. A trench is detected as a discontinuity in range data or the disparity map. In effect, we only view the projection of a small section of the rear wall of the trench: that is, the zone bounded by the rays incident with the forward and rear edges. We conclude from Table 1.3 that, with a typical camera mounting height of 2.5 m, a trench of width 1 m will not be reliably detected at a distance of 15 m, assuming a minimum of 10 pixels is required for negative obstacle detection. This distance is barely enough for a vehicle to drive safely at 20 kph. The situation is improved by raising the camera; at a height of 4 m, the ditch will be detected at a distance of 15 m. Alternatively, we can select a narrow FOV lens. For example, a stereo vision system with FOV (15° H × 10° V) is able to cover a width of 13 m at a distance of 25 m and possibly detect a ditch {w = 1 m, h = 4 m} by viewing 8 pixels of the ditch. The geometry behind Table 1.3 is reproduced in the sketch after the table.


TABLE 1.3
Influence of Camera Height on Visibility of Negative Obstacles

Visibility of negative obstacle (pixels), trench width w = 1 m

Distance, d (m) | Camera height h = 2.5 m | Camera height h = 4 m
8               | 21 (0.31 m)             | 35 (0.5 m)
15              | 6.8 (0.17 m)            | 11 (0.27 m)
25              | 2.5 (0.1 m)             | 4 (0.16 m)

There are several options for improving the chances of detecting an obstacle:

Raising the camera. This is not always an option for practical and operational reasons; for example, it makes the vehicle easier to detect by the enemy.

Increasing focal length. This has a direct effect but is offset by problems with exaggerated image motion and blurring. This becomes an important consideration when moving over rough terrain.

Increased resolution. Higher-resolution sensors are available, but they will not help if a sharp image cannot be formed by the optics, or if there is image blur.

The trade-off between resolution and FOV can be avoided (at extra cost and complexity) by having multiple sensors. Figure 1.4 illustrates the different fields-of-view and ranges of the sensors on the VGS. Dickmanns [26,27] uses a mixed focal system comprising two wide-angle cameras with divergent axes, giving a wide FOV (100°). A high-resolution three-chip color camera with greater focal length is placed between the other cameras for detecting objects at distance. The overlapping region of the cameras' views gives a region of trinocular stereo.

1.3.4 Road and Vehicle Following

1.3.4.1 State-of-the-art

Extensive work was carried out on road-following systems in the late 1980s and throughout the 1990s; for example, within the PROMETHEUS Programme, which ran from 1987 until 1994. Dickmanns [28] provides a comprehensive review of the development of machine vision for road vehicles. One of the key tasks is lane detection, in which road markings are used to monitor the position of the vehicle relative to the road: either for driver assistance/warning or for autonomous lateral control. Lane detection is therefore a relatively mature technology; a number of impressive demonstrations have taken place [29], and some systems have achieved commercial realization, such as Autovue and AssistWare. There are, therefore, numerous sources of reference where the reader can find details of image processing algorithms and practical implementation. Good places to start are the PATH project archives at UCLA, the final report of the Chauffeur II programme [30], or the work of Broggi on the Argo project [29].

FIGURE 1.4 (Diagram: nested fields-of-view of stereo imaging, 3D ladar, 2D ladar, and mm radar, with ranges of approximately 20 m, 80 m, and 200 m.) Different subsystems of the VGS provide coverage over different fields-of-view and range. There is a compromise between FOV and angular resolution. The rectangle extending to 20 m is the occupancy grid on which several sensory outputs are fused.


The Chauffeur II demonstration features large trucks driving in convoy on a highway. The lead vehicle is driven manually, and other trucks equipped with the system can join the convoy and enter an automatic mode. The system incorporates lane tracking (lateral control) and maintaining a safe distance to the vehicle in front (longitudinal control). This is known as a "virtual tow-bar" or "platooning." The Chauffeur II demonstration is highly structured in the sense that it was implemented on specific truck models and featured inter-vehicle communication. Active IR patterns are placed on the rear of the vehicles to aid detection, and radar is also used.

The PATH demonstration (UCLA, USA) used stereo vision and ladar. The vision system tracks features on a car in front and estimates the range of an arbitrary car from passive stereo disparity. The ladar system provides assistance by guiding the search space for the vehicle in front and increasing the overall robustness of the vision system. This is a difficult stereo problem because the disparity of features on the rear of a car is small when viewed from a safe driving separation.

Recently, much of the research work in this area has concentrated on the problems of driving in urban or cluttered environments. Here, there are the complex problems of dealing with road junctions, traffic signs, and pedestrians.

1.3.4.2 A road camera model

Road- and lane-following algorithms depend on road models [29]. These models have to make assumptions such as: the surface is flat; road edges or markings are parallel; and the like. We will examine the camera road geometry because, with caution, it can be adapted and applied to less-structured problems. For simplicity and without loss of generality, we assume that the road lies in the plane Z = 0. From Equation 1.1, setting all Z coordinates of X to zero is equivalent to striking out the third column of the projection matrix P in Equation 1.2. There is a homographic correspondence between the points of the road plane and the points of the image plane, which can be represented by a 3 × 3 matrix transformation. This homography is part of the general linear group GL3 and as such inherits many useful properties of this group. The projection Equation 1.1 becomes

$$x = HX, \quad H \in \mathbb{R}^{3\times 3} \qquad (1.7)$$

As a member of the group, a transformation H must have an inverse,² so there will also be a one-to-one mapping of image points (lines) to points (lines) on the road plane. The elements of H are easily determined (calibration) by finding at least four point correspondences in general position on the planes.³

² The exception is when the road plane passes through the camera center, in which case H is singular and noninvertible (but in this case the road would project to a single image line and the viewpoint would not be of much use).


FIGURE 1.5 The imaging of a planar road surface is represented by a one-to-one invertible mapping. A rectangular search region projects to a trapezoidal search region in the image.

The homography can be expressed in any valid representation of the projective space: that is, we can change the basis to match the camera coordinate system. This means that the road does not have to be the plane Z = 0 but can be an arbitrary plane in 3D; the environment can be modeled as a set of discrete planes Π_i, each with a homography H_i that maps it to the image plane. In practice, we use the homography to project a search region onto the image; a rectangular search space on the road model becomes a trapezoid on the image (Figure 1.5). The image is segmented, within this region, into road and nonroad areas. The results are then projected onto the occupancy grid for fusion with other sensors. Care must be taken because 3D obstacles within the scene may become segmented in the image as driveable surfaces and, because they are "off the plane," their projections on the occupancy grid will be very misleading. Figure 1.6 illustrates this and some other important points regarding this use of vision and projections to and from the road surface. Much information within the scene is ignored; the occupancy grid will extend to about 20 m in front of the vehicle, but perspective effects such as vanishing points can tell us a lot about relative direction, or be used to anticipate events ahead. The figure also illustrates that, due to the strong perspective, the uncertainty on the occupancy grid will increase rapidly as the distance from the vehicle increases. (This is shown in the figure as the regularly spaced [2 m] lane markings on the road rapidly converge to a single pixel in the image.) Both of these considerations suggest that an occupancy grid is convenient for fusing data, but transformation to a metric framework may not be the best way to represent visual information.

³ Four points give an exact solution; more than four can reduce the effects of noise using least squares; known parameters of the projection can be incorporated in a nonlinear technique. When estimating the coefficients of a homography, the principles of calibration as discussed in Section 4.2.2.2 apply. Further details and algorithms are available in Reference 13.

FIGURE 1.6 (Plot: left panel, "2D image," with axes u and v in pixels; right panel, "2D projection to ground plane," with axes Vehicle X (m) and Distance in front Y (m).) The image on the left is of a road scene and exhibits strong perspective, which in turn results in large differences in the uncertainty of reprojected measurements. The figure on the right was created by projecting the lower 300 pixels of the image onto a model of the ground plane. The small box (20 × 20 m²) represents the extent of a typical occupancy grid used in sensor fusion.
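As an illustration of the search-region projection described above, here is a sketch (ours) using OpenCV; the correspondences and coordinates are invented for the example.

```python
# Estimate the ground-plane homography from road/image correspondences,
# then project a rectangular road search region into its trapezoidal
# image counterpart (and map an image pixel back to road coordinates).
import numpy as np
import cv2

# Known correspondences: road-plane points (X, Y) in metres and their pixels.
road_pts = np.float32([[-2, 5], [2, 5], [-2, 20], [2, 20]])
img_pts = np.float32([[120, 420], [520, 420], [280, 250], [360, 250]])

H, _ = cv2.findHomography(road_pts, img_pts)   # least squares if more than 4 points

# Rectangular search region on the road becomes a trapezoid in the image.
search_rect = np.float32([[[-3, 6], [3, 6], [3, 18], [-3, 18]]])
trapezoid = cv2.perspectiveTransform(search_rect, H)

# The inverse mapping sends image pixels back to road coordinates; this is
# only meaningful for points that really lie on the road plane.
ground = cv2.perspectiveTransform(np.float32([[[320, 300]]]), np.linalg.inv(H))
```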

1.3.5 Obstacle Detection

1.3.5.1 Obstacle detection using range data

The ability to detect and avoid obstacles is a prerequisite for the success of the UGV. The purpose of obstacle detection is to extract areas that cannot or should not be traversed by the UGV. Rocks, fences, trees, and steep upward slopes are some typical examples. The techniques used in the detection of obstacles may vary according to the definition of "obstacle." If "obstacle" means a vehicle or a human being, then the detection can be based on a search for specific patterns, possibly supported by feature matching. For unstructured terrain, a more general definition of obstacle is any object that can obstruct the vehicle's driving path or, in other words, anything rising significantly out of the road surface.

Many approaches for extracting obstacles from range images have been proposed. Most approaches use either a global or a local reference plane to detect positive (above the reference plane) or negative (below the reference plane) obstacles. It is also possible to use salient points detected by an elevation differential method to identify obstacle regions [31]. The fastest of the obstacle detection algorithms, range differencing, simply subtracts the range image of an actual scene from the expected range image of a horizontal plane (global reference plane); a sketch is given after the procedure list below. While rapid, this technique is not very robust, since mild slopes will result in false indications of obstacles. So far, the most frequently used and most reliable solutions are based on comparison of 3D data with local reference planes. Thorpe et al. [22] analyzed scanning laser range data and constructed a surface property map represented in a Cartesian coordinate system viewed from above, which yielded the surface type of each point and its geometric parameters for segmentation of the scene map into traversable and obstacle regions. The procedure includes the following.

Preprocessing. The input from a 2D ladar may contain unreliable range data resulting from surfaces such as water or glossy pigment, as well as mixed points at the edge of an object. Filtering is needed to remove these undesirable jumps in range. After that, the range data are transformed from angular to Cartesian (x-y-z) coordinates.

Feature extraction and clustering. Surface normals are calculated from the x-y-z points. Normals are clustered into patches with similar normal orientations. Region growing is used to expand the patches until the fitting error is larger than a given threshold. The smoothness of a patch is evaluated by fitting a surface (plane or quadric).

Defect detection. Flat, traversable surfaces will have vertical surface normals. Obstacles will have surface patches with normals pointed in other directions.

Defect analysis. A simple obstacle map is not sufficient for obstacle analysis. For greater accuracy, a sequence of images corresponding to overlapping terrain is combined in an extended obstacle map. The analysis software can also incorporate color or curvature information into the obstacle map.

Extended obstacle map output. The obstacle map with a header (indicating map size, resolution, etc.) and a square, 2D array of cells (indicating traversability) are generated for the planner.
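A minimal sketch (ours) of the range-differencing baseline mentioned above, for a sensor at height h over assumed flat ground.

```python
# Range differencing with a global reference plane: compare measured ranges
# against the ranges an ideal flat ground would return, and flag large
# residuals as obstacle candidates.
import numpy as np

def expected_ground_range(h, elevation_angles):
    """Range to a flat ground plane for a sensor at height h; rays at or
    above the horizon never intersect the ground."""
    r = np.full_like(elevation_angles, np.inf)
    below = elevation_angles < 0
    r[below] = h / np.sin(-elevation_angles[below])
    return r

def range_differencing(measured, h, elevation_angles, threshold=0.5):
    # threshold (m) is an illustrative choice; mild slopes will still
    # produce false positives, as noted in the text.
    residual = expected_ground_range(h, elevation_angles) - measured
    return np.abs(residual) > threshold   # True marks obstacle returns
```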


FIGURE 1.7 (Diagram: left, two general views of a 3D point X = [x y z]^T, with image points x1, x2, epipoles e1, e2, and epipolar lines l1, l2; right, a left–right pair with optical centers O_L, O_R, image points x_L, x_R, baseline B, and focal length f.) Epipolar geometry is valid for general positions of two views. The figure on the left illustrates the epipolar lines for two frames (1 and 2). However, if the optical axes are parallel and the camera parameters are similar, stereo matching or the search for corresponding features is much easier. The figure on the right illustrates the horizontal and collinear epipolar lines in a left–right configuration with fixed baseline B.

1.3.5.2 Stereo vision

Humans exploit various physiological and psychological depth cues. Stereo cameras are built to mimic one of the ways in which the human visual system (HVS) works to obtain depth information [32]. In a standard configuration, two cameras are bound together with a certain displacement (Figure 1.7). This distance between the two camera centers is called the baseline B. In stereo vision, the disparity measurement is the difference in the positions of two corresponding points in the left and right images. In the standard configuration, the two camera coordinate systems are related simply by the lateral displacement B (X_R = X_L + B). As the cameras are usually "identical" (f_L = f_R = f) and aligned such that Z_L = Z_R = Z, the epipolar geometry and the projection equation (x = fX/Z) enable the depth Z to be related to the disparity d:

$$d = x_R - x_L = f\,\frac{X_L + B}{Z} - f\,\frac{X_L}{Z} = f\,\frac{B}{Z} \qquad (1.8)$$

where f is the focal length of the cameras. Since B and f are constants, the depth Z can be calculated once d is known from stereo matching (Z = fB/d).

1.3.5.2.1 Rectification

As shown in Figure 1.7, for a pair of images, each point in the "left" image is restricted to lie on a given line in the "right" image, the epipolar line, and vice versa. This is called the epipolar constraint. In standard configurations the epipolar lines are parallel to image scan lines, and this is exploited in many algorithms for stereo analysis. If valid, it enables the search for corresponding image features to be confined to one dimension and, hence, simplified. Stereo rectification is a process that transforms the epipolar lines so that they are collinear and parallel to the scan line. The idea behind rectification [33] is to define two new perspective matrices which preserve the optical centers but with image planes parallel to the baseline.
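A sketch (ours, not the chapter's code) of the disparity-to-depth pipeline of Equation (1.8), using OpenCV's semi-global matcher on an already rectified pair; the focal length, baseline, file names, and matcher settings are placeholder assumptions.

```python
# Compute a disparity map on a rectified stereo pair, then convert it to
# depth via Z = f*B/d (Equation 1.8).
import numpy as np
import cv2

f_px = 700.0   # focal length in pixels (assumed)
B_m = 0.12     # baseline in metres (assumed)

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder paths
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point output

valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = f_px * B_m / disparity[valid]   # Z = f*B/d
```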


1.3.5.2.2 Multi-baseline stereo vision

Two main challenges facing a stereo vision system are mismatch (e.g., points in the left image match the wrong points in the right image) and disparity accuracy. To address these issues, multiple (more than two) cameras can be used. Nakamura et al. [34] used an array of cameras to resolve occlusion by introducing occlusion masks which represent occlusion patterns in a real scene. Zitnick and Webb [35] introduced a system of four horizontally displaced cameras that analyzes potential 3D surfaces to resolve the feature matching problem. When more than two cameras or camera locations are employed, multiple stereo pairs (e.g., cameras 1 and 2, cameras 1 and 3, cameras 2 and 3, etc.) result in multiple, usually different, baselines. In the parallel configuration, each camera is a lateral displacement of the other. For a given depth, we then calculate the respective expected disparities relative to a reference camera (say, the leftmost camera) as well as the sum of match errors over all the cameras. The depth associated with a given pixel in the reference camera is taken to be the one with the lowest error. The multi-baseline approach has two distinctive advantages over classical stereo vision [36]:

• It can find a unique match even for a repeated pattern such as the cosine function.
• It produces a statistically more accurate depth value.

1.3.5.2.3 General multiple views

During the 1990s significant research was carried out on multiple view geometry, demonstrating that 3D reconstruction is possible using uncalibrated cameras in general positions [14]. In visual guidance, we usually have the advantage of calibrated cameras mounted in rigid fixtures, so there seems little justification for not exploiting the simplicity and speed of the algorithms described earlier. However, the fact that we can still implement 3D vision even if calibration drifts or fixtures are damaged adds robustness to the system concept. Another advantage of more general algorithms is that they facilitate mixing visual data from quite different camera types or from images taken from arbitrary sequences in time.

1.3.5.3 Application examples

In this section we present some experimental results of real-time stereo-vision-based obstacle detection for unstructured terrain. Two multi-baseline stereo vision systems (Digiclops from Point Grey Research, 6 mm lens) were mounted at a height of 2.3 m in front and on top of the vehicle, spaced 20 cm apart. The two stereo systems were calibrated so that their outputs were referred to the same vehicle coordinate system.


FIGURE 1.8 (a) Isodisparity profile lines generated from the disparity map using a LUT method. (b) A single isodisparity line (curved line), its reference line (straight) and detected obstacle pixels. (c) Detected obstacle points. (d) Obstacle map.

A centralized triggering signal was generated for the stereo systems and other sensors to synchronize the data capture. The stereo systems were able to generate disparity maps at a frequency of 10 frames/sec. To detect obstacles, an isodisparity profile-based obstacle detection method was introduced [37], which converted the 3D obstacle detection into 1D isodisparity profile segmentation. The system output was an obstacle map with 75 × 75 elements, each representing a 0.2 m × 0.2 m area within 15 m × 15 m in front of the vehicle. Seventy-five isodisparity profiles were generated from the disparity map using a look-up-table method (Figure 1.8a). The name isodisparity comes from the fact that points in each profile line have the same disparity value. Regardless of the size of the disparity map (usually 320 × 240 pixels), the method was able to identify 75 × 75 points from the disparity image, which exactly matched the elements of the obstacle map. By processing these 75 × 75 points using reference-line-based histogram classification, obstacle points were detected with subpixel accuracy. Figure 1.8a shows the profiles of a typical test terrain with road and bushes. Figure 1.8b shows the calculated reference lines. It is noteworthy that the reference lines form a curved surface instead of the planar surface used by other approaches. The final obstacle detection result and map are displayed in Figure 1.8c and d, respectively.

1.3.6 Sensor Fusion

The most important task of a VGS is to provide accurate terrain descriptions for the path planner. The quality of terrain maps is assessed by the miss rate and the false alarm rate. Here, the miss rate refers to the frequency with which a true obstacle is missed, while a false alarm occurs when the VGS classifies a traversable region as an obstacle region. Imagine a stereo vision system with a frame rate of 10 Hz: it will generate 3000 obstacle maps in 5 min. Even with a successful classification rate of 99.9%, the system may produce an erroneous obstacle map three times, each of which may cause an error in path planning. The objective of sensor fusion is to combine the results from multiple sensors, either at the raw data level or at the obstacle map level, to produce a more reliable description of the environment than any sensor can individually. Some examples of sensor fusion are:

N-modular redundancy fusion. Fusion of three identical radar units can tolerate the failure of one unit.

Fusion of complementary sensors. Color terrain segmentation results can be used to verify 3D terrain analysis results.

Fusion of competitive sensors. Although both laser and stereo vision perform obstacle detection, their obstacle maps can be fused to reduce false alarms.

Synchronization of sensors. Different sensors have different resolutions and frame rates. In addition to calibrating all sensors in the same vehicle coordinates, sensors need to be synchronized both temporally and spatially before their results can be merged. Several solutions can be applied for sensor synchronization.

External trigger signal based synchronization. For sensors with an external trigger capability, such as IR, color, and stereo cameras, data capture can be synchronized by a hardware trigger signal from the control system of the UGV. For laser or ladar, which do not have such a capability, the data captured at the time nearest to the trigger signal are used as outputs. In this case, no matter how fast a laser scanner can scan (usually 20 frames/sec), the fusion frame rate depends on the slowest sensor (usually stereo vision, around 10 frames/sec).

A centralized time stamp for each image from each sensor. In this case sensors capture data as fast as they can. Since each sensor normally has its own CPU for data processing, a centralized control system sends out a standardized time stamp signal to all CPUs regularly (say, every 1 h) to minimize time stamp drift.


When sensor outputs are read asynchronously, certain assumptions, such as the signals being linear time invariant (LTI) [38], can be made to propagate asynchronous data to the upcoming sample time of the control system. Robl [38] showed examples of using first-order hold and third-order hold methods to predict sensor values at desired times. When different resolution sensors are to be fused at the data level (e.g., fusion of range images from ladar and stereo vision), down-sampling of the sensor data with the higher spatial resolution is performed by interpolation. For sensor fusion at the obstacle map level, spatial synchronization is not necessary since a unique map representation is defined for all sensors.

Example: Fusion of laser and stereo obstacle maps for false alarm suppression

Theoretically, pixel-to-pixel direct map fusion is possible if the calibration and synchronization of the geometrical constraints (e.g., rotation and translation between the laser and stereo system) remain unchanged after calibration. Practically, however, this is not realistic, partially because sensor synchronization is not guaranteed at all times: CPU loading, terrain differences, and network traffic for the map output all affect the synchronization. Feature-based co-registration sensor fusion, alternatively, addresses this issue by computing the best-fit pose of the obstacle map features relative to multiple sensors, which allows refinement of sensor-to-sensor registration. In the following, we propose a localized correlation based approach for obstacle-map-level sensor fusion.

Assume the laser map L_ij and stereo map S_ij are to be merged to form F_ij. A map element takes the value 0 for a traversable pixel, 1 for an obstacle, and anything between 0 and 1 for the certainty of the pixel being classified as an obstacle. We formulate the correlation-based sensor fusion as

$$F_{ij} = \begin{cases}
L_{ij}, & S_{ij} = \text{undefined} \\
S_{ij}, & L_{ij} = \text{undefined} \\
\dfrac{a_1 L_{ij} + a_2 S_{i+m,j+n}}{a_1 + a_2} \ \text{at} \ \max\left(\mathrm{Corr}(L_{ij}, S_{i+m,j+n})\right), & m, n \in \Omega \\
\text{undefined}, & S_{ij}, L_{ij} = \text{undefined}
\end{cases} \qquad (1.9)$$

where Ω represents a search area and {a1, a2} are weighting factors. Corr(L, S) is the correlation between L and S elements with window size w_c:

$$\mathrm{Corr}(L_{ij}, S_{i+m,j+n}) = \sum_{p=-w_c/2}^{w_c/2}\ \sum_{q=-w_c/2}^{w_c/2} L_{i+p,j+q}\, S_{i+m+p,j+n+q} \qquad (1.10)$$


The principle behind the localized correlation sensor fusion is that, instead of directly averaging L_ij and S_ij to get F_ij, a search is performed to find the best match within a small neighborhood. Averaging the center pixels at the matched point produces the final fusion map. In the case where an obstacle map only takes three values (obstacle, traversable, and undefined), the approach above can be simplified to

$$F_{ij} = \begin{cases}
L_{ij}, & S_{ij} = \text{undefined} \\
S_{ij}, & L_{ij} = \text{undefined} \\
1, & L_{ij} = 1,\ C_{so} > T_1,\ D < T_2 \\
1, & S_{ij} = 1,\ C_{lo} > T_1,\ D < T_2 \\
0, & \text{otherwise}
\end{cases} \qquad (1.11)$$

where T1 and T2 are preset thresholds that depend on the size of the search window. In our experiments a window of size 5 × 5 pixels was found to work well; the choice of size is a compromise between noise problems with small windows and excessive boundary points with large windows. C_so and C_lo are the obstacle pixel counts within the comparison window w_c for S_ij and L_ij, respectively, and D is the minimum distance between L_ij and S_ij in Ω:

$$D = \min_{(m,n)\in\Omega} \sum_{p=-w_c/2}^{w_c/2}\ \sum_{q=-w_c/2}^{w_c/2} \left| S_{i+m+p,j+n+q} - L_{i+p,j+q} \right| \qquad (1.12)$$

$$C_{so} = \sum_{p=-w_c/2}^{w_c/2}\ \sum_{q=-w_c/2}^{w_c/2} S_{i+m+p,j+n+q} \qquad (1.13)$$
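A sketch (ours) of the simplified fusion rule of Equations (1.11)–(1.13); thresholds, window sizes, and the use of NaN for "undefined" are illustrative assumptions, and the obstacle counts are taken at the window center for brevity rather than at the matched offset.

```python
# Three-valued map fusion: 1 = obstacle, 0 = traversable, NaN = undefined.
import numpy as np

def fuse(L, S, wc=5, search=2, T1=3, T2=4):
    F = np.zeros_like(L)
    r = wc // 2
    for i in range(r + search, L.shape[0] - r - search):
        for j in range(r + search, L.shape[1] - r - search):
            if np.isnan(S[i, j]):
                F[i, j] = L[i, j]; continue
            if np.isnan(L[i, j]):
                F[i, j] = S[i, j]; continue
            Lw = L[i-r:i+r+1, j-r:j+r+1]
            # Search the neighbourhood Omega for the closest stereo window (Eq. 1.12).
            D = min(np.nansum(np.abs(S[i+m-r:i+m+r+1, j+n-r:j+n+r+1] - Lw))
                    for m in range(-search, search + 1)
                    for n in range(-search, search + 1))
            C_so = np.nansum(S[i-r:i+r+1, j-r:j+r+1])  # stereo obstacle count (Eq. 1.13)
            C_lo = np.nansum(Lw)                       # laser obstacle count
            if L[i, j] == 1 and C_so > T1 and D < T2:
                F[i, j] = 1
            elif S[i, j] == 1 and C_lo > T1 and D < T2:
                F[i, j] = 1
            else:
                F[i, j] = 0
    return F
```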

The advantage of implementing the correlation-based fusion method is twofold: it reduces false alarm rates and compensates for inaccuracy in the laser and stereo calibration/synchronization. The experimental results of using the above approach for laser and stereo obstacle map fusion are shown in Figure 1.9.

The geometry of 2D range and image data fusion. Integration of sensory data offers much more than a projection onto an occupancy grid. There exist multiple view constraints between image and range data analogous to those between multiple images. These constraints help to verify and disambiguate data from either source, so it is useful to examine the coordinate transformations and the physical parameters that define them. This will also provide a robust framework for selecting which data should be fused and in which geometric representation.


FIGURE 1.9 Sensor fusion of laser and stereo obstacle maps. A false alarm in the laser obstacle map (left image, three laser scanning lines at the top of the map) is suppressed by fusion with the stereo vision obstacle map (middle image), and a more reliable fusion result is generated (right image).

First, consider the relationship between a data point from the ladar and a world coordinate system. We can transform {r, θ} to a point X in Cartesian space. A 3D point X will be detected by an ideal ladar if it lies in the plane Z = 0 expressed in the sensor's coordinate system (neglecting the range limits and the finite size and divergence of the laser beam). If this plane, in the world coordinate system, is denoted Π_L, the set of points that can be detected satisfies

$$\Pi_L^{\mathsf{T}} X = 0 \qquad (1.14)$$

Alternatively, we expand the rigid transformation equation and express this as a constraint (in sensor coordinates):

$$X_L = G_L^W X, \qquad G_L^W = \begin{bmatrix} R_L^W & T \\ 0 & 1 \end{bmatrix} \qquad (1.15)$$

Only the third row of G, $[r_{3i} \; T_Z]$, plays any part in the planar constraint on the point $X = [X\ Y\ Z\ 1]^{\mathsf{T}}$. The roles of the parameters are then explicit:

$$r_{31}X + r_{32}Y + r_{33}Z + T_Z = 0 \qquad (1.16)$$
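A small sketch (ours) of this detectability test: convert a ladar return {r, θ} to sensor coordinates and apply the third-row constraint of Equation (1.16); the extrinsics are placeholders.

```python
# Ladar scanning-plane membership test, Equations (1.14)-(1.16).
import numpy as np

def polar_to_sensor(r, theta):
    """Ladar return {r, theta} as a homogeneous point in the sensor frame (Z = 0)."""
    return np.array([r * np.cos(theta), r * np.sin(theta), 0.0, 1.0])

def in_scan_plane(X_world, G_WL, tol=0.05):
    """Equation (1.16): the third row of the rigid transform must annihilate X."""
    r3 = G_WL[2, :]            # [r31 r32 r33 TZ]
    return abs(r3 @ X_world) < tol

G = np.eye(4)                  # placeholder sensor pose in the world frame
G[2, 3] = -1.5                 # e.g. scanner mounted 1.5 m above the world origin
print(in_scan_plane(np.array([10.0, 2.0, 1.5, 1.0]), G))  # True: lies in the plane
```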

However, if the vehicle is moving over rough terrain there will be considerable uncertainty in the instantaneous parameters of R and T. We therefore look at a transformation between ladar data and image data without reference to any world coordinate system. Assuming there are no occlusions, X will be imaged as x on the image plane of the camera. As X lies in the plane Π_L, there exists a homography H (abbreviated from H_IL, the ladar-to-image homography) that maps X to x:

$$x = HX, \quad H \in \mathbb{R}^{3\times 3} \qquad (1.17)$$

This mapping is unambiguous and is parameterized by the geometry between the two sensors, which is less uncertain than the geometry with reference to a world coordinate system. H can be solved from point correspondences and, if required, it can be decomposed into the geometric parameters relating the two planes. The reverse mapping, however, is ambiguous: a point x is the image of the ray passing through x and the optical center O_C. We can map x (with H⁻¹) to a single point p = {r, θ} in the laser parameter space, but there is no guarantee that the true 3D point that gave rise to x in the image came from this plane. Another consideration is that image-ladar correspondences are rarely point-to-point but line-to-point. (Ladar data rarely come from a distinct point in 3D; they are more likely to have come from a set of points such as a vertical edge or the surface of a tree.) Consider the image of the pole shown in Figure 1.10; the pre-image of this is a plane, and so the image line could be formed from an infinite set of lines (a pencil) in this plane. However, knowledge of the laser point p constrains the 3D space line to the pencil of lines concurrent with X. Furthermore, assuming that the base of the image line corresponds to the ground plane is sufficient to define a unique space line. There are various ways to establish mappings between the two types of sensors without reliance on a priori parameters with their associated uncertainties.

FIGURE 1.10 (Diagram: camera center O_C, ladar plane Π_L, ground plane Π_G, and a 3D point X.) There is ambiguity in both ladar and imaging data. There are geometric constraints between the sets of data that will assist in disambiguation and improving reliability of both systems.


One of the key problems in processing ladar data is data association. For example, consider capturing data from a tree. The points that are detected depend on the viewpoint: that is, surface features are not pose invariant [10]. This problem becomes easier with the use of a putative model of the tree whose 2D position is determined by a centroid, which is invariant. Such a model is easier to initiate if image data provides the evidence that the data points match image features with the correct “tree-like” attributes. Once we have a model we can anticipate where to search for features to match data points and vice-versa. In this case we want to compare the real data with a model prediction but this has to be very efficient given the large amount of data and hypotheses that will occur. A typical problem is to test if a model patch will be detected by a sensor, and how many data points to expect. Range detection is equivalent to ray intersection and is more easily solved after projection into a 2D space: a cylindrical projection is sufficient and preserves the topology. To summarize, in isolation there is much ambiguity in either sensor, and exchanging information using image constraints can reduce this problem. The difficulty is how to implement this practically as the concept of “being like a tree” is more abstract than the neat formulation of raw data fusion as seen in Section 1.3.6. This lack of precise mathematical formulation and reliance on heuristic rules deters many researchers. However, recent advances and increased processing speeds have made probabilistic reasoning techniques tractable and worthy of consideration in real-time problems such as visual guidance and terrain assessment.

1.4 CHALLENGES AND SOLUTIONS

The earlier sections have detailed many of the practical difficulties of visually based guidance and presented pragmatic techniques used during field demonstrations. To be realistic, autonomous vehicles represent a highly complex set of problems and current capability is still at the stage of the SAP/F "donkey" engaged in A-to-B mobility. To extend this capability, researchers need to think further along the technology road map [1] and tackle perception challenges such as terrain mapping, detection of cover, classification of vegetation, and the like.

1.4.1 Terrain Classification

Obstacle detection based only on distance information is not sufficient. Long grass or small bushes will also be detected as obstacles because of their height, yet the vehicle could easily drive through these "soft" obstacles. Alternatively, soft vegetation can cover a dangerous slope but appear as a traversable surface. To reduce unnecessary avoidance driving, detected obstacles need to be classified as "dangerous" or "not dangerous."

Color cameras can be used to perform terrain classification. Color segmentation relies on having a complete training set. As lighting changes, due to time of day or weather conditions, the appearance of grass and obstacles changes as well. Although color normalization methods have been successfully applied to the indoor environment, they fail, to our knowledge, to produce reasonable results in an outdoor environment. Similarly, color segmentation can classify flat objects, such as fallen leaves, as obstacles, since their color is different from grass.

If dense range measurements in a scene are available (e.g., using ladar), they can be used not only to represent the scene geometry but also to characterize surface types. For example, the range measured on bare soil or rocks tends to lie on a relatively smooth surface; in contrast, in the case of bushes, the range is spatially scattered. While it is possible (although by no means trivial) to design algorithms for terrain classification based on the local statistics of range data [39–41], the confidence level of a reliable classification is low. Table 1.4 lists the most frequently encountered terrain types and possible classification methods; a sketch of the range-statistics idea follows the table.


TABLE 1.4
Terrain Types and Methods of Classification

Terrain type               | Sensors                         | Classification methods                             | Confidence level
Vegetation                 | IR/color camera                 | Segmentation                                       | Medium
Rocks                      | IR/color camera                 | Segmentation                                       | Medium
Walls/fence                | Camera, stereo, laser           | Texture analysis, obstacle detection               | High
Road (paved, gravel, dirt) | IR/color camera                 | Segmentation                                       | Medium
Slope                      | Stereo, ladar                   | Elevation analysis, surface fit                    | High
Ditch, hole                | Stereo, ladar                   | Elevation analysis, surface fit                    | Low
Sand, dirt, mud, gravel    | IR/color camera                 | Segmentation                                       | Medium
Water                      | Polarized camera, laser scanner | Feature detection, sensor fusion                   | Medium
Moving target              | Camera, stereo                  | Optical flow, obstacle detection, pattern matching | High
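As a sketch of the range-statistics idea referenced in the text (ours, with illustrative thresholds): the residual of a local plane fit separates scattered returns (vegetation-like) from smooth surfaces (soil, rock).

```python
# Classify a local patch of 3D range points by its out-of-plane scatter.
import numpy as np

def scatter_score(points):
    """RMS distance of a 3D point patch to its best-fit plane (via SVD)."""
    centered = points - points.mean(axis=0)
    _, s, _ = np.linalg.svd(centered, full_matrices=False)
    # The smallest singular value measures out-of-plane spread.
    return s[-1] / np.sqrt(len(points))

def classify_patch(points, veg_threshold=0.05):
    # veg_threshold (m) is an assumed value; real data would need tuning.
    return "vegetation-like" if scatter_score(points) > veg_threshold else "surface-like"
```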


1.4.2 Localization and 3D Model Building from Vision

Structure from motion (SFM) is the recovery of camera motion and scene structures, and in certain cases camera intrinsic parameters, from image sequences. It is attractive because it avoids the requirement for a priori models of the environment. The techniques are based on the constraints that exist between the multiple views of features. This is a mature area of computer vision that attracted intensive research activity in the previous decade, prompted by the breakthroughs in multiple view geometry in the early 1990s. Much of the original work was motivated by mobile robotics but soon found more general application, such as the generation of special effects for cinema, scene recovery for virtual reality, and 3D reconstruction for architecture. Here, the theoretical drive has been inspired by the recovery of information from recorded sequences, such as camcorder footage, where the motion is general and little can be assumed regarding the camera parameters. These tasks can be accomplished off-line, and the features and camera parameters from long sequences can be solved as a large-scale optimization in batch mode. As such, many would regard this type of SFM as a solved problem, but the conditions in vehicle navigation are specific and require separate consideration:

• The motion is not "general"; it may be confined to a plane, or restricted to rotations around axes normal to the plane.
• Navigation is required in real-time, and parameters require continuous updating from video streams, as opposed to the batch operations of most SFM algorithms.
• Sensory data, from sources other than the camera(s), are usually available.
• Many of the camera parameters are known (approximately) beforehand.
• There are often multiple moving objects in a scene.

Visual guidance demands a real-time recursive SFM algorithm. Chiuso et al. [42] have given impressive demonstrations of a recursive filter SFM system that works at a video frame rate of 30 Hz. However, once we start using Kalman filters to update estimates of vehicle (camera) state and feature location, some would argue that we enter the already very active realm of simultaneous localization and mapping (SLAM). The truth is that there are differences between SLAM and SFM, and both have roles in visual guidance. Davison [43] has been very successful in using vision in a SLAM framework, and Bosse [9] has published some promising work in indoor and outdoor navigation. The key to both of these is that they tackle a fundamental problem of using vision in SLAM: the relatively narrow FOV and recognizing features when revisiting a location. Davison used active vision in Reference 4 and wide-angle lenses in Reference 43 to fixate on a sparse set of dominant features, whereas Bosse used a catadioptric camera and exploited vanishing points. SLAM often works well with 2D ladar by collecting and maintaining estimates of a sparse set of features with reference to a world coordinate system.


A problem with SFM occurs when features used for reference pass out of the FOV: in recursive mode, there is no guarantee at initiation that features will persist. Errors (drift) are introduced when the reference features are changed, and the consequence is that a robot will have difficulty in returning home or knowing that it is revisiting a location. Chiuso has a scheme to reduce this problem, but drift is still inevitable. On the other hand, SLAM has to rely on sparse data because it needs to maintain a full covariance matrix, which soon becomes computationally expensive if the number of data points is not restricted. It can be difficult to associate outdoor data when it is sparse. The two techniques offer different benefits and a possible complementary role. SLAM is able to maintain a sparse map on a large scale for navigation but locally does not help much with terrain classification. SFM is useful for building a dense model of the immediate surroundings, useful for obstacle avoidance, path planning, and situation awareness. The availability of a 3D model (with texture and color) created by SFM will be beneficial for validation of the sensory data used in a SLAM framework: for example, associating an object type with range data, or providing color (hue) as an additional state.

1.5 CONCLUSION

We have presented the essentials of a practical VGS and provided details on its sensors and capabilities such as road following, obstacle detection, and sensor fusion. Worldwide, there have been many impressive demonstrations of visual guidance, and certain technologies are so mature that they are available commercially. This chapter started with a road map for UGVs, and we have shown that the research community is still struggling to achieve A-to-B mobility in tasks within large-scale environments. This is because navigating through open terrain is a highly complex problem with many unknowns. Information from the immediate surroundings is required to determine traversable surfaces among the many potential hazards. Vision has a role in the creation of terrain maps, but we have shown that in practice this is still difficult due to the physical limitations of available sensor technology. We anticipate technological advances that will enable the acquisition of high-resolution 3D data at fast frame rates.

Acquiring large amounts of data is not a complete solution. We argue that we do not make proper use of the information already available in 2D images, and that there is potential for exploiting algorithms such as SFM and vision-based SLAM. Another problem is finding alternative ways of representing the environment that are more natural for navigation, or how to extract knowledge from images and use this (state) information within algorithms.

We have made efforts to highlight problems and limitations. The task is complex and practical understanding is essential. The only way to make real progress along the road map is through testing sensors, systems, and algorithms in the field, and then seeing what can survive the challenges presented.

ACKNOWLEDGMENTS

The authors would like to acknowledge the support of A*STAR and DSTA (Singapore) in funding project activities that have contributed to the findings presented in this chapter.

REFERENCES

1. Committee on Army Unmanned Ground Vehicle Technology, National Research Council. Technology Development for Army Unmanned Ground Vehicles. National Academy Press, Washington, 2002.
2. J. Kumagai. Sand trap. IEEE Spectrum, 41: 34–40, 2004.
3. T. W. Fong, C. Thorpe, and C. Baur. Advanced interfaces for vehicle teleoperation: collaborative control, sensor fusion displays, and remote driving tools. Autonomous Robots, 11: 77–85, 2001.
4. A. J. Davison. Mobile Robot Navigation Using Active Vision. PhD thesis, Department of Engineering Science, University of Oxford, Oxford, UK, February 1998.
5. R. C. Gonzalez and R. E. Woods. Digital Image Processing. 2nd edition, Addison-Wesley, Reading, MA, 1992.
6. N. Zeng and J. D. Crisman. Categorical color projection for robot road following. In Proceedings of the 12th International Conference on Robotics and Automation, pp. 1080–1085, 1995.
7. J. D. Crisman and C. E. Thorpe. SCARF: a color vision system that tracks roads and intersections. IEEE Transactions on Robotics and Automation, 9: 49–58, 1993.
8. C. Geyer and K. Daniilidis. A unifying theory for central panoramic systems and practical applications. In ECCV (2), pp. 445–461. Springer-Verlag, Heidelberg, 2000.
9. M. Bosse, R. J. Rikoski, J. J. Leonard, and S. J. Teller. Vanishing points and three-dimensional lines from omni-directional video. The Visual Computer, 19: 417–430, 2003.
10. F. Tang, M. D. Adams, J. Ibanez-Guzman, and W. S. Wijesoma. Pose invariant, robust feature extraction from range data with a modified scale space approach. In Proceedings of the IEEE International Conference on Robotics and Automation, New Orleans, LA, USA, April 2004.
11. T. C. Ng and J. C. Tan. Development of a 3D ladar system for autonomous navigation. In Proceedings of the IEEE Conference on Robotics, Automation and Mechatronics (RAM 04), Singapore, pp. 792–797, 1–3 December 2004.
12. W. C. Stone, M. Juberts, N. Dagalakis, J. Stone, and J. Gorman. Performance analysis of next generation ladar for manufacturing, construction and mobility. Technical Report NISTIR 7117, NIST, Building and Fire Research Laboratory and Manufacturing Engineering Laboratory, Maryland, USA, May 2004.
13. R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2000.
14. O. Faugeras and Q.-T. Luong. The Geometry of Multiple Images. MIT Press, Cambridge, MA, 1999.
15. Y. Ma, S. Soatto, J. Kosecka, and S. Sastry. An Invitation to 3D Vision: From Images to Geometric Models. Springer-Verlag, Heidelberg, 2003.
16. C. C. Slama. Manual of Photogrammetry. 4th edition, American Society of Photogrammetry, Falls Church, VA, 1980.
17. J. Salvi, X. Armangu, and J. Batlle. A comparative review of camera calibrating methods with accuracy evaluation. Pattern Recognition, 35: 1617–1635, 2002.
18. R. Y. Tsai. An efficient and accurate camera calibration technique for 3D machine vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami Beach, FL, pp. 364–374, 1986.
19. J. Weng, P. Cohen, and M. Herniou. Camera calibration with distortion models and accuracy evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14: 965–980, 1992.
20. J. Heikkila and O. Silven. A four-step camera calibration procedure with implicit image correction. In Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97), San Juan, Puerto Rico, pp. 1106–1112, IEEE Computer Society, Washington, June 1997.
21. J.-Y. Bouguet. A camera calibration toolbox for MATLAB. http://www.vision.caltech.edu/bouguetj/calib_doc/.
22. Intel. The OpenCV library. http://sourceforge.net/projects/opencvlibrary/.
23. H. Tsai, S. B. Balakirsky, E. Messina, T. Chang, and M. Schneier. A hierarchical world model for an autonomous scout vehicle. Proceedings of SPIE, 4715: 343–354, July 2002.
24. M. Hebert, C. Thorpe, and A. Stentz. Intelligent Unmanned Ground Vehicles: Autonomous Navigation Research at Carnegie Mellon. Kluwer Academic Publishers, Dordrecht, 1997.
25. M. Hebert, T. Kanade, and I. Kweon. 3D vision techniques for autonomous vehicles. Technical Report CMU-RI-TR-88-12, Robotics Institute, Carnegie Mellon University, August 1988.
26. E. D. Dickmanns. An advanced vision system for ground vehicles. In First International Workshop on In-Vehicle Cognitive Computer Vision Systems (IVCCVS), Graz, Austria, pp. 1–12, 3 April 2003.
27. R. Gregor, M. Lutzeler, M. Pellkofer, K.-H. Siedersberger, and E. D. Dickmanns. EMS-Vision: a perceptual system for autonomous vehicles. IEEE Transactions on Intelligent Transportation Systems, 3: 48–59, 2002.
28. E. D. Dickmanns. The development of machine vision for road vehicles in the last decade. In Proceedings of the IEEE Intelligent Vehicle Symposium (IV2002), Versailles, France, vol. 1, pp. 268–281, 17–21 June 2002.


29. A. Broggi, M. Bertozzi, A. Fascioli, and G. Conte. Automatic Vehicle Guidance: The Experience of the ARGO Autonomous Vehicle. World Scientific, 1999.
30. Chauffeur II Final Report. Technical Report IST-1999-10048 D24, The Promote-Chauffeur II Consortium, July 2003.
31. T. Chang, H. Tsai, S. Legowik, and M. N. Abrams. Concealment and obstacle detection for autonomous driving. In M. H. Hamza (ed.), Proceedings of the IASTED Conference, Robotics and Applications 99, pp. 147–152. ACTA Press, Santa Barbara, CA, 28–30 October 1999.
32. T. Okoshi. Three-Dimensional Imaging Techniques. Academic Press, New York, 1976.
33. N. Ayache. Artificial Vision for Mobile Robots: Stereo Vision and Multisensory Perception. The MIT Press, Cambridge, MA, 1991.
34. Y. Nakamura, K. Satoh, T. Matsuura, and Y. Ohta. Occlusion detectable stereo: occlusion patterns in camera matrix. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, San Francisco, CA, pp. 371–378, 18–20 June 1996.
35. C. L. Zitnick and J. A. Webb. Multi-baseline stereo using surface extraction. Technical Report CMU-CS-96-196, School of Computer Science, Carnegie Mellon University, November 1996.
36. M. Okutomi and T. Kanade. A multiple-baseline stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15: 353–363, 1993.
37. J. Xu, H. Wang, J. Ibanez-Guzman, T. C. Ng, J. Shen, and C. W. Chan. Isodisparity profile processing for real-time 3D obstacle identification. In Proceedings of the IEEE International Conference on Intelligent Transportation Systems (ITS), Shanghai, China, pp. 288–292, October 2003.
38. C. Robl and G. Faerber. System architecture for synchronizing, signal level fusing, simulating and implementing sensors. In Proceedings of the 2000 IEEE International Conference on Robotics and Automation, San Francisco, CA, pp. 1639–1644, April 2000.
39. J. Huang, A. B. Lee, and D. Mumford. Statistics of range images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 324–331, 2000.
40. M. Hebert, N. Vandapel, S. Keller, and R. R. Donamukkala. Evaluation and comparison of terrain classification techniques from ladar data for autonomous navigation. In Twenty-third Army Science Conference, Orlando, FL, USA, December 2002.
41. N. Vandapel, D. F. Huber, A. Kapuria, and M. Hebert. Natural terrain classification using 3D ladar data. In Proceedings of the IEEE International Conference on Robotics and Automation, New Orleans, LA, USA, vol. 5, pp. 5117–5122, 26 April–1 May 2004.
42. A. Chiuso, P. Favaro, H. L. Jin, and S. Soatto. Structure from motion causally integrated over time. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24: 523–535, 2002.
43. A. J. Davison, Y. González Cid, and N. Kita. Real-time 3D SLAM with wide-angle vision. In 5th IFAC/EURON Symposium on Intelligent Autonomous Vehicles, Lisboa, Portugal. IFAC, Elsevier Science, 5–7 July 2004.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c001” — 2006/3/31 — 16:42 — page 39 — #39

40

Autonomous Mobile Robots

BIOGRAPHIES

Andrew Shacklock has 20 years of experience in mechatronics and sensor-guided robotic systems. He graduated with a B.Sc. from the University of Newcastle upon Tyne in 1985 and a Ph.D. from the University of Bristol in 1994. He is now a research scientist at the Singapore Institute of Manufacturing Technology. His main research interest is in machine perception and sensor fusion, in particular for visual navigation.

Jian Xu received his bachelor of engineering and master of engineering degrees in electrical engineering from Shanghai Jiao Tong University, China, in 1982 and 1984, respectively. He received his Doctor of Engineering from Erlangen-Nuremberg University, Germany, in 1992. He is currently a research scientist at the Singapore Institute of Manufacturing Technology, Singapore. His research interests include 3D machine vision using photogrammetry and stereo vision, camera calibration, sensor fusion, subpixel image processing, and visual guidance systems for autonomous vehicles.

Han Wang is currently an associate professor at Nanyang Technological University and a senior member of the IEEE. His research interests include computer vision and AGV navigation. He received his bachelor of engineering degree from the Northeast Heavy Machinery Institute in 1982 and his Ph.D. from Leeds University in 1989. He has been a research scientist at CMU and a research officer at Oxford University. He spent his 1999 sabbatical at Melbourne University.


2 Millimeter Wave RADAR Power-Range Spectra Interpretation for Multiple Feature Detection

Martin Adams and Ebi Jose

CONTENTS

2.1 Introduction
2.2 Related Work
2.3 FMCW RADAR Operation and Range Noise
    2.3.1 Noise in FMCW Receivers and Its Effect on Range Detection
2.4 RADAR Range Spectra Interpretation
    2.4.1 RADAR Range Equation
    2.4.2 Interpretation of RADAR Noise
        2.4.2.1 Thermal noise
        2.4.2.2 Phase noise
    2.4.3 Noise Analysis during Target Absence and Presence
        2.4.3.1 Power-noise estimation in target absence
        2.4.3.2 Power-noise estimation in target presence
    2.4.4 Initial Range Spectra Prediction
2.5 Constant False Alarm Rate Processor for True Target Range Detection
    2.5.1 The Effect of the High Pass Filter on CFAR
        2.5.1.1 Missed detections with CFAR
        2.5.1.2 False alarms with CFAR
2.6 Target Presence Probability Estimation for True Target Range Detection
    2.6.1 Target Presence Probability Results
    2.6.2 Merits of the Proposed Algorithm over Other Feature Extraction Techniques
2.7 Multiple Line-of-Sight Targets — RADAR Penetration
2.8 RADAR-Based Augmented State Vector
    2.8.1 Process Model
    2.8.2 Observation (Measurement) Model
        2.8.2.1 Predicted power observation formulation
2.9 Multi-Target Range Bin Prediction — Results
2.10 Conclusions
Acknowledgments
References
Biographies

2.1 INTRODUCTION

Current research in autonomous robot navigation [1,2] focuses on mining, planetary exploration, fire emergencies, and battlefield operations, as well as on agricultural applications. Millimeter wave (MMW) RADAR provides consistent and accurate range measurements for the environmental imaging required to navigate in dusty, foggy, and poorly illuminated environments [3]. MMW RADAR signals can provide information about certain distributed targets that appear in a single line-of-sight observation. This work is conducted with a 77-GHz frequency modulated continuous wave (FMCW) RADAR which operates in the MMW region of the electromagnetic spectrum [4,5].

For localization and map building, it is necessary to predict the target locations accurately given a prediction of the vehicle/RADAR location [6,7]. Therefore, the first contribution of this chapter is a method for predicting the power–range spectra (or range bins) using the RADAR range equation and knowledge of the noise distributions in the RADAR. The predicted range bins are to be used ultimately as predicted observations within a mobile robot, RADAR-based navigation formulation. The actual observations take the form of received power/range readings from the RADAR.

The second contribution of this chapter is an algorithm which makes optimal estimates of the range to multiple targets down-range, for each range spectrum, based on received signal-to-noise power. We refer to this as feature detection based on target presence probability. Results are shown which compare probability-based feature detection with other feature extraction techniques such as constant threshold [9] on raw data and constant false alarm rate (CFAR) techniques [24]. The results show the merit of the proposed algorithm, which can detect features in typically cluttered outdoor environments with a higher success rate than other feature detection techniques. This work is a step toward robust outdoor robot navigation with MMW-RADAR-based continuous power spectra.

Millimeter wave RADAR can penetrate certain nonmetallic objects, meaning that multiple line-of-sight objects can sometimes be detected, a property which can be exploited in mobile robot navigation in outdoor unstructured environments. This chapter describes a new approach to predicting RADAR range bins, which is essential for simultaneous localization and map building (SLAM) with MMW RADAR. The third contribution of this chapter is a SLAM formulation using an augmented state vector which includes the normalized RADAR cross sections (RCS) and absorption cross sections of features, as well as the usual feature Cartesian coordinates. The term "normalized" is used as the actual RCS is incorporated into a reflectivity parameter; the normalization follows from the assumption that the sum of this reflectivity parameter and the absorption and transmittance parameters is unity. This is carried out to provide feature-rich representations of the environment which significantly aid the data association process in SLAM.

The final contribution is a predictive model of the range bins, from differing vehicle locations, for multiple line-of-sight targets. This forms a predicted power–range observation, based on estimates of the augmented SLAM state. The formulation of power returns from multiple objects down-range is derived, and predicted RADAR range spectra are compared with real spectra recorded outdoors. This prediction of power–range spectra is a step toward a full, RADAR-based SLAM framework.

Section 2.2 summarizes related work, while Section 2.3 describes FMCW RADAR operation and the noise affecting the range spectra, in order to understand the noise distributions in both range and power. Section 2.4 describes how power–range spectra can be predicted (predicted observations); this utilizes the RADAR range equation and an experimental noise analysis. Section 2.5 analyzes a feature detector based on the CFAR detection method. The study also shows ways to compensate for the inaccuracies of the power–range compensating high-pass filter contained in FMCW RADARs, and thereby improve the feature detection process. A method for estimating the true range to objects from power–range spectra is given in Section 2.6 in the form of a new robust feature detection technique based on target presence probability. Section 2.6.1 shows the merits of the target presence probability-based algorithm, which can detect ground level features with greater reliability than other feature detection techniques such as constant threshold on raw RADAR data and CFAR techniques. An augmented state vector is introduced in Section 2.8 where, along with the vehicle and feature positions, normalized RCS and absorption cross sections of features are added together with the RADAR losses. Finally, Section 2.9 shows full predicted range spectra, and the results are compared with the measured range bins in the initial stages of a simple SLAM formulation.

2.2 RELATED WORK

In recent years RADAR, for automotive purposes, has attracted a great deal of interest in shorter range (

The power noise in target absence can be modeled by the Weibull distribution of Equation (2.9), with scale parameter ψ > 0 and shape parameter ξ > 0. The mean of x is µ = ψΓ(1 + (1/ξ)) − 15 and the variance is σ² = ψ²Γ(1 + (2/ξ)) − ψ²[Γ(1 + (1/ξ))]², where Γ(· · ·) is the Gamma function [23]. For scaling purposes, in this case the random variable x equals the received power Pr + 15 dB, in order to fit Equation (2.9). For a range of 10 m (Figure 2.7a), suitable parameters for an equivalent Weibull distribution, ψ and ξ, are 0.0662 and 0.4146, respectively.5 At low ranges, this distribution is approximately equivalent to an exponential distribution, with mean µ = −14.8 dB and variance σ² = 0.3501 dB². For a range of 100 m (Figure 2.7b), suitable Weibull parameters have been obtained as ψ = 26.706 and ξ = 5.333.

5 These values are obtained using Matlab to fit Equation (2.9) to the experimentally obtained distribution of Figure 2.7a.
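The chapter notes that these parameters were obtained by fitting Equation (2.9) in Matlab; an equivalent fit is easy to reproduce. The following Python sketch is illustrative only: the data file name is hypothetical, and it assumes the noise-only received power samples are available in dB so that the same x = Pr + 15 dB scaling can be applied.

```python
import numpy as np
from scipy import stats
from scipy.special import gamma

# Hypothetical file of noise-only received power samples (dB) for one range.
power_db = np.loadtxt("noise_only_10m.txt")
x = power_db + 15.0                       # scaling used above: x = Pr + 15 dB

# weibull_min.fit with the location fixed at zero returns (shape, loc, scale),
# i.e., (xi, 0, psi) in the notation of the text.
xi, _, psi = stats.weibull_min.fit(x, floc=0.0)

mean_x = psi * gamma(1.0 + 1.0 / xi)      # Weibull mean of the shifted variable x
var_x = psi**2 * (gamma(1.0 + 2.0 / xi) - gamma(1.0 + 1.0 / xi) ** 2)
print(f"psi = {psi:.4f}, xi = {xi:.4f}")
print(f"mean Pr = {mean_x - 15.0:.2f} dB, variance = {var_x:.4f} dB^2")
```

Repeating such a fit on bins recorded at different ranges gives the range-dependent noise statistics used later for range spectra prediction.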

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 55 — #15

56


FIGURE 2.7 Experimental estimation of power noise distributions with no targets in the environment. (a) Experimental estimation of the noise distribution obtained from a 10 m distance. The distance has been chosen arbitrarily. (b) Experimental estimation of the noise distribution obtained from a 100 m distance. The distance has been chosen arbitrarily.


The distribution has a mean of µ = 9.612 dB and a variance of σ² = 28.239 dB². These ranges have been selected arbitrarily to show the noise distributions for shorter (

> τZ

(2.11)

The range bin in Figure 2.15 was obtained from an environment containing a concrete wall at approximately 18 m. The detected features are indicated along with the adaptive threshold. The moving average sets the threshold above which targets are considered detected. Due to the phase noise, the power returned from the target is widened along the range axis, resulting in several feature detections at approximately 18 m. In Figure 2.15a and b, CFAR "picks out" features which lie at the closest range. Features at longer range, however, will not be detected, as the noise power variance estimate made by the CFAR processor becomes incorrect due to the range bin distortion caused by the high pass filter.
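The cell-averaging idea behind Equations (2.10) and (2.11) can be sketched in a few lines of Python. This is an illustrative implementation, not the exact processor used in this chapter: the window length, the guard cells, and the fixed offset added to the locally averaged noise power are assumptions chosen for readability.

```python
import numpy as np

def ca_cfar(power_db, window=16, guard=2, offset_db=10.0):
    """Sketch of a cell-averaging CFAR detector for one RADAR range bin.
    power_db: received power (dB) per range cell.
    Returns boolean detections and the adaptive threshold per cell."""
    n = len(power_db)
    detections = np.zeros(n, dtype=bool)
    threshold = np.full(n, np.inf)
    for k in range(n):
        # leading and lagging windows, excluding the cell under test and guard cells
        lead = power_db[max(0, k - guard - window):max(0, k - guard)]
        lag = power_db[min(n, k + guard + 1):min(n, k + guard + 1 + window)]
        cells = np.concatenate((lead, lag))
        if cells.size == 0:
            continue
        noise_estimate = cells.mean()               # local noise-plus-clutter level
        threshold[k] = noise_estimate + offset_db   # adaptive threshold (dB domain)
        detections[k] = power_db[k] > threshold[k]
    return detections, threshold
```

Applied to a recorded 200 m range bin, the routine reproduces the qualitative behavior discussed above: detections cluster around strong reflectors, and the threshold rises wherever the local window contains target power.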

2.5.1 The Effect of the High Pass Filter on CFAR

In general, since the gain of the high pass filter is not linear (Figure 2.6a), the sum of the noisy received power values in Equation (2.10) is inaccurate at higher ranges, which ultimately results in the missed detection of targets at these range values. This is evident from Figure 2.15b, where CFAR detects a feature (corner reflector) at 10.25 m while it misses a feature (building) at 138 m. The second reflection is due to the beam-width of the RADAR, as part of the transmitted signal passes the corner reflector. It would therefore be useful to reduce the power–range bias before applying the CFAR method. To implement the CA-CFAR method correctly here, the average of two noise-only range bins can first be obtained,6 and the result subtracted from the range bin under consideration. This is carried out to obtain a range-independent high pass filter gain for the resultant bin. The CFAR method has been applied to the range bin of Figure 2.11b, the full 200 m bin of which is shown in Figure 2.16a, after subtracting the high pass filter bias of Figure 2.6a.

6 The noise-only range bins are obtained by pointing the RADAR toward open space.

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 65 — #25

66

Autonomous Mobile Robots

(a) 100 RADAR range bin Features detected 80 Adaptive threshold

Power (dB)

60

40

20

0

–20 0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.15 CFAR target detection. (a) The detection of a target (concrete wall approximately at 18 m) using a CA-CFAR detector. A series of targets around the 18 m mark are obtained due to the phase noise in the returned peak. (b) The missed detection of a feature (a building at 138 m) by a CA-CFAR detector. Due to the gain of the high pass filter, the noise estimation is inaccurate at higher ranges resulting in missed detection of features.

pass filter bias of Figure 2.6a. This figure shows the result from an environment, containing a corner reflector at 10.25 m and a building at approximately 138 m. By reducing the high pass filter effect (range independent gain for all the ranges), the CFAR detection technique finds features regardless of range as shown in Figure 2.16a. It is clearly necessary to compensate for any nonideal high pass filter characteristics, in the form of power–range bias, before CA-CFAR can be applied correctly. Problems still arise however, as CFAR can misclassify targets as noise (missed-detection) and noise as targets (false-alarm). Both of these are evident and labeled in the CFAR results of Figure 2.16a. 2.5.1.1 Missed detections with CFAR In a typical autonomous vehicle environment the clutter level changes. As the RADAR beam width increases with range, the returned range bin may have multiple peaks from features.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 66 — #26

Millimeter Wave RADAR Power-Range Spectra Interpretation

67

(b) 100 RADAR range bin Features detected

Adaptive threshold 80

Power (dB)

60

40

20

0

–20

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.15 Continued.

From Figure 2.14, it can be seen that if two or more targets are separated by less than the window width M, the local power sum in Equation (2.10) will become large, causing the adaptive threshold to increase, resulting in a missed detection [29]. This is also shown in Figure 2.16b where a return from an object, which lies within M range samples of the first feature is completely missed by the CFAR detector. 2.5.1.2 False alarms with CFAR Due to the filtering elements within the RADAR, the power noise in the RADAR range bins is correlated. Therefore, if the window size is too small, all of its power–range samples will be highly correlated. This means that the sum of the power values, calculated in Equation (2.10), will misrepresent the true sum which would be obtained from a set of uncorrelated values. This can ultimately result in the adaptive threshold being set too low, meaning that even noise only power values can exceed it. This gives false alarms. This can be overcome by increasing the window width. However, as explained above, a larger window width can result in the missed detection of features. The occurrence of false alarms is shown in Figure 2.16a and b.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 67 — #27

68

Autonomous Mobile Robots

(a) 50 RADAR range bin Features detected

Corner reflector s = 10 m2

40

Features False alarms

Missed detection

30

Power (dB)

Adaptive threshold 20

10

0

–10

–20

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.16 Target estimation with CFAR. (a) The graph shows target detection using a CFAR detector. The effect of the high pass filter is removed from the range bin. (b) The figure shows a missed detection of a feature (at 38 m) by the CA-CFAR processor. The first feature is at 22 m and the second feature is at 38 m approximately. The effect of the high pass filter is removed from the range bin.

In general, the CFAR method tends to work well with aircraft in the air having relatively large RCS, while surrounded by air (with extremely low RCS). At ground level, however, the RCS of objects is comparatively low and also there will be clutter (objects which cannot be reliably extracted). Further, as the CFAR method is a binary detection technique, the output is either a one or a zero (Equation [2.11]), that is, no probabilistic measures are given for target presence or absence.

2.6 TARGET PRESENCE PROBABILITY ESTIMATION FOR TRUE TARGET RANGE DETECTION For typical outdoor environments, the RCS of objects may be small. The smaller returned power from these objects can be buried in noise. For reducing the

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 68 — #28

Millimeter Wave RADAR Power-Range Spectra Interpretation

69

(b) 50 RADAR range bin Features detected

Feature 40 Missed detection

False alarms

Power (dB)

30

20 Adaptive threshold 10

0

–10

–20

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.16 Continued.

noise and extracting smaller signal returns along with the higher power returns, a method is now introduced which uses the probability of target presence [30] for feature detection [15]. This method is appealing compared to CFAR and constant threshold methods at ground level, as a threshold can be applied on the target presence probability. By setting a threshold value to be dependent on target presence probability and independent of the returned power in the signal, a higher probability threshold value is more useful for target detection. The proposed method does not require manual assistance. The merits of the proposed algorithm will be demonstrated in the results in Section 2.6.1. The detection problem described here can be stated formally as a binary hypothesis testing problem [31]. Feature detection can be achieved by estimating the noise power contained in the range spectra. The noise is estimated by averaging past spectral power values and using a smoothing parameter. This smoothing parameter is adjusted by the target presence probability in the range bins. The target presence probability is obtained by taking the ratio between the local power of range spectra containing noise and its minimum value. The noise power thus estimated is then subtracted from the range bins to give a reduced noise range spectra.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 69 — #29

70

Autonomous Mobile Robots

Let the power of the noisy range spectra be smoothed by a w-point window function b(i) whose length is 2w + 1 ˘ l) = P(k,

w 

˘ − i, l) b(i)P(k

(2.12)

i=−w

˘ l) is the kth power value of lth range spectra. where P(k, Smoothing is then performed by a first order recursive averaging technique: ˘ l) = αs P(k, ˘ l − 1) + (1 − αs )P(k, ˘ l) P(k,

(2.13)

where αs is a weighting parameter (0 ≤ αs ≤ 1). First the minimum and temporary values of the local power are initialized to Pmin (k, 0) = Ptmp (k, 0) = ˘ 0). Then a range bin-wise comparison is performed with the present bin l P(k, and the previous bin l − 1. ˘ l)} Pmin (k, l) = min{Pmin (k, l − 1), P(k,

(2.14)

˘ l)} Ptmp (k, l) = min{Ptmp (k, l − 1), P(k,

(2.15)

When a predefined number of range bins have been recorded at the same vehicle location, and the same sensor azimuth, the temporary variable, Ptmp is reinitialized as ˘ l)} Pmin (k, l) = min{Ptmp (k, l − 1), P(k,

(2.16)

˘ l) Ptmp (k, l) = P(k,

(2.17)

˘ l)/Pmin (k, l) be the Let the signal-to-noise power (SNP), PSNP (k, l) = P(k, ratio between the local noisy power value and its derived minimum. In the Neyman–Pearson test [32], the optimal decision (i.e., whether target is present or absent) is made by minimizing the probability of the type II error (see Appendix), subject to a maximum probability of type I error as follows. The test, based on the likelihood ratio, is p(PSNP |H1 ) H1 ≷δ p(PSNP |H0 ) H0

(2.18)

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 70 — #30

Millimeter Wave RADAR Power-Range Spectra Interpretation

71

where δ is a threshold,7 H0 and H1 designate hypothetical target absence and presence respectively. p(PSNP |H0 ) and p(PSNP |H1 ) are the conditional probability density functions. The decision rule of Equation (2.18) can be expressed as H1

PSNP (k, l) ≷ δ

(2.19)

H0

An indicator function, I(k, l) is defined where, I(k, l) = 1 for PSNP > δ and I(k, l) = 0 otherwise. The estimate of the conditional target presence probability,8 pˆ  (k, l) is pˆ  (k, l) = αp pˆ  (k, l − 1) + (1 − αp )I(k, l)

(2.20)

This target presence probability can be used as a target likelihood within mobile robot navigation formulations. αp is a smoothing parameter (0 ≤ αp ≤ 1). The value of αp is chosen in such a way that the probability of target presence in the previous range bin has very small correlation with the next range bin (in this case αp = 0.1). It is of interest to note that, as a consequence of the above analysis, the noise power, λˆ d (k, l) in kth range bin is given by ˘ l) λˆ d (k, l) = α˜ d (k, l)λˆ d (k, l − 1) + [(1 − α˜ d (k, l))] P(k,

(2.21)

where α˜ d (k, l) = αd + (1 − αd )p (k, l)

(2.22)

and αd is a smoothing parameter (0 ≤ αd ≤ 1). This can be used to obtain a noise reduced bin, Pˆ NR (k, l) using the method of power spectral subtraction [34]. In the basic spectral subtraction algorithm, the average noise power, λˆ d (k, l) is subtracted from the noisy range bin. To overcome the inaccuracies in the noise power estimate, and also the occasional occurrence of negative power estimates, the following method can be used [35] 

˘ l) − c × λˆ d (k, l) if P(k, ˘ l) > c × λˆ d (k, l) P(k, Pˆ NR (k, l) = otherwise d × λˆ d (k, l) 7 This threshold can be chosen based upon the received SNP, at which the signal can be trusted not to be noise. Note that this does not have to be changed for differing environments, or types of targets. 8 Conditioned on the indicator function I(k, l) [33].

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 71 — #31

72

Autonomous Mobile Robots

where c is an over-subtraction factor (c ≥ 1) and d is spectral floor parameter (0 < d < 1). The values of c and d are empirically determined for obtaining an optimal noise subtraction level at all ranges and set to be 4 and 0.001. Although a reduced noise range bin can be useful in other detection methods, the target presence probability estimate (Equation [2.20]), will be demonstrated further in the results. This method shows improved performance over CFAR methods as the threshold can be applied on the target presence probability instead of SNP. Setting an arbitrary threshold value on the probability of target presence (≥0.8) is sufficient for target detection. Based on the results, this is a robust method and requires no adjustments when used in different environments.

2.6.1 Target Presence Probability Results The results of the proposed target detection algorithm are shown in Figure 2.17 where a noisy RADAR range bin (Figure 2.17a), the corresponding estimated target presence probability (Figure 2.17b) from Equation (2.20) and the reduced noise range spectra (Figure 2.17c) have been plotted. In Figure 2.17a, the range bin contains three distinct peaks of differing power values, whereas the target presence probability plot shows the three peaks with a more uniform range width and similar probabilistic values. This result shows that although the return power values varies from different objects, the corresponding target presence probability values will be similar. The target presence probability-based feature detector is easier to interpret as shown in Figure 2.18 and Figure 2.19 where the target presence probability plot is shown along with the corresponding raw RADAR data. Figure 2.18a and Figure 2.19a show the raw RADAR data obtained in an indoor sports hall and outdoor sports field, respectively. The corresponding target presence probabilities are shown in Figure 2.18b and Figure 2.19b, respectively. Figure 2.18b shows the target presence probability plot of an indoor stadium. The four walls of the stadium are clearly obtained by the proposed algorithm. The other probability values at higher ranges arise due to the multipath effects in the RADAR range spectrum. Figure 2.19b is obtained from an outdoor field. The detected features are marked in the figure. The clutter shown in Figure 2.19b is obtained when the RADAR beam hits the ground due the unevenness of the field surface. The merit of the proposed algorithm is shown in Figure 2.20 where plots obtained using different power thresholds applied to raw RADAR range spectra are shown and compared with the threshold (0.8) applied to the probability plot. Figure 2.20a shows the comparison of 2D plots obtained by choosing a constant threshold of 25 dB applied to the raw RADAR data and the target presence probability plot. Figure 2.20b shows the comparison of plots obtained by constant

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 72 — #32

Millimeter Wave RADAR Power-Range Spectra Interpretation

73

(a) 50

40

Power (dB)

30

20

10

0

–10

–20

0

20

40

60

80

100

120

140

160

180

200

140

160

180

200

Range (m) (b) 1

Probability

0.8

0.6

0.4

0.2

0 0

20

40

60

80

100 120 Range (m)

FIGURE 2.17 Received range bin, noise reduced bin, and the probability of target presence vs. range plot. (a) Received noisy RADAR range bin. (b) Target presence probability of the corresponding range bin. (c) Noise reduced RADAR range bin.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 73 — #33

74

Autonomous Mobile Robots (c) 70 60

Power (dB)

50 40 30 20 10 0 –10

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.17

Continued.

threshold of 40 dB applied against the raw RADAR data and the target presence probability. Further results conducted show the target presence probability of objects will be the same and is found to be more than 0.8. Feature detection using the target presence probability is then carried out by keeping the threshold at 0.8. The results shown in Figures 2.18 to 2.20 clearly show that the target presence probability-based feature detection is easier to interpret and has lower false alarms compared to constant threshold-based feature detection in the typical indoor and outdoor environments tested [36].

2.6.2 Merits of the Proposed Algorithm over Other Feature Extraction Techniques The constant threshold applied to raw RADAR data requires manual intervention for adjusting the threshold depending on the environment. In CA-CFAR, the averaging of power values in the cells provides an automatic, local estimate of the noise level. This locally estimated noise power is used to define the adaptive threshold (see e.g., Figure 2.16a). The test window compares the threshold with the power of the signal and classifies the cell content as signal or noise.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 74 — #34

Millimeter Wave RADAR Power-Range Spectra Interpretation

75

Power (dB)

(a)

50 0 –30

30 20

–20 10

–10

Di

nc

e(

m)

)

0

0

sta

–20

20 30

(m

n

ta

–10

10

ce

is

D

–30

FIGURE 2.18 Raw RADAR data and corresponding target presence probability plots obtained from an indoor sports hall. (a) Power vs. range of a 2D RADAR scan from an indoor environment. (b) Target presence probability vs. range of a 2D RADAR scan in indoor environment. The probability of the targets detected (i.e., walls) are shown in the figure.

When the signal and noise distributions are distinctly separated in range, CFAR works well. But when the signal and noise distributions lie close together, which is often the case at ground level (as shown in Figure 2.21), the method misclassifies noise as signal and vice versa. This is the reason for the poor performance of the CFAR technique with noisy RADAR data. Figure 2.22 shows features obtained by target presence probability and the CA-CFAR technique. The dots are the features obtained by target presence probability while the “+” signs are the features obtained from the CFAR-based target detector. From the figures it can be seen that the target presence-based feature detection has a superior performance to CA-CFAR detector in the environment tested. Figure 2.23 shows the difference between the ground truth and the range observation obtained from the target presence probability. The ground truth has been obtained by manually measuring the distance of the walls from the RADAR location. The peaks in Figure 2.23 are to some extent due to inaccurate ground truth estimates, but mainly due to multi-path reflections. The proposed algorithm for feature extraction appears to outperform the CFAR method because the CFAR method finds the noise locally, while the target presence probability-based feature detection algorithm estimates

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 75 — #35

76

Autonomous Mobile Robots

Probability

(b)

1 0.5 0 –30

30 20

–20 10 (m ce an

–10

10

Di

(m

)

20

–20 30

FIGURE 2.18

)

0

0 ce

st

–10 Di st an

–30

Continued.

the noise power by considering more than one range bin (Equation [2.16]). The target presence probability-based feature extraction, unlike the CFAR detector, is not a binary detection process as is shown in Figure 2.17c. This method of feature detection is useful in data fusion as the feature representation is probabilistic.

2.7 MULTIPLE LINE-OF-SIGHT TARGETS — RADAR PENETRATION At 77 GHz, millimeter waves can penetrate certain nonmetallic objects, which sometimes explains the multiple line-of-sight objects within a range bin.9 This limited penetration property can be exploited in mobile robot navigation in outdoor unstructured environments, and is explored further here. For validating the target penetration capability of the RADAR, tests were carried out with two different objects. In the section of the RADAR scan, shown in Figure 2.24a, a RADAR reflector of RCS 177 m2 and a sheet of 9 Although it should be noted that these can be the results of specular and multiple path reflections also.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 76 — #36

Millimeter Wave RADAR Power-Range Spectra Interpretation

77

(a)

30

m)

10

50 0 –30

nc

e(

0 –20 –10

0 Dist anc e (m

Di

–10

sta

Power (dB)

20

)

–20

10 20

30 –30

FIGURE 2.19 Raw RADAR data and corresponding target presence probability obtained from an outdoor environment. (a) Power vs. range of a 2D RADAR scan from an outdoor environment. (b) Target presence probability vs. range of a 2D RADAR scan in outdoor environment. The probability of the targets detected (i.e., RADAR reflectors, wall, and tree) are shown in the figure.

wood of thickness 0.8 cm were placed at ranges of 14 and 8.5 m respectively, to visually occlude the reflector from the RADAR. This ensured that no part of the RADAR reflector fell directly within the beam width of the RADAR, so that if it was detected, it must be due to the radio waves penetrating the wood. Figure 2.24a shows the detection of the two features down-range even though, visually, one occludes the other. The experiment was also repeated for a perspex sheet of thickness 0.5 cm (Figure 2.24b). The results of object penetration by RADAR waves motivates further development of power spectra prediction with multiple line-of-sight features which is one of the contributions of this chapter. For feature-based SLAM, it is necessary to predict the target/feature locations reliably, given a prediction of the vehicle/RADAR location. As RADAR can penetrate certain nonmetallic objects it can give multiple range information. A method for predicting the power–range spectra (or range bins) using the RADAR range equation and knowledge of various noise distributions in the RADAR has already been explained in this chapter. For SLAM, the measurements taken from the RADAR used here are the range, R, bearing, θ, and the received power, PR , from the target at range R.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 77 — #37

78

Autonomous Mobile Robots

(b)

Trees Wall RADAR reflectors

30

(m)

10

1 0.5 0 –30

–10

–10 Clutter

Dis

0 ce ( m) 10

tan

tan

ce

0 –20

Dis

Probability

20

–20 20 30

FIGURE 2.19

–30

Continued.

One of the contributions of this chapter is to predict range bins from new robot positions given an estimate of the vehicle and target states. A new augmented state vector is introduced here which, along with the usual feature coordinates x and y, contains that feature’s normalized RCS, ϒR , and absorption RCS, ϒa , and the RADAR losses, L. To illustrate this, Figure 2.25 shows a 360◦ RADAR scan obtained from an outdoor field. Objects in the environment consist of lamp-posts, trees, fences, and concrete steps. The RADAR penetrates some of the nonmetallic objects,10 and can observe multiple targets down line. This is shown in Figure 2.26, which is the received power vs. range for the particular bearing of 231◦ marked in Figure 2.25. Multiple targets down range can occur due to either the beam width of the transmitted wave intersecting two or more objects at differing ranges or due to penetration of the waves through certain objects. The RADAR used here 10 At 77 GHz the attenuation through paper, fiberglass, plastic, wood, glass, foliage, etc., are

relatively low while attenuation through brick and concrete is high [37].

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 78 — #38

Millimeter Wave RADAR Power-Range Spectra Interpretation (a)

79

Indoor stadium Const. threshold on raw data Threshold on probability data

20 15

Y distance (m)

10 5 0 –5

–10 –15 –20 –25 –30

–20

–10

0 X distance (m)

10

20

30

FIGURE 2.20 Target presence probability vs. range spectra and the corresponding power vs. range taken from a 2D RADAR scan in an indoor environment. The figures shows a comparison of the proposed feature detection algorithm with the constant threshold method. (a) A constant power threshold of 25 dB is chosen and compared with the threshold (0.8) applied on probability-range spectra. (b) A constant power threshold of 40 dB is chosen and compared with the threshold applied to the probability–range spectra.

is a pencil beam device, with a beam width of 1.8◦ . This means that multiple returns within the range spectra occur mostly due to penetration. Therefore a model for predicting entire range spectra, based on target penetration is now given.

2.8 RADAR-BASED AUGMENTED STATE VECTOR The state vector consists of the normalized RADAR cross section, ϒR , absorption cross section, ϒa , and the RADAR loss constants, L, along with the vehicle state and feature locations. The variables, ϒR , ϒa , and L are assumed unique to a particular feature/RADAR. Hence, this SLAM formulation makes the (very) simplified assumption that all features are stationary and that the changes in the normalized values of RCS and absorption cross sections of features when sensed from different angles, can be modeled using Gaussian random variables vϒi .

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 79 — #39

80

Autonomous Mobile Robots (b)

Indoor stadium Const. threshold on raw data Threshold on probability data

60

Y distance (m)

40

20

0

–20

–40

–80

–60

–40

–20

0

20

40

60

X distance (m)

FIGURE 2.20

Continued.

This is a reasonable assumption only for small circular cross sectioned objects such as trees, lamp posts, and pillars, however, as will be shown the method produces good results in semi-structured environments even for the targets which do not conform to these assumptions. The SLAM formulation here can handle multiple line-of-sight targets.

2.8.1 Process Model A simple vehicle predictive state model is assumed with stationary features surrounding it. The vehicle state, xv (k) is given by xv (k) = [x(k), y(k), θR (k)]T where x(k), y(k), and θR (k) are the local position and orientation of the vehicle at time k. The vehicle state, xv (k) is propagated to time (k + 1) through a simple steering process model [38]. The model, with control inputs, u(k) predicts the vehicle state at time (k +1) together with the uncertainty in vehicle location represented in the covariance matrix P(k + 1) [39]. xv (k + 1) = f(xv (k), u(k)) + v(k)

(2.23)

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 80 — #40

Millimeter Wave RADAR Power-Range Spectra Interpretation

81

PDFs for noise and signal + noise

1400

Signal + noise distribution 1200

Number

1000 800 Noise distribution

600 400 200 0 –15

–10

–5

0

5

10

15

20

25

30

Power (dB)

FIGURE 2.21 Experimental estimation of signal and noise distributions. In the CFAR method, the local noise-plus-clutter power (Equation [2.10]) in the window is used to set the detection threshold. The method compares the signal in the test window and the detection threshold. The method fails when there are multiple detections within a range-bin and in cluttered environments.

u(k) = [v(k), α(k)]. v(k) is the velocity of the vehicle at time k and α(k) is the steering angle. In full, the predicted state at time, (k + 1) becomes ⎡





⎤ x(k) ⎢ ⎥ ⎢ ⎥ ⎢y(k)⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ α(k) ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ xp1 (k + 1|k) ⎥ ⎢ xp1 (k|k) ⎥ ⎢ 0p1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ y (k + 1|k) ⎥ ⎢ y (k|k) ⎥ ⎢ 0p ⎥ ⎢ p1 ⎥ ⎢ p1 ⎥ ⎢ 1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ϒR1 (k + 1|k) ⎥ ⎢ ϒR1 (k|k) ⎥ ⎢ 0p1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ϒa1 (k + 1|k) ⎥ ⎢ ϒa1 (k|k) ⎥ ⎢ 0p1 ⎥ ⎢ ⎥=⎢ ⎥+⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ . ⎥ .. .. . ⎢ ⎥ ⎥ ⎥ ⎢ ⎢ . . ⎢ ⎥ ⎢ ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ xpN (k + 1|k) ⎥ ⎢ xpN (k|k) ⎥ ⎢ 0pN ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ypN (k + 1|k) ⎥ ⎢ ypN (k|k) ⎥ ⎢ 0pN ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ϒ (k + 1|k)⎥ ⎢ϒ (k|k)⎥ ⎢ 0 ⎥ ⎢ RN ⎥ ⎢ RN ⎥ ⎢ pN ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎣ ϒaN (k + 1|k) ⎦ ⎣ ϒaN (k|k) ⎦ ⎣ 0pN ⎦ 0 L(k + 1|k) L(k|k) xˆ (k + 1|k) yˆ (k + 1|k) θˆR (k + 1|k)



xˆ (k|k) yˆ (k|k) θˆR (k|k)



(2.24)

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 81 — #41

82

Autonomous Mobile Robots Ground truth

90

40 60

120 30 20

150

30

10 180

0

210

330

240

300 270

Target presence probability CA-CFAR

FIGURE 2.22 Comparison of CA-CFAR detector-based feature extraction and feature detection based on target presence probability.

25 20

Error (m)

15 10 5 0 –5 –10

0

50

100

150

200

250

300

350

Azimuth (degrees)

FIGURE 2.23 The difference between the ground truth range values and the range estimates from the target presence probability.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 82 — #42

Millimeter Wave RADAR Power-Range Spectra Interpretation

83

(a) 40 20

Range (m)

30 15

20 Mud bank 10

10

5

RADAR reflector

Wooden plane

0

–10 –4

–3

–2

–1

0

1

2

3

4

Range (m)

(b) 25 40 20

Range (m)

30 Mud bank

15

RADAR reflector

10

20

10

Perspex sheet

0

5 –10 –4

–3

–2

–1

0

1

2

3

4

Range (m)

FIGURE 2.24 Initial test results carried out to show the RADAR wave penetration with different objects. (a) A scan of a RADAR reflector of RCS 177 m2 , 14 m from the RADAR, and a wooden sheet of thickness 0.8 cm visually occluding the reflector from the RADAR. The wooden sheet is 8.5 m from the RADAR. (b) A RADAR reflector of RCS 177 m2 , 14 m from the RADAR, and a perspex sheet of thickness 0.5 cm, 8.5 m from the RADAR. Again, the reflector is visually occluded from the RADAR.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 83 — #43

84

Autonomous Mobile Robots 0° scan start

200

50

150

40

100

Range (m)

50 0

30

Concrete step Trees

–50

20 Nonmetallic pole 10

Fence –100

Fence 0

Range bin at –150 231°

–10 –200 –200 –150 –100

–50

0

50

100

150

200

Range (m)

FIGURE 2.25 A 360◦ RADAR range spectra obtained from an outdoor field, containing trees, nonmetallic poles, fences, and concrete walls. The received power value is represented in color space, as shown by the right hand color bar, with power units in decibel.

where x(k) = v(k)t cos(θˆR (k|k) + α(k)), y(k) = v(k)t sin(θˆR (k|k) + α(k)) and t is the sampling time. The augmented state vector is then x(k) = [xv , {F1 , ϒR1 , ϒa1 }, . . . , {Fi , ϒRi , ϒai }, . . . , {FN , ϒRN , ϒaN }, L]T where xv is the vehicle’s pose, Fi = [xpi , ypi ]T is the ith feature’s location, where 1 ≤ i ≤ N. ϒRi is the normalized RCS of the ith feature, ϒai is its normalized absorption cross section, L represents the RADAR loss, and v(k) = [vv (k), 0p1 , 0p1 , vϒR1 , vϒa1 , . . . , 0pi , 0pi , vϒRi , vϒai , . . . , 0pN , 0pN , vϒRN , vϒaN , 0]T .

2.8.2 Observation (Measurement) Model Another contribution of this chapter is the formulation of the observation model. The RADAR observation is used to estimate the vehicle’s state once the vehicle’s pose is predicted. During filter update, the prediction and estimation are fused. For each of the features in the map, the predicted range, Rˆi (k + 1|k), the RADAR bearing angle, βˆi (k + 1|k), and the power, Pˆi (k + 1|k) are to be

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 84 — #44

Millimeter Wave RADAR Power-Range Spectra Interpretation

85

50 (A) Nonmetallic pole

(B) Fence

40

(C) Lamp post

Power (dB)

30

20

10

0

–10

–20

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.26 A single RADAR range bin, recorded at the bearing angle 231◦ shown in Figure 2.25, obtained from the outdoor field with multiple features down-range.

predicted from the predicted state in Equation (2.24). The predicted range and bearing observations are similar to the ordinary SLAM formulation, that is,  Rˆ i (k + 1|k) = [ˆxpi (k +1|k)− xˆ R (k +1|k)]2 + [ˆypi (k + 1|k)− yˆ R (k + 1|k)]2 (2.25) βˆi (k + 1|k) = θˆR (k + 1|k) − tan−1



yˆpi (k + 1|k) − yˆ R (k + 1|k) xˆpi (k + 1|k) − xˆ R (k + 1|k)

 (2.26)

The predicted power for all targets, such as those in Figure 2.26, is the fundamental difference offered in this chapter. 2.8.2.1 Predicted power observation formulation The assumptions made in the predicted power model are as follows: • The environmental features of interest are assumed to have small circular cross-sections, so that the estimated normalized RCS

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 85 — #45

86

Autonomous Mobile Robots

sections and absorption coefficients are approximately the same in all directions with respect to that feature. • The measured returned power should be independent of range (due to the built-in range compensation filter). This filter must first be removed or post-filtered to remove its effect, to produce range dependent power returns from all objects [15]. • The beam-width of the RADAR wave does not increase considerably with range. A target is assumed to affect the incident electromagnetic radiation in three possible ways: 1. A portion of the incident energy ϒR , 0 ≤ ϒR ≤ 1, is reflected and scattered 2. A portion of the incident energy ϒa , 0 ≤ ϒa ≤ 1, is absorbed by the target 3. A portion of the incident energy 1 − (ϒR + ϒa ) is further transmitted through the target ϒR is thus referred to as the “normalized” RCS section. Figure 2.27 shows a MMW RADAR in an environment with i-features down-range at a particular bearing. The following terms are used in formulating the predicted power observation: PINCi = Power incident on the ith feature PREFi = Power reflected from the ith feature PTRANi = Power transmitted through the ith feature PINCi1 = Power incident on the first feature which is reflected from the ith feature • PREFi1 = Power reflected back toward the ith feature from the first feature. This component will not reach the RADAR receiver directly and is not considered in this formulation • PTRANi1 = Power transmitted through the first feature which is the reflection from the ith feature

• • • •

The power incident at the first feature is given by PINC1 =

Pt GAI 4π R1 2

(2.27)

where Pt is the power transmitted by the RADAR, G is the antenna gain, and R1 is the distance between RADAR and the first feature and AI is the area of the object illuminated by the RADAR wave. Let ϒR1 be the normalized

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 86 — #46

Millimeter Wave RADAR Power-Range Spectra Interpretation MMW RADAR

1

2

i

PINC2

PINC1

87

PTRANi

PTRAN1

PTRAN2 PREF2

PREF1 PINC21 PTRAN21 PREF21

PINC32

PTRAN32 PREF32 PINCi1

PTRANi1

PREFi1

FIGURE 2.27 Power definitions for reflections, absorptions, and transmissions for i multiple line-of-sight features.

RCS and ϒa1 be the normalized absorption cross section of the first feature. The power received by the RADAR receiver from the first feature is given by  Pˆ REF1 = PREF1 Ae /(4π R12 ) Pt Gϒˆ R1 AI  Pˆ REF1 = Ae (4π )2 Rˆ 14

(2.28)

where Ae is the antenna aperture. It is shown in the RADAR literature that Ae = Gλ2 /4π [21]. Substituting for Ae in Equation (2.28), the power return from the first feature is Pt G2 λ2 ϒˆ R1 AI  Pˆ REF1 = (4π )3 Rˆ 14

(2.29)

The power PTRAN1 that passes through the first feature is given by PTRAN1 =

Pt GAI (1 − [ϒˆ R1 + ϒˆ a1 ]) (4π )Rˆ 2

(2.30)

1

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 87 — #47

88

Autonomous Mobile Robots

The power reflected from the second feature, PREF2 is given by PREF2 =

Pt GAI 2 ϒˆ R2 (1 − [ϒˆ R1 + ϒˆ a1 ]) 2

(4π )2 Rˆ 12 (Rˆ 2 − Rˆ 1 )

(2.31)

The power then transmitted back to the first feature from the second feature is given by PINC21 =

Pt GAI 3 ϒˆ R2 (1 − [ϒˆ R1 + ϒˆ a1 ]) 4 (4π )3 Rˆ 12 (Rˆ 2 − Rˆ 1 )

(2.32)

The power, PINC21 then passes through feature 1 and is given by PTRAN21 = PINC21 (1 − [ϒˆ R1 + ϒˆ a1 ])

(2.33)

 = PTRAN21 Ae / The power returned from the second feature is then Pˆ TRAN21 2 (4π Rˆ 1 )

Pt GAI 3 Ae ϒˆ R2 (1 − [ϒˆ R1 + ϒˆ a1 ])2  Pˆ TRAN21 = (4π )4 Rˆ 14 (Rˆ 2 − Rˆ 1 )4

(2.34)

In general, the predicted power from the ith feature can be written as KAI  Pˆ TRANi1 (k + 1|k) =

i−1 ×

ϒˆ Ri (k + 1|k) (4π )2i

(2i−1)

2 ˆ ˆ j=0 [1 −(ϒRj (k +1|k)+ ϒaj (k + 1|k))] i−1 4 ˆ ˆ j=0 (Rj+1 (k + 1|k) − Rj (k + 1|k))

(2.35)

where K = Pt GAe , Ae = Gλ2 /4π , ϒˆ R0 = ϒˆ a0 = Rˆ 0 = 0 and, for the ith feature, Rˆ i is related to the augmented state by Equation (2.25). Equation (2.25), Equation (2.26), and Equation (2.35) between them comprise the observation. In order to generate realistic predictions of the range bins, knowledge of the power and range noise distributions is necessary. This has been studied extensively in previous work, and can be found in Reference 15. The range and power noise are experimentally obtained [15]. The noise in range is the phase noise, which is obtained by observing the range bins

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 88 — #48

Millimeter Wave RADAR Power-Range Spectra Interpretation

89

containing reflections from objects with different RCSs at different locations. The noise statistics in power is obtained during both target presence and absence. The angular standard deviation is assumed to be 1◦ as the RADAR wave is a pencil beam. The observation model is then given by zi (k + 1) = [Ri (k + 1), βi (k + 1), Pi (k + 1)]T + wi (k + 1) = h(x(k + 1)) + wi (k + 1)

(2.36)

where zi (k + 1) is the observation, and wi (k + 1) is the additive observation noise given by wi (k + 1) = [vR (k + 1)vβ (k + 1)vp (k + 1)]T

(2.37)

and h is the nonlinear observation function defined by Equation (2.25), Equation (2.26), and Equation (2.35).

2.9 MULTI-TARGET RANGE BIN PREDICTION — RESULTS To validate the formulation for realistically predicting multiple line-of-site target range bins, tests using a RADAR unit from Navtech Electronics were carried out. Initially the vehicle was positioned at pose xv (k) as demonstrated in Figure 2.28. The full 360◦ RADAR scan obtained from this vehicle location is shown in Figure 2.25. Range bins obtained from the initial vehicle location at two different bearing angles are shown in Figure 2.26 and Figure 2.29a. Figure 2.26 is obtained at azimuth 231◦ and is indicated by the black line in Figure 2.25. Features in the environment are marked in the figures. The next predicted vehicle location is calculated using the vehicle model and system inputs (Equation [2.24]). This corresponds to the new predicted vehicle pose xˆ v (k + 1 | k) in Figure 2.28. The range spectra in all directions are then predicted from the new predicted vehicle location. For example, in the range bin ˆ + 1 | k) in Figure 2.28, the predicted values for the predicted at angle β(k range, bearing and received power of features A and D are calculated according to Equation (2.25), Equation (2.26), and Equation (2.35). A single range prediction obtained from the predicted vehicle location xv (k + 1 | k) is shown in Figure 2.29b having two features down-range. Equation (2.35) can be used to predict the received power as long as the power bias as a function of range incorporated into the RADAR electronics is taken into account. This simply requires knowledge of the RADAR’s high pass filter circuitry which in an FMCW RADAR compensates for the fourth power of range loss, expected according to the simple RADAR Equation [15, 21].

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 89 — #49

90

Autonomous Mobile Robots y Predicted range

xv(k) b(k)

R2(k +1|k) Predicted range bin D xv(k +1|k)

E

A

F Range bin 2 at 233°

b(k +1|k)

Predicted range B

R1(k +1|k)

C Range bin 1 at 231° x

FIGURE 2.28 Vehicle motion and the features observed/predicted. observed/predicted down-range at different bearings are marked.

(a)

50

Features

E

D

F

40

Power (dB)

30 20 10 0 –10 –20

0

20

40

60

80

100 120 Range (m)

140

160

180

200

FIGURE 2.29 Observed and one step ahead predicted range spectra. (a) RADAR range spectra (233◦ azimuth) obtained at the starting robot location. Two features observed down-range are marked. (b) Predicted RADAR range spectra (at 234◦ bearing) obtained from the predicted vehicle location.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 90 — #50

Millimeter Wave RADAR Power-Range Spectra Interpretation

91

(b) 50 A

D

40

Power (dB)

30

20

10

0

–10

–20

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.29 Continued.

The actual observation is obtained from the next vehicle location and is shown in Figure 2.30a which shows power peaks in close proximity to those predicted in Figure 2.29b. The predicted and actual received powers from the target at A are in close agreement in both figures whereas, the predicted value for the received power (30 dB) of the target at 58 m (feature D in Figure 2.29b) is slightly less than the actual received power (38 dB) in Figure 2.30a. The discrepancy for feature D can be due to violation of some of the assumptions made in the formulation — in particular that the normalized reflection and absorption cross-sections remain constant, independent of the RADAR to target angle of incidence. Figure 2.30b shows the results of a chi-squared test to determine any bias or inconsistency in the power–range bin predictions. The difference between the measured and the predicted range bins is plotted together with 99% confidence interval. The value of 99% bound, = ±16.35 dB, has been found experimentally by recording several noisy power–range bins in target absence (RADAR pointing toward open space) [15]. Close analysis of Figure 2.30b shows that the error has a negative bias. This is due to the approximate assumption of the high pass filter gain. For the RADAR used here, the gain of the high pass filter

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 91 — #51

92

Autonomous Mobile Robots

(a) 50

40

A

D

Power (dB)

30

20

10

0

–10

–20

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.30 An actual range bin and the error between the predicted and observed spectra. (a) Actual RADAR range spectra (at 234◦ bearing) obtained at the next robot location. Features observed down-range are marked. (b) The difference between predicted and measured range bins containing two features down-range is shown. This error is compared against 3σ noise power bounds.

used in the predicted power–range bins was set to 60 dB/decade.11 The result shows that this approximation for the high pass filter gain is acceptable, as a large portion of the error plot lies within the 3σ limits. This formulation and analysis shows the initial stages necessary in implementing an augmented state, feature rich SLAM formulation with MMW RADAR. Future work will address the ease with which data association can be carried out using the multidimensional feature state estimates, and a full SLAM implementation in outdoor environments, will be tested. 11 Assuming the RADAR range equation to be correct, a high pass filter with a gain of 40 dB/decade

should produce a flat power response for particular targets at various ranges. Figure 2.26 shows a power–range spectrum recorded from the RADAR. It can be seen from Figure 2.26, that the power range response is not flat. For this particular RADAR it makes sense to either determine the bias in the power–range spectra or, model the high pass filter as having a gain of 60 dB/decade, which would better approximate the power–range relationship actually produced in Figure 2.26.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c002” — 2006/3/31 — 17:29 — page 92 — #52

Millimeter Wave RADAR Power-Range Spectra Interpretation

93

(b) 30

20

Error (dB)

10 3s bound 0

0

20

40

60

80

100

120

140

160

180

200

Range (m)

FIGURE 2.30 Continued.

2.10 CONCLUSIONS This chapter describes a new approach in predicting RADAR range bins which is essential for SLAM with MMW RADAR. A noise analysis during signal absence and presence was carried out. This is to understand the MMW RADAR range spectrum and to predict it accurately as it is necessary to know the power and range noise distributions in the RADAR power–range spectra. RADAR range bins are then simulated using the RADAR range equation and the noise statistics, which are then compared with real results in controlled environments. In this chapter, it is demonstrated that it is possible to provide realistic predicted RADAR power/range spectra, for multiple targets down-range. Feature detection based on target presence probability was also introduced. Results are shown which compare probability-based feature detection with other feature extraction techniques such as constant threshold on raw data and CFAR techniques. A difficult compromise in the CA-CFAR method is the choice of the window size which results in a play-off between false alarms and missed detections. Variants of the CFAR method exist, which can be tuned to overcome the problem of missed detections, but the problem of false alarms remains inherent to these methods. The target presence probability algorithm presented here does not rely on adaptive threshold techniques, but estimates the probability of target presence


based on local signal-to-noise power estimates, found from several range bins. The results show that the algorithm can detect features in the typically cluttered outdoor environments tested, with a higher success rate than the constant threshold and CFAR feature detection techniques.

A SLAM formulation using an augmented state vector, which includes the normalized RCS and absorption cross-sections of features as well as the usual feature Cartesian coordinates, was introduced. This is intended to aid the data association process, so that features need not be associated based on their Cartesian coordinates alone; account can also be taken of their estimated normalized reflection and absorption cross-sections.

The final contribution is a predictive model of the form and magnitudes of the power–range spectra from differing vehicle locations, for multiple line-of-sight targets. This forms a predicted power–range observation, based on estimates of the augmented SLAM state. The formulation of power returns from multiple objects down-range is explained, and predicted RADAR range spectra are compared with real spectra recorded outdoors. This work is a step toward building reliable maps and localizing a vehicle for use in mobile robot navigation. Further methods of including the target presence probability of feature estimates in SLAM are being investigated.

APPENDIX

The binary hypothesis testing problem is a special case of decision problems. The decision space consists of target absence and target presence, represented by the decisions δ0 and δ1, respectively. There is a hypothesis corresponding to each decision: H0 is called the null hypothesis (the hypothesis accepted by choosing decision δ0) and H1 is called the alternative hypothesis. The binary hypothesis problem has four possible outcomes:

• H0 was true, δ0 is chosen: correct decision.
• H1 was true, δ1 is chosen: correct decision.
• H0 was true, δ1 is chosen: false alarm, also known as a type I error.
• H1 was true, δ0 is chosen: missed detection, also known as a type II error.

ACKNOWLEDGMENTS

This work was funded under the first author's AcRF Grant, RG 10/01, Singapore. We gratefully acknowledge John Mullane for providing some of the outdoor RADAR scans and Javier Ibanez-Guzman, SIMTech Institute of Manufacturing Technology, Singapore, for use of the utility vehicle. We further acknowledge the valuable advice from Graham Brooker (Australian Centre for Field Robotics) and Steve Clark (Navtech Electronics, UK).


REFERENCES

1. J. J. Leonard and Hugh F. Durrant-Whyte. Dynamic map building for an autonomous mobile robot. In IEEE International Workshop on Intelligent Robots and Systems, pp. 89–96, Ibaraki, Japan, July 1990.
2. Fan Tang, Martin Adams, Javier Ibanez-Guzman, and Sardha Wijesoma. Pose invariant, robust feature extraction from range data with a modified scale space approach. In IEEE International Conference on Robotics and Automation (ICRA), New Orleans, USA, April 2004.
3. Alex Foessel, Sachin Chheda, and Dimitrios Apostolopoulos. Short-range millimeter-wave radar perception in a polar environment. In Proceedings of the Field and Service Robotics Conference, pp. 133–138, Pittsburgh, PA, USA, August 1999.
4. Steve Clark and Hugh F. Durrant-Whyte. Autonomous land vehicle navigation using millimeter wave radar. In International Conference on Robotics and Automation (ICRA), pp. 3697–3702, Leuven, Belgium, 1998.
5. Graham Brooker, Mark Bishop, and Steve Scheding. Millimetre waves for robotics. In Australian Conference for Robotics and Automation, Sydney, Australia, November 2001.
6. J. Leonard and Hugh F. Durrant-Whyte. Directed Sonar Sensing for Mobile Robot Navigation. Kluwer Academic Publishers, Dordrecht, 1992.
7. Somajyoti Majumder. Sensor Fusion and Feature Based Navigation for Subsea Robots. The University of Sydney, August 2001.
8. Scott Boehmke, John Bares, Edward Mutschler, and Keith Lay. A high speed 3D radar scanner for automation. In Proceedings of ICRA '98, Vol. 4, pp. 2777–2782, May 1998.
9. Steve Clark and G. Dissanayake. Simultaneous localisation and map building using millimetre wave radar to extract natural features. In IEEE International Conference on Robotics and Automation (ICRA), pp. 1316–1321, Detroit, Michigan, May 1999.
10. Steve Clark. Autonomous Land Vehicle Navigation Using Millimetre Wave Radar. PhD thesis, Australian Centre for Field Robotics, University of Sydney, 1999.
11. Alex Foessel. Scene Modeling from Motion-Free Radar Sensing. PhD thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, January 2002.
12. Alex Foessel, John Bares, and William Red L. Whittaker. Three-dimensional map building with MMW radar. In Proceedings of the 3rd International Conference on Field and Service Robotics, Helsinki, Finland, June 2001.
13. Hans Moravec and A. E. Elfes. High resolution maps from wide angle sonar. In Proceedings of the 1985 IEEE International Conference on Robotics and Automation, pp. 116–121, March 1985.
14. Sebastian Thrun. Learning occupancy grids with forward models. In Proceedings of the Conference on Intelligent Robots and Systems, Hawaii, 2001.
15. Ebi Jose and Martin D. Adams. Millimetre wave radar spectra simulation and interpretation for outdoor SLAM. In International Conference on Robotics and Automation (ICRA), New Orleans, USA, April 2004.


16. Steve Clark and Hugh F. Durrant-Whyte. The design of a high performance MMW radar system for autonomous land vehicle navigation. In FSR'97 Proceedings of the International Conference on Field and Service Robotics, A. Zelinsky, Ed., Australian Robotic Association Inc., Sydney, NSW, Australia, pp. 292–299, 1997.
17. S. Scheding, G. Brooker, M. Bishop, R. Hennessy, and A. Maclean. Terrain imaging millimetre wave radar. In International Conference on Control, Automation, Robotics and Vision, Singapore, November 2002.
18. Dirk Langer. An Integrated MMW Radar System for Outdoor Navigation. PhD thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, January 1997.
19. M. E. Adamski, K. S. Kulpa, M. Nalecz, and A. Wojtkiewicz. Phase noise in two-dimensional spectrum of video signal in FMCW homodyne radar. In 13th International Conference on Microwaves, Radar and Wireless Communications (MIKON-2000), Vol. 2, pp. 645–648, 2000.
20. K. Nakamura, T. Hara, M. Yoshida, T. Miyahara, and H. Ito. Optical frequency domain ranging by a frequency-shifted feedback laser. IEEE Journal of Quantum Electronics, 36, 305–316, 2000.
21. M. I. Skolnik. Introduction to Radar Systems. McGraw Hill, New York, 1982.
22. F. R. Connor. Noise. Introductory Topics in Electronics and Telecommunications, 2nd ed. Edward Arnold, 1982.
23. Douglas C. Montgomery and George C. Runger. Applied Statistics and Probability for Engineers, 2nd ed. John Wiley and Sons, Inc., 1999.
24. N. C. Currie and C. E. Brown. Principles and Applications of MMW Radar. Artech House, Dedham, MA, 1987.
25. P. P. Gandhi and S. A. Kassam. Analysis of CFAR processors in nonhomogeneous background. IEEE Transactions on AES, 4, 427–445, 1988.
26. G. Davidson, H. D. Griffiths, and S. Ablett. Analysis of high-resolution land clutter. IEE Proceedings - Visual Image Signal Processing, 151, 86–91, 2004.
27. P. P. Gandhi and S. A. Kassam. Optimality of the cell averaging CFAR detector. IEEE Transactions on Information Theory, 40, 1226–1228, 1994.
28. S. Watts. The performance of cell-averaging CFAR systems in sea clutter. In The Record of the IEEE 2000 International Radar Conference, Alexandria, VA, USA, May 2000.
29. H. Rohling and R. Mende. OS CFAR performance in a 77 GHz radar sensor for car application. In CIE International Conference of Radar, pp. 109–114, October 1996.
30. Israel Cohen and Baruch Berdugo. Noise estimation by minima controlled recursive averaging for robust speech enhancement. IEEE Signal Processing Letters, 9, 12–15, 2002.
31. H. L. Van Trees. Detection, Estimation and Modulation Theory, Part I. Wiley, New York, 1968.
32. T. Kirubarajan and Y. Bar-Shalom. Multisensor-Multitarget Statistics in Data Fusion Handbook. CRC Press, Boca Raton, FL, 2001.
33. Robert M. Gray and Lee D. Davisson. Introduction to Statistical Signal Processing. Cambridge University Press, London, New York, December 2004.


34. S. F. Boll. Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-27, 113–120, 1979.
35. M. Berouti, R. Schwartz, and J. Makhoul. Enhancement of speech corrupted by acoustic noise. In Proceedings of the IEEE ICASSP'79, pp. 208–211, Washington, USA, 1979.
36. Ebi Jose and Martin D. Adams. Relative radar cross section based feature identification with millimetre wave radar for outdoor SLAM. In International Conference on Intelligent Robots and Systems (IROS), Sendai, Japan, September 2004.
37. D. D. Ferris, Jr. and N. C. Currie. Microwave and millimeter-wave systems for wall penetration. In Proceedings of the SPIE, The International Society for Optical Engineering, Vol. 3375, pp. 269–279, Orlando, FL, USA, 1998.
38. J. Guivant, E. Nebot, and S. Baiker. Autonomous navigation and map building using laser range sensors in outdoor applications. Journal of Robotic Systems, 17, 565–583, 2000.
39. Y. Bar-Shalom and T. E. Fortmann. Tracking and Data Association. Academic Press, New York, 1988.

BIOGRAPHIES

Martin Adams is an associate professor in the School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore. He obtained his first degree in engineering science at the University of Oxford, U.K. in 1988 and continued to study for a D.Phil. at the Robotics Research Group, University of Oxford, which he received in 1992. He continued his research in autonomous robot navigation as a project leader and part time lecturer at the Institute of Robotics, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland. He was employed as a guest professor and taught control theory in St. Gallen (Switzerland) from 1994 to 1995. From 1996 to 2000, he served as a senior research scientist in robotics and control, in the field of semiconductor assembly automation, at the European Semiconductor Equipment Centre (ESEC), Switzerland. He is currently the principal investigator of two robotics projects at NTU, both of which involve sensor data interpretation for SLAM and other mobile robot applications. His other research interests include active LADAR design, range data processing and data fusion with inertial navigation systems and other aiding devices.

Ebi Jose is a Ph.D. student at the School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore. He obtained his B.Tech. degree in instrumentation from Cochin University of Science and Technology, Kerala, India in 2002. During his degree he worked in the


semiconductor testing industry, where he designed and developed equipment for the nondestructive testing of semiconductor devices. His current research interests include MMW RADAR sensor perception, RADAR signal processing, online feature detection, and autonomous navigation of land vehicles.


3 Data Fusion via Kalman Filter: GPS and INS

Jingrong Cheng, Yu Lu, Elmer R. Thomas, and Jay A. Farrell

CONTENTS

3.1 Introduction
    3.1.1 Data Fusion — GPS and INS
3.2 Kalman Filter
    3.2.1 Stochastic Process Models
        3.2.1.1 Computation of Φ and Qk
    3.2.2 Basic KF
        3.2.2.1 Implementation issues
    3.2.3 Extended KF
3.3 GPS Navigation System
    3.3.1 GPS Measurements
    3.3.2 Single-Point GPS Navigation Solution
    3.3.3 KF for Stand-Alone GPS Solutions
        3.3.3.1 Clock model
        3.3.3.2 Stationary user (P model)
        3.3.3.3 Low dynamic user (PV Model)
        3.3.3.4 High dynamic user (PVA model)
        3.3.3.5 GPS KF examples
        3.3.3.6 Summary
3.4 Inertial Navigation System
    3.4.1 Strapdown System Mechanizations
    3.4.2 Error Model of INS System
    3.4.3 EKF Latency Compensation
3.5 Integration of GPS and INS
    3.5.1 INS with GPS Resetting
    3.5.2 GPS Aided INS
        3.5.2.1 Loosely coupled system
        3.5.2.2 Tightly coupled system
3.6 Chapter Summary


Acknowledgments
References
Biographies

3.1 INTRODUCTION

Data fusion is the process of combining sensory information from different sources into one representational data format. The sources of information may be different sensors that provide information about completely different aspects of the system and its environment, or that provide information about the same aspect of the system and its environment but with different signal quality or frequency. A group of sensors may provide redundant information; in this case, the fusion or integration of the data from different sensors enables the system to reduce sensor noise, to infer information that is observable but not directly sensed, and to recognize and possibly recover from sensor failure. If a group of sensors provides complementary information, data fusion makes it possible for the system to perform functions that none of the sensors could accomplish independently. In some cases data fusion makes it possible for the system to use lower cost sensors while still achieving the performance specification.

Data fusion is a large research area with various applications and methods [1–3]. In addition to having a thorough understanding of various data fusion methods, it is useful to understand which methods most appropriately fit the corresponding aspects of a particular application. In many autonomous vehicle applications it is useful to dichotomize the overall set of application information into (internal) vehicle information and (external) environmental information. One portion of the vehicle information is the vehicle state vector. Accurate estimation of the vehicle state is a small portion of the data fusion problem that must be solved onboard an autonomous vehicle to enable complex missions; however, accurate estimation of the vehicle state is critical to successful planning, guidance, and control. When it is possible to analytically model the vehicle state dynamics and the relation between the vehicle state and the sensor measurements, the Kalman filter (KF) and the extended Kalman filter (EKF) are often useful tools for fusing the sensor data into an accurate state estimate. In fact, when certain assumptions are satisfied, the KF is the optimal state estimation algorithm. The KF and its properties are reviewed in Section 3.2.

This chapter has two goals. The first is to review the theory and application of the KF as a method to solve data fusion problems. The second is to discuss the use of the EKF for fusing information from the global positioning system (GPS) with inertial measurements to solve navigation problems for autonomous vehicles. Various fusion paradigms have been suggested in the literature: GPS,


inertial navigation system (INS) only, INS with GPS resetting, INS with GPS position aiding (i.e., loose coupling), and INS with GPS range aiding (i.e., tight coupling). This chapter presents each approach and discusses the issues that are expected to affect performance. Discussions of latency, asynchronous measurements, and nonlinear measurements are also included.

3.1.1 Data Fusion — GPS and INS

For planning, guidance, and control applications it is critical that the state of the vehicle be accurately estimated. For these applications, the state vector of the vehicle includes the three-dimensional (3D) position, velocity, and attitude. Often, it is also possible to estimate the acceleration and angular rate. Various sensor suites and data fusion methods have been considered in the literature [4–8]. This chapter focuses on one of the most common sensor suites [9–11]: fusion of data from an inertial measurement unit (IMU) and a GPS receiver. The chapter considers the positive and negative aspects of various methods that have been proposed for developing an integrated system.

An IMU provides high sample rate measurements of the vehicle acceleration and angular rate. An INS integrates the IMU measurements to produce position, velocity, and attitude estimates. INSs are self-contained and are not sensitive to external signals. Since an INS is an integrative process, measurement errors within the IMU can result in navigation errors that grow without bound. The rate of growth of the INS errors can be decreased through the use of higher fidelity sensors or through sensor calibration. In addition, the INS errors (and calibrations) can be corrected through the use of external sensors. With a well-designed data fusion procedure, even an inexpensive INS can provide high frequency, precise navigation information [12]. The rate of growth of the INS error will depend on the IMU characteristics and the data fusion approach.

A GPS receiver measures information that can be processed to directly estimate the position and velocity of the receiver antenna. More advanced multiantenna GPS approaches can also estimate the vehicle attitude [13–15]. The accuracy of the vehicle state estimate attained by GPS methods depends on the receiver technology and the processing method. Civilian nondifferential GPS users can attain position estimates accurate to tens of meters. Differential GPS users can attain position estimates accurate to a few meters. Differential GPS users capable of resolving carrier phase integer ambiguities can attain position estimates accurate to a few centimeters. The main disadvantage of state estimates determined using GPS is that the estimates depend on reception of at least four GPS satellite signals by the receiver. Satellite signal reception requires direct line of sight between the receiver and the satellite. When this line of sight is obstructed for a sufficiently large number of satellites, the GPS solution will not be available.


GPS and INS have complementary properties which have motivated various researchers to study methods to fuse the data from the two systems. The objective is to attain high performance for a higher percentage of the time than either approach could attain independently.

This chapter uses GPS and INS to illustrate the use of the KF for data fusion. Section 3.2 reviews the KF, EKF, and various properties and application issues. Section 3.3 reviews the various issues related to the GPS system. Of particular interest will be various assumed dynamic models and issues affecting state estimation accuracy. Section 3.4 provides a brief review of INS and their error models. The main objective is to present the model information necessary to analyze alternative methods for combining GPS and INS information, which is done in Section 3.5.

3.2 KALMAN FILTER

Since R. E. Kalman published his idea in the early 1960s [16,17], the KF has been the subject of extensive research. It has been applied successfully to solve many practical problems in different fields, particularly in the area of autonomous navigation. The KF involves two basic steps: use of the system dynamic model to predict the evolution of the state between the times of the measurements, and use of the system measurement model and the measurements to optimally correct the estimated state at the time of the measurements. It is well known that the KF is recursive, computationally efficient, and optimal in the sense of the minimum mean squared error [18].

This section contains three subsections. Section 3.2.1 reviews the linear dynamic system models that are required for the prediction and measurement update steps of the KF. Section 3.2.2 reviews the KF algorithm, a few of its properties, and methods to address various implementation issues. Section 3.2.3 reviews the EKF algorithm, which is applicable when either the dynamic or measurement model of the system is not linear. The EKF is needed in GPS–INS data fusion applications since the INS dynamic model is nonlinear and the GPS measurement model may be nonlinear.

3.2.1 Stochastic Process Models

Because the state of most physical systems evolves in continuous time, continuous-time dynamic models are of interest. The dynamic behavior of a linear continuous-time system driven by a random process ω(t) may be described mathematically by a set of ordinary differential equations:

    ẋ(t) = F(t)x(t) + G(t)ω(t)    (3.1)
    y(t) = H(t)x(t) + v(t)    (3.2)


where x(t) is the n-element state vector of the system, F(t) is the system matrix, G(t) is the input distribution matrix, y(t) is the measurement vector, H(t) is the measurement matrix, and v(t) is the measurement noise vector. The vectors ω(t) and v(t) are assumed to be white and Gaussian with

    E[ω(t)] = 0,    E[ω(t)ω^T(t + τ)] = Q_w(t)δ(τ)    (3.3)
    E[v(t)] = 0,    E[v(t)v^T(t + τ)] = R(t)δ(τ)    (3.4)

where Q_w is the power spectral density (PSD) of the white noise ω(t) and R(t) is the covariance matrix of the measurement noise process v(t). If either ω(t) or v(t) is not white, then it may be possible to append linear dynamics to the model of Equation (3.1) and Equation (3.2) to still utilize the model of a linear system driven by white noise, see Reference 19. For the state estimation design discussions of this chapter, unless otherwise stated, assume that the system model has been manipulated into the form of Equation (3.1) and Equation (3.2) with white process and measurement noise.

In applications, such as those involving GPS, where the measurements occur at discrete instants of time, it is convenient to utilize a discrete-time formulation of the KF. If we denote the sequence of measurement times by t_1, ..., t_k, t_{k+1}, ..., then implementation of the discrete-time KF requires a model for propagating the state between measurement times and a model for the relation between the state and the measurement that is valid at the measurement time. Using linear system theory [20,21], the state transition valid between t_k and t_{k+1} is

    x(t_{k+1}) = Φ(t_{k+1}, t_k)x(t_k) + w(t_k)    (3.5)

where

    w(t_k) = ∫_{t_k}^{t_{k+1}} Φ(t_{k+1}, τ)G(τ)ω(τ) dτ    (3.6)

and Φ(t_{k+1}, t) is the state transition matrix from t to t_{k+1}. The measurement model valid at t_k is

    y(t_k) = H(t_k)x(t_k) + v(t_k)    (3.7)

To simplify notation, these equations will be written as

    x_{k+1} = Φ_k x_k + w_k    (3.8)
    y_k = H_k x_k + v_k    (3.9)


The process noise w_k and measurement noise v_k are assumed to be zero-mean, white noise with covariance properties as follows:

    E[w_k w_j^T] = Q_k for k = j, and 0 for k ≠ j    (3.10)
    E[v_k v_j^T] = R_k for k = j, and 0 for k ≠ j    (3.11)
    E[w_k v_j^T] = 0 for all k and j    (3.12)

3.2.1.1 Computation of Φ and Q_k

The covariance matrix associated with w(t_k) is:

    Q_k = Q(t_{k+1}, t_k) = ∫_{t_k}^{t_{k+1}} Φ(t_{k+1}, τ)G(τ)Q_w G^T(τ)Φ^T(t_{k+1}, τ) dτ    (3.13)

For systems where F(t), G(t), and Q_w(t) are accurately approximated as constant over the interval of integration, the transition matrix can be calculated by the inverse Laplace transform

    Φ(t_{k+1}, t_k) = L^{-1}{[sI − F]^{-1}} |_{t = t_{k+1} − t_k}    (3.14)

Alternative methods to compute Φ_k and Q_k use matrix exponentials [22,23] or Taylor series expansions. A common method for computing Φ_k is the truncated power series:

    Φ(Δt) = e^{FΔt} = I + FΔt + F²Δt²/2! + F³Δt³/3! + ···    (3.15)

where Δt = t_{k+1} − t_k and the choice of the order of the power series depends on the system design requirements.

When F is time varying, it is necessary to subdivide Δt such that F can be considered as constant on the subintervals Δτ_i = τ_i − τ_{i−1}, where τ_0 = t_k, τ_N = t_{k+1}, and τ_i = τ_{i−1} + Δτ_i for i = 1, ..., N. Let Δt = Σ_{i=1}^{N} Δτ_i; then Φ_k ≈ Π_{i=1}^{N} Φ(τ_i, τ_{i−1}). The matrix Q_k can be found by approximation techniques:

    Q_k = Q(τ_N, τ_0)    (3.16)


where by subdividing (3.13) into subintegrals and using Φ(τ_{i+1}, τ_0) = Φ(τ_{i+1}, τ_i)Φ(τ_i, τ_0) we obtain

    Q(τ_i, τ_0) = Φ(τ_i, τ_{i−1})[G Q_w G^T Δτ_i + Q(τ_{i−1}, τ_0)]Φ^T(τ_i, τ_{i−1})    (3.17)

for i = 1, ..., N with Q(τ_0, τ_0) = 0.

Example 3.1  Since a common GPS measurement epoch uses t_k = k, this example considers computation of Φ_k and Q_k over the unit interval t ∈ [k, k+1) where k is an integer. First, the unit interval is subdivided into N subintervals of length dτ = 1/N sec. Each subinterval is [τ_i, τ_{i+1}) where τ_i = k + i dτ for i = 0, ..., N. For small dτ, Φ(τ_{i+1}, τ_i) = (I + F(τ_i) dτ) and

    Φ(τ_{i+1}, τ_0) = Φ(τ_{i+1}, τ_i)Φ(τ_i, τ_0)    (3.18)

therefore,

    Φ(τ_{i+1}, τ_0) = Φ(τ_i, τ_0) + F(τ_i)Φ(τ_i, τ_0) dτ

Similarly, over each 1 sec interval, Q_k = Q_k(τ_N, τ_0) can be integrated as follows:

    Q_k(τ_1, k) = Φ(τ_1, τ_0) G Q_w G^T Φ^T(τ_1, τ_0) dτ
    Q_k(τ_2, k) = Φ(τ_2, τ_1)[G Q_w G^T dτ + Q_k(τ_1, k)]Φ^T(τ_2, τ_1)    (3.19)
    ⋮
    Q_k(τ_N, k) = Φ(τ_N, τ_{N−1})[G Q_w G^T dτ + Q_k(τ_{N−1}, k)]Φ^T(τ_N, τ_{N−1})
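The recursions in Equation (3.18) and Equation (3.19) map directly onto a short numerical routine. The following sketch, written in Python with NumPy purely for illustration (the function name, the constant-velocity example model, and the noise value are assumptions, not part of the chapter), shows one way the subdivision approach could be implemented.

    import numpy as np

    def discretize(F_of_t, G, Qw, t0, t1, N=100):
        # Approximate Phi(t1, t0) and Qk over [t0, t1] using the subdivision
        # recursions of Equations (3.18) and (3.19).
        n = G.shape[0]
        dt = (t1 - t0) / N
        Phi = np.eye(n)                    # Phi(tau_0, tau_0) = I
        Qk = np.zeros((n, n))              # Q(tau_0, tau_0) = 0
        GQGT = G @ Qw @ G.T
        for i in range(N):
            tau = t0 + i * dt
            Phi_step = np.eye(n) + F_of_t(tau) * dt          # Phi(tau_{i+1}, tau_i)
            Phi = Phi_step @ Phi                             # Equation (3.18)
            Qk = Phi_step @ (GQGT * dt + Qk) @ Phi_step.T    # Equation (3.19)
        return Phi, Qk

    # Illustrative two-state constant-velocity model driven by white acceleration noise.
    F = lambda t: np.array([[0.0, 1.0], [0.0, 0.0]])
    G = np.array([[0.0], [1.0]])
    Qw = np.array([[0.01]])                # assumed PSD of the driving noise
    Phi_k, Q_k = discretize(F, G, Qw, 0.0, 1.0)

For this constant-velocity model the computed transition matrix equals the exact result [[1, 1], [0, 1]] over the one-second interval, since the higher powers of F vanish.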

3.2.2 Basic KF

Since there are numerous books devoted to the derivation of the KF, such as References 19, 20, and 24, the derivation is not included herein. Instead, the KF algorithm and its properties are reviewed.

The KF estimates the state of a stochastic system. To determine optimal gains for the filter at time t_k, the KF compares the covariance of the state estimate at t_k with the covariance of the measurement at t_k. To enable this comparison, the KF algorithm propagates the covariance of the state estimate as well as the state estimate itself. Prior to discussing the KF algorithm, it will be useful to summarize the new notation that will be used. The KF gain valid at time t_k is K_k.


The state estimate at time t_j using all measurements up to time t_i will be denoted by x̂_{j|i}. Therefore, x̂_{k|k} is the estimate of the state at time t_k using all measurements up to and including y_k, while x̂_{k|k−1} is the estimate of the state at time t_k using all measurements up to and including y_{k−1}. Similarly, P_{k|k} denotes the covariance of the state estimation error at time t_k after using all measurements available up to and including y_k, and P_{k|k−1} denotes the covariance of the state estimation error at time t_k after using all measurements available up to and including y_{k−1}.

The KF algorithm is a recursive process. As such, it requires initialization prior to starting the recursion. Assume that the first measurement will occur at t_1 and denote the initialized state estimate and its associated error covariance matrix as x̂_{0|0} and P_{0|0}. These initial values should be

    x̂_{0|0} = E(x_0),    P_{0|0} = cov(x_0)    (3.20)

and k = 0. Often, it will be the case that P_{0|0} is diagonal with each element being large. The KF is implemented as follows:

1. Predict the state vector and error covariance matrix for the next measurement time:

       x̂_{k+1|k} = Φ_k x̂_{k|k}    (3.21)
       P_{k+1|k} = Φ_k P_{k|k} Φ_k^T + Q_k    (3.22)

   Then, increment k = k + 1.

2. Calculate the KF gain matrix for incorporation of y_k:

       K_k = P_{k|k−1} H_k^T [H_k P_{k|k−1} H_k^T + R_k]^{−1}    (3.23)

3. Use y_k to correct x̂_{k|k−1}:

       x̂_{k|k} = x̂_{k|k−1} + K_k [y_k − H_k x̂_{k|k−1}]    (3.24)

4. Compute the error covariance of the state estimate after incorporating y_k:

       P_{k|k} = [I − K_k H_k] P_{k|k−1}    (3.25)

where I is an n-dimensional identity matrix.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 106 — #8

Data Fusion via Kalman Filter

107

assumption is that we have a model available, of the form of Equation (3.8) and Equation (3.9), suitable for accurate propagation of the state estimate and state estimate error covariance matrix between measurement instants. Second, Step 3 is the only step that requires the measurement; therefore, when the next measurement time can be accurately predicted, then Steps 1 and 2 are often computed prior to the arrival of the next measurement. The purpose of doing this is to minimize the computational delay between the arrival of yk and availability of xˆ k|k to the other online processes that need it (e.g., planning, guidance, or control). Third, the portions of the KF algorithm that require the majority of the computations are Equation (3.22), Equation (3.23), and Equation (3.25), which are related to maintaining the error covariance matrix and the Kalman gain. 3.2.2.1 Implementation issues The performance of the KF depends on the accuracy of the process model and the measurement model. The implementation approach also affects the performance and computational load of the KF. This section discusses some of the important implementation issues related to the KF. Sequential processing of independent measurements. When the system has m simultaneous, but independent measurements, the noise covariance matrix is diagonal: 

r1

 Rk =  0

0 .. .

0

0

0



 0 rm

(3.26)

In this case, it is computationally efficient to treat the measurements as sequential measurements. This replaces an m-dimensional matrix inversion with m scalar divisions. At time tk , we introduce an auxiliary vector xˆ 0 and matrix p0 which are initialized as p0 = Pk|k−1

and

xˆ 0 = xˆ k|k−1

(3.27)

The following recursion is performed for i = 1 to m: Ki =

pi−1 hiT ri + hi pi−1 hiT

xˆ i = xˆ i−1 + Ki [yi − hi xˆ i−1 ]

(3.28)

pi = [I − Ki hi ]pi−1

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 107 — #9

108

Autonomous Mobile Robots

where hi is the ith row of the measurement matrix Hk and yi is the ith element of the vector y. After the mth step of the recursion, the state and error covariance are xˆ k|k = xˆ m Pk|k = pm

(3.29)

Note that the state estimate xˆ k|k and error covariance Pk|k that result from this scalar processing will exactly match (within numerical error) the results that would have been obtained via the vector processing implementation. The gain matrices K that result from the vector and scalar processing algorithms will be distinct, due to the different order in which each implementation introduces the measurements. Rejection of bad measurements. In engineering applications, data does not always match theoretical expectations. Therefore, it is also necessary to set up some criteria to reject some measurements. For example, if for a scalar measurement yi the absolute value of the measurement residual res i = yi − hi xˆ i−1 at time k is sufficiently larger than its

standard deviation hi Pk|k−1 hiT + r, then the measurement could be ignored. In this case, this kth measurement would be missed. Such situations are discussed below. Missed measurements. Sometimes an expected measurement may be missing. One circumstance under which this could occur was discussed earlier. When a measurement is missing, the “measurement” contains no information; therefore, the uncertainty of the measurement is infinite (i.e., R = αI with α = ∞). In this case, by Equation (3.23), Kk = 0. Using this fact, in Steps 3 and 4 of the KF, yields xˆ k|k = xˆ k|k−1

(3.30)

Pk|k = Pk|k−1

(3.31)

The missed measurement has no effect on the estimated state or its state error covariance matrix. Divergence of the KF. The KF is the optimal state estimator for the modeled system. The KF is stable if certain technical assumptions, including observability and controllability from the process noise vector are met [19–21]. Lack of observability, absence of controllability from the process noise vector, or modeling error can cause the KF state estimate to diverge from the true state. These are issues that must be studied and addressed at the design stage.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 108 — #10

Data Fusion via Kalman Filter

109

Tuning. Ideally, the KF is applied to a well-modeled dynamic system with stochastic process noise and measurement noise satisfying the required assumptions. In such cases, the Q and R matrices can be computed correctly as a portion of the stochastic model. In some other applications, examples of which will occur in Section 3.3.3, the vector ω represents unknown factors that may not be truly random. In such applications, Q and R are often used as performance tuning parameters. As Q is decreased relative to R, the KF trusts the dynamic model of the system more than the measurements; therefore, the states of the system converge more slowly since new information is weighted less. If Q is increased relative to R, the measurements will be weighted more and the states will converge faster; however, the measurement noise will have a larger effect on the accuracy of the filtered solution. Note that in applications where Q and R are used as performance tuning parameters, all stochastic interpretations of Pk|k are lost. Instead, the KF is being used as an algorithm to estimate the state, but the KF optimality properties are not applicable. Maintaining symmetry. The equation Pk|k = [I − Kk Hk ]Pk|k−1

(3.32)

Pk|k = [I − Kk Hk ]Pk|k−1 [I − Kk Hk ]T + Kk Rk KkT

(3.33)

is a simplified version of

Equation (3.32) is valid only when Kk is the optimal Kalman gain matrix. When Kk is defined by an equation other than Equation (3.23) and is not the KF optimal gain matrix, then Equation (3.33) should be used. Since Pk|k is the error covariance matrix, it should be symmetric and positive semidefinite. Although Equation (3.33) requires more computational operations than Equation (3.32) does, Equation (3.33) is a symmetric equation. However, the symmetry of either result can be guaranteed and the computational requirements are decreased by only computing the lower diagonal half of Pk|k . Maintaining definiteness. Neither Equation (3.32) nor Equation (3.33) guarantees that Pk|k is symmetric or positive semidefinite in the presence of numeric errors. One possible solution is to factorize Pk|k (e.g., P = UDUT or P = QR) and derive algorithms that propagate the factors directly. Such factorized algorithms [20,21] have better numeric stability properties, especially in applications where computational error is an issue.

3.2.3 Extended KF The previous sections have discussed only linear systems with zero-mean, white Gaussian process, and measurement noise. The optimality properties of the KF

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 109 — #11

110

Autonomous Mobile Robots

under these assumptions were briefly discussed in the previous sections. In many applications either the measurement model, the system dynamics, or both are nonlinear. In these cases the KF may not be the optimal estimator. Nonlinear estimation is a difficult problem without a general solution. Nonlinear estimation methods are discussed, for example, in References 25 and 26. For the navigation systems which are the main focus of this chapter, the EKF has proved very useful because the linearized dynamic and measurement models are accurate for the short periods of time between measurements. Due to its utility in the applications of interest, the EKF is reviewed in this section. Such navigation systems can be described by the nonlinear stochastic differential equation x˙ (t) = f(x, u, t) + g(x, t)w (t)

(3.34)

with a measurement model of the form y(t) = h(x, t) + v (t)

(3.35)

where f is a known nonlinear function of the state x, the signal u, and time; g is a known nonlinear function of the state and time; and w and v are continuoustime white noise processes. For its covariance propagation and measurement updates, the EKF will use a linearization of Equation (3.34) and Equation (3.35). The linearization is performed relative to a reference trajectory x∗ (t) which is a solution of x˙ ∗ (t) = f(x∗ , u, t) between the measurement time instants. The corresponding reference measurement is y∗ (t) = h(x∗ , t) The error state vector is defined as δx = x − x∗ and the measurement residual vector as δy(t) = y(t) − y∗ (t) The linearized dynamics of the error state vector are δ x˙ (t) = F(t)δx(t) + G(t)w (t) + ex (t)

(3.36)

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 110 — #12

Data Fusion via Kalman Filter

111

and the linearized (residual) measurement model is δy(t) = H(t)δx(t) + v (t) + ey (t)

(3.37)

where ∂f F(t) = , ∂x x=x∗

∂h H(t) = , ∂x x=x∗

G(t) = g(x∗ , t)

and ex (t), ey (t) are linearization error terms. From Equation (3.36) and Equation (3.37), the equivalent model for discrete measurements is δxk+1 = k δxk + wk δyk = Hk δxk + vk where the state transition matrix and process noise covariance matrix can be calculated by the methods given in Section 3.2.1. The approximation will hold only for a short period of time and only if the reference trajectory is near the actual trajectory. For the systems that are the focus of this chapter, the linearization will occur around the computed navigation state. Time propagation occurs between GPS measurement epoches, which are typically separated by only a few seconds. Measurements at a given epoch are assumed to occur simultaneously. The purpose of the GPS corrections is to keep the navigation state near the state of the true system. Implementation of the EKF algorithm is very similar to that of the KF, in fact, only the state propagation and measurement prediction steps will change. In addition, the P matrices computed in the algorithm are no longer true covariance matrices; although, we will still use that name in the following text. To initialize the EKF algorithm, assume that the first measurement will occur at t1 and denote the initialized state estimate, residual state estimate, and its associated error covariance matrix as xˆ 0|0 , δ xˆ 0|0 , and P0|0 , respectively. These initial values should be xˆ 0|0 = E(x0 ),

δ xˆ 0|0 = 0,

P0|0 = cov(x0 )

Since this is a nonlinear estimation process, it is important that x(0) − xˆ 0|0 be small. The equations and procedures for the EKF are summarized as follows: 1. Propagate the state estimate to the next measurement time tk+1 by integrating x˙ ∗ (t) = f(x∗ , u, t)

(3.38)

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 111 — #13

112

Autonomous Mobile Robots

over the time interval t ∈ [tk , tk+1 ] with initial condition x∗ (tk ) = xˆ k|k . At the completion of the integration, let xˆ k+1|k = x∗ (tk+1 ). Along the solution x∗ (t), compute ∂f (x) F(t) = ∂x x=x∗

and

G(t) = g(x∗ , t),

for t ∈ [tk , tk+1 ]

2. Compute the state transition matrix k and compute the process noise covariance matrix Qk . Predict the error state vector and error covariance matrix: δ xˆ k+1|k = k δ xˆ k|k = k 0 = 0

(3.39)

Pk+1|k = k Pk|k Tk + Qk

(3.40)

The reason that δ xˆ k|k is set to 0 in (3.39) is clarified in the discussion following (3.43). 3. Increment k = k + 1. 4. Linearize the measurement matrix at x∗ (tk ) and calculate the EKF gain matrix: Hk = H(tk ) = Kk =

∂h ∂x x=ˆxk|k−1

Pk|k−1 HkT [Hk Pk|k−1 HkT

(3.41) + Rk ]

−1

5. Compute the error states using the residual measurements: δ xˆ k|k = δ xˆ k|k−1 + Kk [δyk − Hk δ xˆ k|k−1 ] = Kk δyk

(3.42)

where δ xˆ k|k−1 is the error state vector estimated prior to the new measurements, which by Equation (3.39) is zero. 6. Update the estimated states xˆ k|k : xˆ k|k = xˆ k|k−1 + δ xˆ k|k

(3.43)

Since the error state has been included in the state estimate, the error has been corrected; therefore, the new best estimate of the error state is zero. Therefore, δ xˆ k|k must be set to zero: δ xˆ k|k = 0. 7. Update the posterior error covariance matrix: Pk|k = [I − Kk Hk ]Pk|k−1

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 112 — #14

Data Fusion via Kalman Filter

113

The EKF iterates Steps 1 to 7 for the duration of the application. Steps 1 and 2 perform computations related to time propagation of the state and the matrix P. Steps 4 to 7 perform computations related to the measurement update. In the EKF algorithm, the computation, use, resetting, and time propagation of δ xˆ k|k often causes confusion. The above algorithm is a total state implementation. In an alternative error state implementation of the algorithm, Step 6 could be removed. Without Step 6, Equation (3.39) of Step 2 would have to be implemented to time propagate the error state and the simplification to Equation (3.42) of Step 5 would not be possible. In this alternative implementation, it is possible that, over time, δ xˆ k|k could become large. In this case, x∗ is not near the actual state. In this case, the linearized equations may not be accurate. The EKF algorithm as presented (using Step 6) includes δ xˆ k|k in x∗ resulting in a more accurate linearization. The total and error state implementations are discussed in greater detail in References 20 and 27.

3.3 GPS NAVIGATION SYSTEM The purpose of this section is to discuss the advantages and disadvantages of various EKF approaches to state estimation using GPS measurements. Section 3.3.1 presents background information about GPS that is necessary for the subsequent discussions. Section 3.3.2 discusses position estimation based on GPS measurements. The EKF approaches to solving the GPS equations are compared in Section 3.3.3.

3.3.1 GPS Measurements The GPS is designed to provide position, velocity, and time estimates to users at all times, in all weather conditions, anywhere on the Earth. The existing GPS signal for each satellite consists of a spectrum spreading code and data bits modulated onto a carrier signal. By accurately measuring the transit time of the code signal, the receiver can form a measurement of the pseudorange between the satellite and the receiver antenna. This measurement is referred to as a pseudorange as it is also affected by receiver and satellite clock errors. By processing the data bits to determine the clock error model and ephemeris data, the receiver can compute the satellite position and clock errors as a function of time. Tracking the satellite signal requires that the receiver acquire either frequency or phase lock to the satellite carrier signal. Phase information from the tracking loop has utility as an additional range measurement and the change in the phase measurement over a known period of time (referred to in the GPS literature as a Doppler measurement) can be used to estimate the receiver velocity. The GPS satellites broadcast signals on two frequencies: L1

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 113 — #15

114

Autonomous Mobile Robots

and L2. Users with “two frequency” receivers can obtain pseudorange, phase, and Doppler measurements for each of the two frequencies. The L1 and L2 code and carrier phase measurements from a given satellite can be modeled as f2 Ia + Ecm + MP1 + η1 f1 f1 = R + bu + c tsv + Ia + Ecm + MP2 + η2 f2

ρ˜L1 = R + bu + c tsv + ρ˜L2

f2 Ia + Ecm + mp1 + n1 f1 f1 φ˜ L2 λ2 + N2 λ2 = R + bu + c tsv − Ia + Ecm + mp2 + n2 f2 φ˜ L1 λ1 + N1 λ1 = R + bu + c tsv −

where R = Xsv − Xu  is the geometric distance between the satellite position Xsv and receiver antenna position Xu , bu is the receiver clock bias, and c tsv is the satellite clock bias. The satellite clock bias can be partially corrected by ephemeris data. Ecm represents common errors other than dispersive effects such as ionosphere and Ia represents ionospheric error. The symbols η and n represent receiver measurement noise. The symbols mp and MP represent errors due to multipath. Note that the receiver clock bias is identical across satellites for all simultaneous pseudorange and phase measurements. Since the receiver phase lock loops can only track changes in the signal phase and the initial number of carrier wavelengths at the time of signal lock is not known, each phase signal is biased by an unknown constant integer number of carrier cycles represented by N1 and N2 . Use of the phase measurements as pseudorange signals for position estimation also requires estimation of these unknown integers [28–32]. Use of the change in the phase over a known period of time to estimate the receiver velocity does not require estimation of these integers, since the integers are canceled in the differencing operation [33,34]. The standard GPS texts [34,35] include entire sections or chapters devoted to the physical aspects of the various quantities that have been briefly defined in this section. Note that only R and bu contain the position and receiver clock information that we wish to estimate. The symbols c tsv , Ia , Ecm , MP, mp, η, and n all represent errors that decrease the accuracy of the estimated quantities. There are many techniques to reduce these measurement errors prior to the navigation solution. Dual frequency receivers can take advantage of the code measurements from L1 and L2 to estimate the ionospheric delay error Ia as Ia =

f1 f2 (ρ˜L1 − ρ˜L2 ) f22 − f12

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 114 — #16

Data Fusion via Kalman Filter

115

Due to the differencing of measurements, this estimate is noisy; since Ia changes very slowly, it can be low-pass filtered to remove the noise. For measurements from single frequency receivers, it is possible to compensate part of the ionospheric delay errors by an ionospheric delay model [36]. Alternatively, differential operation using at least two receivers can effectively remove all common mode errors (i.e., c tsv , Ia , Ecm ). The methods discussed in the subsequent sections can be used for the pseudorange or integer-resolved carrier phase measurements. We will not discuss Doppler measurements. To avoid redundant text for the code and integerresolved carrier measurements, we will adopt the following general model for the range measurement to the ith satellite ρ˜i =

(Xi − x)2 + (Yi − y)2 + (Zi − z)2 + bu + εi

(3.44)

where ρ˜ could represent the code pseudorange measurements or integerresolved carrier phase measurements. The variable bu represents the receiver clock bias. The symbol ε represents the error terms appropriate for the different measurements. When a GPS receiver has collected range measurements from four or more satellites, it can calculate a navigation solution.

3.3.2 Single-Point GPS Navigation Solution This section presents the standard GPS position solution method using nonlinear least squares. In the process, we will introduce notation needed for the subsequent sections. In this section, the state vector is defined as x = [x, y, z, bu ]T where (x, y, z) is the receiver antenna position in earth centered earth fixed (ECEF) coordinates and bu is the receiver clock bias. Taylor series expansion of Equation (3.44) about the current state estimate xˆ = [ˆx , yˆ , zˆ , bˆ u ] yields ρ˜i (x) = ρˆi (ˆx) + [hi , 1]δx + h.o.t.s + εi where δx = x − xˆ = [x − xˆ , y − yˆ , z − zˆ , bu − bˆ u ]T ρ˜i (ˆx) = (Xi − xˆ )2 + (Yi − yˆ )2 + (Zi − zˆ )2 + bˆ u

 ∂ρi ∂ρi ∂ρi , , hi = ∂x ∂y ∂z (ˆx,ˆy,ˆz)

(3.45)

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 115 — #17

116

Autonomous Mobile Robots

and −(Xi − x) ∂ρi = 2 ∂x (Xi − x) + (Yi − y)2 + (Zi − z)2 −(Yi − y) ∂ρi = ∂y (Xi − x)2 + (Yi − y)2 + (Zi − z)2 −(Zi − z) ∂ρi = ∂z (Xi − x)2 + (Yi − y)2 + (Zi − z)2 Given m simultaneous range measurements, all the measurements can be put in the matrix form δρ = Hδx + v

(3.46)

by making the definitions  ρ1  ρ2    δρ =  .   .. 





and

 h1 , 1  h2 , 1    H= .   .. 

ρm

(3.47)

hm , 1

where the residual measurement is ρi = ρ˜i (x) − ρˆi (ˆx) and v represents the high order terms (h.o.t.s) of the linearization plus the measurement noise. To determine the state vector, a minimum of four simultaneous range measurements are required. The weighted least squares solution to Equation (3.46) is δx = [HT R−1 H]−1 HT R−1 δρ

(3.48)

The corrected position estimate is then xˆ + = xˆ + δx

(3.49)

To reduce the effects of the linearization error terms, the above process can be repeated using the same measurement data and the corrected position at the end of the current iteration as the starting point of the next iteration (i.e., xˆ = xˆ + ). The iteration is stopped when the error state vector δx converges to a sufficiently small value. Even after the convergence of δx has been achieved,

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 116 — #18

Data Fusion via Kalman Filter

117

there may still be significant error ηx = x − xˆ + between the actual state and the best estimate after incorporating the measurements. The measurement noise covariance matrix is 

σ12  Rρ = cov(v) =  ... 0

0 ···

··· .. .

0

0

σm2

  

(3.50)

The value of σi2 for the ith satellite could be determined based on time-series analysis of measurement data, the S/N ratio determined in the tracking loop for that channel, or computed based on satellite elevation. The covariance of ηx is Rx = cov(ηx ) = [HT R−1 H]−1

(3.51)

It is important to note that this matrix is not diagonal. Therefore, errors in the GPS position estimates at a given epoch are correlated. The above solution approach can be repeated (independently) for each epoch of measurements. This calculation of the position described so far, at each epoch, results in a series of single-point solutions. At each epoch, at least four simultaneous measurements are required and the solution is sensitive to the current measurement noise. There is no information sharing between epochs. Such information sharing between epochs could decrease noise sensitivity and decrease the number of satellites required per epoch; however, information sharing across epochs will require use of a dynamic model. Section 3.3.3 discusses advantages and disadvantages of alternative models and EKF solutions for GPS-only solutions. Section 3.4 discusses methods for combining GPS and IMU data. Example 3.2 Throughout the remainder of this chapter we will extend the example that begins here. The example will be analyzed in 2 . By this we mean that we are analyzing a 2D world, not a 2D solution in a 3D world. We restrict the analysis to a 2D world for a few reasons (1) the analysis will conveniently fit within the page constraints of this chapter; (2) graphical illustrations are convenient; and (3) several important theoretical issues can be conveniently illustrated within the 2D example. The main conclusions from the 2D example have exact analogs in the 3D world (discussed in Example 3.6). In a 2D world, p(t), v(t) ∈ 2 and there is a single angular rotation angle ˙ ψ(t) ∈  with ω(t) = ψ(t) ∈ . All positions and ranges will be in meters. All angles are measured in degrees. The quantities ψ and ω are not used in this example, but are defined here for completeness as they are used in Example 3.6.

© 2006 by Taylor & Francis Group, LLC

FRANKL: “dk6033_c003” — 2006/3/31 — 16:42 — page 117 — #19

118

Autonomous Mobile Robots

To find the position corresponding to range measurements in the 2D example, we define the position and clock bias error vector as δp = x − xˆ = [x − xˆ , y − yˆ , bˆu − bˆ u ]T The range is computed as ρˆi (ˆx) = (Xi − xˆ )2 + (Yi − yˆ )2 + bˆ u The line-of-sight vector (from satellite to user) is  hi =



−(Xi − xˆ ) (Xi − xˆ )2 + (Yi − yˆ )2

,

−(Yi − yˆ )



(Xi − xˆ )2 + (Yi − yˆ )2

(3.52)

Because there are three unknowns, measurements from at least three satellites will be required. Let us assume that there are satellites at locations  sin θi  ◦ ◦ ◦ ◦ Pi = 10 × 106 cos θi m for θ1 = 90 , θ2 = 85 , θ3 = 20 , and θ4 = −85 with corresponding range measurements of ρ1 = 9.513151e6, ρ2 = 9.469241e6, ρ3 = 9.363915e6, and ρ4 = 10.468545e6. Then, if the initial position estimate is xˆ = [0.00, 0.00, 0.00]T , the sequence of positions and position corrections computed by iterating Equation (3.48) and Equation (3.49) with R = I, is shown in Table 3.1. Note that if the initial estimate, possibly obtained by propagation of the estimate from a previous epoch, was accurate to approximately 10 m, then one or possibly two iterations would provide convergence of a new estimate consistent with the measurements of the current epoch to better than millimeter accuracy. Also, even after the estimate of x has converged to micrometer accuracy, the error in the estimated measurement is still 0.44 m. This is the least squared error that can be achieved by adjusting the three elements of x to fit the four measurements of ρ.

TABLE 3.1 Results of Computations for Example 3.2 Iteration 0 1 2 3 4

δx

δx



ˆ ρ − ρ

NA [5.01, 5.09, 0.14]e5 [−0.12, −0.91, −1.36]e4 [0.01, −4.29, −4.21]e0 [−0.26, −8.53, −9.29]e−7

NA 7.1e5 1.6e4 6.0e0 1.3e−6

[0, 0, 0] [5.011961, 5.090871, 1.364810]e5 [5.000000, 5.000