
Multisensor Fusion for Robust Egomotion


According to Forbes, the COVID-19 pandemic has accelerated a rising trend of online shopping. This puts pressure on warehouse companies, which are turning to autonomous mobile robots (AMRs) for greater efficiency. Forbes also reports that AMR sales are expected to reach $27 billion by 2025. This demand for automation has driven many companies across the world to accelerate the development of robotic platforms for warehouse automation, store assistance, delivery and more. Similar growth is expected for the autonomous vehicle industry in general, where projections indicate a near 10-fold increase in market size from 2019 to 2026. Both of these growing industries require safe and reliable autonomous navigation.

Egomotion estimation is an essential building block of any autonomous navigation system: it enables an autonomous agent to know its position and orientation as it navigates its environment. Many studies show that accurate and reliable egomotion estimation can be achieved by combining data from multiple sensors in a sensor fusion framework. This is especially important for applications where environmental conditions may render individual sensors ineffective; in such cases, using multiple sensors is the only way to achieve robustness. Despite this, mature, configurable, “out-of-the-box” commercial sensor fusion solutions for egomotion estimation are still lacking.

Robust and accurate egomotion estimation and sensor fusion are two of Univrses’ core competences. These capabilities have been developed as components of the 3DAI™ Engine, our modular software system that comprises all the necessary components to enable autonomy. In this article, we present a selection of the methods and tools that we have developed to address sensor fusion for motion estimation.

Why is a single-sensor pipeline not enough?

Before considering sensor fusion approaches in any robotics or autonomous system application, it is important to assess how many sensors to use and of which type. This assessment is particularly relevant when combining different sensors.

Adding more sensors to an autonomous platform means a higher per-unit cost. For example, a top-of-the-line light detection and ranging (LiDAR) sensor for self-driving vehicles can cost up to $75,000. Even if the cost of sensors decreases, more sensors increase the weight of the system, take up more space and consume more power. Robots and autonomous vehicles are expected to operate for long periods of time, and multiple sensors on the same platform will drain the available battery more quickly, reducing endurance. As a result, estimating egomotion using a single sensor modality may actually have some advantages. However, a single sensing modality may not achieve the required levels of robustness and accuracy due to its vulnerability to various environmental conditions.
First, consider the estimation of egomotion using a camera. Cameras are cheap and readily available. In addition, monocular (single-camera) visual odometry pipelines, such as the one we offer in our 3DAI™ Engine, can reliably estimate egomotion in many cases. However, using a camera alone might not be robust enough in situations where:

  • the camera field of view is heavily blocked
  • there is poor lighting (e.g. nighttime and tunnels)
  • there is reduced visibility (e.g. sun glare)

Second, there is LiDAR-based egomotion estimation, which is used on many robotics platforms. LiDAR-based odometry can estimate egomotion with a high degree of accuracy.
However, LiDARs:

  • are expensive
  • have a shorter operating range than Radars
  • are less effective when operating in heavy fog, snow, and rain
  • can generate poor egomotion estimates when there are very few static structures (e.g. warehouse aisles with little structure)

Third, Radar is an alternative sensor modality for egomotion estimation. Radars perform well under variable lighting, atmospheric and other conditions that would cause other sensors, such as cameras, LiDARs and GPS receivers, to fail. Additionally, due to their long wavelength and beam spread, Radar sensors return multiple readings from which stable, long-range features in the environment can be detected.

However, using Radar to estimate egomotion is challenging because Radars:

  • have lower angular accuracy than LiDARs 
  • may not distinguish between multiple objects that are placed very close to each other (e.g. two small vehicles near each other could be seen as one large vehicle)
  • are susceptible to high levels of noise (especially in cluttered environments)
  • can create “ghost” objects due to receiver saturation and multipath reflections

Finally, less complex sensor modalities such as wheel encoders and inertial measurement units (IMUs) are often used for egomotion estimation. Both allow for high sampling rates and can provide accurate short-term estimates. However, position estimates tend to deteriorate over time: noisy inertial measurements cause egomotion estimates to drift away from the true position, whilst wheel odometers can incorrectly infer motion from a slipping or skidding wheel.
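The drift behaviour described above can be demonstrated with a few lines of simulation (a hypothetical sensor with an assumed bias and noise level, purely for illustration):

```python
import numpy as np

# Illustration of dead-reckoning drift: integrating velocity readings
# that carry a small bias and noise (as wheel encoders and IMUs do)
# makes the position error grow with time, even though every single
# reading is only slightly wrong.
rng = np.random.default_rng(42)
dt, steps = 0.01, 10_000                 # 100 s of data at 100 Hz
true_vel = 1.0                           # robot moves at a constant 1 m/s

bias = 0.01                              # assumed 1 cm/s calibration bias
noisy_vel = true_vel + bias + rng.normal(0.0, 0.05, steps)
est_pos = np.cumsum(noisy_vel * dt)      # naive integration of the sensor
true_pos = true_vel * dt * np.arange(1, steps + 1)

errors = np.abs(est_pos - true_pos)      # small after 1 s, large after 100 s
```

Each individual reading is off by only a few percent, yet the integrated position error grows without bound, which is why these sensors are typically fused with drift-free sources.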

The combined limitations of single-sensor egomotion pipelines make a good case for combining data from multiple sensors to achieve greater robustness under a wider range of conditions. Sensor fusion achieves this by employing algorithms that combine data from several sensors to make egomotion estimation more accurate and more dependable than any individual sensor alone.

How to fuse data from multiple sensors?

The general principle behind sensor fusion is to find the best value for the variables we want to estimate (e.g. a vehicle’s position and orientation) given noisy observations from multiple sensors. Several methods and frameworks exist for sensor fusion.
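As a toy illustration of this principle (a hypothetical one-dimensional example, not a description of any production system), two independent noisy measurements of the same quantity can be fused by inverse-variance weighting, yielding an estimate that is more certain than either sensor alone:

```python
# Fuse two independent noisy measurements of the same scalar quantity
# (e.g. vehicle position along one axis) by inverse-variance weighting.
# Each measurement is modelled as Gaussian with a known variance.

def fuse(z1, var1, z2, var2):
    """Minimum-variance fusion of two Gaussian measurements."""
    w1 = 1.0 / var1
    w2 = 1.0 / var2
    fused = (w1 * z1 + w2 * z2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)
    return fused, fused_var

# Sensor A says 10.0 m (variance 4.0), sensor B says 12.0 m (variance 1.0):
# the fused estimate leans toward the more certain sensor B reading.
pos, var = fuse(10.0, 4.0, 12.0, 1.0)   # pos = 11.6, var = 0.8
```

Note how the fused variance (0.8) is lower than the variance of either input, which is the basic reason fusion improves dependability.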

Kalman filter variants are historically the most commonly used methods for sensor fusion. In a Kalman filter, the variables we want to estimate form the state of the system. In the case of egomotion, a state usually includes the vehicle’s position and orientation, although velocity, acceleration and other information may also be included. 

In its original form, the Kalman filter is only optimal when three crucial conditions are met:

  • The system is linear. This is problematic because most real-life applications are nonlinear in nature, so this condition limits the practical use of the filter.
  • Sensor noise in the system follows Gaussian (or Normal) distributions. This assumes that most measurements will be close to a mean value and only a few will be extreme. This requirement renders the filter less effective in situations where the noise does not follow this pattern (i.e. “non-Gaussian” noise).
  • The current sensor measurements and the last state estimate contain all the information needed to compute an estimate for the current state of the system. The Kalman filter therefore does not need to keep the entire history of past state estimates and sensor measurements, which reduces its memory footprint and computation time. However, this also makes it hard to include relevant sensor measurements that arrive late and to correct erroneous past measurements that have already affected the current state estimate.
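To make the predict–update structure concrete, here is a minimal sketch of a linear Kalman filter for a one-dimensional constant-velocity model (an illustrative textbook example; the matrices and noise values are assumptions, not values from any real system):

```python
import numpy as np

# Minimal linear Kalman filter for a 1-D constant-velocity model.
# State x = [position, velocity]; only position is measured.
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition model
H = np.array([[1.0, 0.0]])              # measurement model (position only)
Q = 1e-3 * np.eye(2)                    # assumed process noise covariance
R = np.array([[0.25]])                  # assumed measurement noise covariance

x = np.zeros(2)          # initial state estimate
P = np.eye(2)            # initial state covariance

def kf_step(x, P, z):
    # Predict: propagate the previous estimate through the motion model.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: correct the prediction with the new measurement z.
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + (K @ (z - H @ x_pred)).ravel()
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new

# Feed position measurements of a target moving at 1 m/s; the filter
# converges to the true position and also infers the unmeasured velocity.
for k in range(1, 101):
    z = np.array([k * dt * 1.0])
    x, P = kf_step(x, P, z)
```

Note that the filter never stores past measurements: each step needs only the previous `x` and `P`, which is exactly the third condition listed above.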

Kalman filter extensions and alternatives

Several nonlinear extensions have been proposed over the last six decades to address the linearity limitation and enable the use of the filter in many real-life applications. For example, the Extended Kalman Filter (EKF) has a proven track record as a sensor fusion framework for trajectory estimation going back to the NASA Apollo missions. Other extensions include the Unscented Kalman Filter (UKF) and the Cubature Kalman Filter (CKF), which address applications with a higher degree of nonlinearity.

To address non-Gaussian noise, researchers invented the particle filter algorithm in the early 1990s. The particle filter is particularly suited to dealing with non-Gaussian noise, which allows it to be employed in applications where Kalman-filter-based methods underperform. However, the computational time for a particle filter increases sharply as more variables are included in the system’s state (a problem known as the curse of dimensionality).
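A minimal bootstrap particle filter for a one-dimensional robot might look as follows (an illustrative sketch with assumed noise parameters). Each particle is a hypothesis about the state, and the weighted set can represent arbitrary, non-Gaussian distributions:

```python
import numpy as np

# Minimal bootstrap particle filter for a 1-D robot, illustrating how
# non-Gaussian posteriors are represented by weighted samples.
rng = np.random.default_rng(0)
N = 2000                               # number of particles
particles = rng.uniform(-10, 10, N)    # broad, non-Gaussian prior
weights = np.full(N, 1.0 / N)

true_pos = 2.0
for _ in range(20):
    true_pos += 0.5                                  # robot moves 0.5 m
    particles += 0.5 + rng.normal(0, 0.1, N)         # predict with motion noise
    z = true_pos + rng.normal(0, 0.5)                # noisy position reading
    weights *= np.exp(-0.5 * ((z - particles) / 0.5) ** 2)  # likelihood
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < N / 2:
        idx = rng.choice(N, N, p=weights)
        particles = particles[idx]
        weights = np.full(N, 1.0 / N)

estimate = np.sum(weights * particles)   # weighted mean of the particles
```

The resampling step concentrates particles in probable regions; with many state variables, exponentially many particles are needed to cover the space, which is the curse of dimensionality mentioned above.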

To overcome many of the drawbacks inherent to Kalman filters, researchers have developed optimization-based approaches. More specifically, while Kalman-based methods usually estimate only the current state of the system, optimization-based approaches simultaneously estimate the current state and correct errors across previous state estimates. In addition, these approaches can be combined with compact and flexible representations of the history of state estimates, making it easy to incorporate delayed measurements. While optimization-based methods are powerful and accurate, time-sensitive applications require care: in real-time systems, only limited optimization (e.g. window-based optimization) is feasible due to time constraints.
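The idea of jointly re-estimating a window of past states can be sketched with a toy linear example (hypothetical numbers; real pipelines solve a nonlinear problem over full 3-D poses). Here, five one-dimensional positions are estimated at once from relative odometry constraints plus a single absolute fix:

```python
import numpy as np

# Toy window-based optimization: jointly estimate 5 poses (1-D positions)
# from relative odometry between consecutive poses plus one absolute
# position fix, via weighted linear least squares. Unlike a filter, the
# solve refines *all* states in the window at once.
n = 5
true_x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

rows, rhs, w = [], [], []
# Odometry factors: x[i+1] - x[i] = 1.0 between consecutive poses.
for i in range(n - 1):
    a = np.zeros(n); a[i], a[i + 1] = -1.0, 1.0
    rows.append(a); rhs.append(1.0); w.append(1.0)
# One absolute fix anchors the window (e.g. a GPS-like reading on pose 0).
a = np.zeros(n); a[0] = 1.0
rows.append(a); rhs.append(0.0); w.append(10.0)   # higher weight = more trusted

A = np.array(rows) * np.array(w)[:, None]
b = np.array(rhs) * np.array(w)
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)     # recovers [0, 1, 2, 3, 4]
```

Because all five states are solved together, a late or corrected measurement simply becomes one more row in the system, which is much harder to achieve with a filter that keeps only the latest state.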

Univrses’ solutions

At Univrses, we have developed both filter-based and optimization-based approaches for different sensor fusion applications. These solutions have been used to give autonomous systems a more accurate and robust estimate of their position. More recently, they have been deployed on smartphones as part of 3DAI City, Univrses’ smart city data platform.

Our optimization-based solution offers several enticing features. For example, it:

  • provides highly accurate estimates, especially in non-time-critical applications where the algorithm is allowed to run for a long time. For this reason, it can be used to generate ground-truth position and orientation for a vehicle
  • employs a compact, flexible and optimized representation of the history of state estimations that allows for easy association between measurements and states at all times, and thus
    • handles delayed measurements with extreme ease
    • enables the simultaneous use of low- and high-output-rate sensors
  • allows for easy inclusion and exclusion of sensors, which is important in scenarios where a subset of sensors may become unreliable or unavailable during navigation 
  • finds where the fused sensors are with respect to each other and in relation to the vehicle. This adds to our calibration capability, which also features a state-of-the-art approach that was recently recognised in a CVPR publication (for further information, see our CVPR 2020 paper).

Together, these features make our flexible optimization-based sensor fusion approach a source of accurate and reliable egomotion estimates suitable for diverse applications. We therefore believe that our sensor fusion technology offers great value for projects seeking robust localization for navigation or other tasks. Please get in touch to find out more.