Skip to main content
Robotics & Embodied AI

⏱ About 20 min20 XP

Localization and SLAM

Navigation requires two things: a map and a position within it. On a first visit to an unknown building, a robot has neither. This is the SLAM problem — Simultaneous Localization and Mapping — and it is one of the most studied and practically important problems in robotics. The robot must build a map of the environment using its sensors while simultaneously using that incomplete, growing map to figure out where it is. The two tasks are deeply entangled: you need a map to localize, but you need to know your location to build an accurate map.

Localization When the Map Is Known

When a map is already available, the problem simplifies to localization: given a map and sensor readings, estimate the robot's pose (position and orientation) within that map. This is still a hard estimation problem, but the map provides a rich source of information. Monte Carlo Localization (MCL), also called particle filter localization, is the dominant practical algorithm. It maintains a set of weighted particles, each representing a hypothesis about the robot's pose. The predict step moves each particle forward according to the motion model (with noise). The update step weights each particle by how well the sensor reading it would predict matches the actual sensor reading, given the map. Particles that explain the observations well receive high weight; unlikely particles receive low weight. Resampling focuses particles on the high-probability regions. MCL can solve the 'kidnapped robot problem': a robot that is picked up and placed somewhere new. The particle distribution initially covers many hypotheses; as the robot observes, the consistent particles survive and the distribution collapses to the correct location. This global uncertainty is the distinctive strength of particle-based localization over Kalman-based approaches, which require a good initial guess. For a concrete example: the Willow Garage PR2 robot navigating an office used AMCL (Adaptive Monte Carlo Localization) with a pre-built 2D grid map. Its lidar scan was compared against the map at each particle's hypothesized pose; particles consistent with the observed scan shape survived. After a few seconds of motion in a recognizable corridor, the distribution collapsed to sub-centimeter positional uncertainty.

The SLAM Chicken-and-Egg Problem

In SLAM, the map estimate depends on the trajectory estimate (you need to know where you were when you took each observation to place it correctly on the map) and the trajectory estimate depends on the map (landmarks in the map help you know where you are). Neither can be solved independently. SLAM algorithms handle this circular dependency by maintaining joint uncertainty over both the map and the trajectory, updating them together.

The SLAM problem is formalized as estimating the joint distribution over the robot's trajectory X = {x_1, x_2, ..., x_t} and the map M, given all sensor observations Z and control inputs U: p(X, M | Z, U) This is a high-dimensional inference problem. Exact inference is intractable for any environment of practical size. All practical SLAM algorithms approximate it. Graph-based SLAM is now the dominant approach. The robot's trajectory is represented as a graph of pose nodes connected by edges. Each edge encodes a constraint between poses — derived from odometry (motion between consecutive poses) or from loop closure (recognizing a previously visited place and constraining the current pose to the past pose where that place was seen). Optimizing the graph means finding the trajectory that best satisfies all these constraints simultaneously, minimizing the total error across all edges. Loop closure detection is the critical sub-problem. When the robot revisits a location it has mapped previously, it must recognize the revisit and create a loop closure edge. This collapses accumulated drift — the error that builds up in the map as small odometry errors accumulate over a long trajectory. Without loop closure, even a high-quality odometry system will produce a map that 'drifts' — corridors that should close into a loop appear open or misaligned. Moroccan maze analogy: imagine drawing a map of a maze while blindfolded, using only your footsteps to estimate distance and direction. After walking for ten minutes, your footstep counting has accumulated enough error that your map's end does not connect to its beginning. The moment you recognize a doorway you passed before — loop closure — you can correct the whole map: you know exactly where you were then and where you are now, so you can adjust every intermediate estimate.

Match each SLAM component to its precise function.

Terms

Loop closure detection
Pose graph optimization
Monte Carlo Localization
Odometry edge
Place recognition

Definitions

A constraint between consecutive pose nodes derived from wheel encoders or IMU, reflecting motion uncertainty
Matching current sensor observations against a database of past observations to identify previously visited locations
Tracking robot pose in a known map using a set of weighted particle hypotheses
Recognizing a previously visited place to create a constraint that collapses accumulated drift in the map
Finding the robot trajectory that minimizes constraint violation across all odometry and loop closure edges

Drag terms onto their definitions, or click a term then click a definition to match.

Visual SLAM vs. Lidar SLAM

LiDAR SLAM (e.g., Cartographer, LOAM) uses geometric point cloud matching for precise pose estimation but requires expensive sensors. Visual SLAM (e.g., ORB-SLAM3, COLMAP) uses camera images and feature point tracking — cheaper hardware but more sensitive to lighting and texture. Monocular visual SLAM cannot recover metric scale without additional information (like IMU integration or known object sizes). Each choice carries hardware cost, robustness, and accuracy tradeoffs.

A mobile robot using lidar SLAM is mapping a long, featureless corridor. After traveling 50 meters, the map shows a slightly curved corridor when the corridor is actually straight. The most likely cause is:

A robot's SLAM system detects a false loop closure — it incorrectly identifies a new location as one it has visited before and inserts a wrong constraint into the pose graph. After graph optimization, what will happen to the map?

Trace the SLAM Process Through an Exploration Scenario

  1. A robot is given the task of mapping an unknown square room (approximately 10m x 10m) with two internal doorways.
  2. Step 1: Describe the robot's state at the start of mapping: what is its pose uncertainty? What does the map contain?
  3. Step 2: The robot moves forward 2 meters, guided by odometry. Describe how the pose uncertainty changes and why.
  4. Step 3: The robot's lidar observes a distinctive corner at the far end of the room. Describe how this observation updates the robot's pose estimate.
  5. Step 4: The robot explores the room for 3 minutes, traveling 15 meters total. Describe qualitatively what the accumulated trajectory error looks like by this point.
  6. Step 5: The robot passes through doorway 1 and then through doorway 2, returning to its starting corner. It recognizes the corner. Describe: what is a loop closure, what happens to the pose graph when this closure is detected, and what happens to the map?
  7. Step 6: After optimization, the map is shared with a second robot that must navigate using it. What information does the second robot need from the map to localize itself when it enters the room?