Robotics & Embodied AI


Computation and the Software Stack

Between the sensor data arriving and the motor command going out, a robot must compute. The hardware that runs those computations and the software layers that organize them are collectively the robot's computational subsystem. Getting this subsystem wrong does not just produce slow code — it produces a robot that misses critical deadlines, behaves unpredictably under load, or fails in ways that are nearly impossible to reproduce or debug. Understanding the computational stack from silicon to middleware is what separates a robot that works on a demo day from one that works reliably in the field.

Onboard Compute Hardware

The right compute hardware depends on the robot's task profile: how much sensor data must be processed, how fast decisions must be made, and how much power is available.

Microcontrollers (MCUs) are single-chip computers with a processor, memory, and peripheral interfaces on one die. They run at tens to hundreds of MHz, consume milliwatts to hundreds of milliwatts, and operate without a general-purpose operating system: code runs either bare-metal or under a lightweight RTOS (real-time operating system). The MCUs on Arduino boards, the STM32 family, and the Raspberry Pi RP2040 are examples. MCUs are ideal for low-level motor control, sensor reading, and safety-critical tasks that demand deterministic timing.

Single-board computers (SBCs) like the Raspberry Pi 5 or NVIDIA Jetson Orin Nano run a full Linux OS on a processor comparable to a mid-range smartphone. They support Python, ROS 2, machine learning inference, and USB/Ethernet connectivity. Power draw typically ranges from 5 W to 30 W. They are the backbone of most research and mid-range commercial robots.

FPGAs (Field-Programmable Gate Arrays) are reconfigurable logic arrays that implement custom digital circuits. They offer deterministic timing down to nanoseconds and can process sensor streams (lidar returns, camera pixels) with lower and more predictable latency than software running on a general-purpose processor. The trade-off is development complexity: FPGA designs are written in hardware description languages (VHDL, Verilog) and are hard to debug.

GPU-equipped edge accelerators (NVIDIA Jetson AGX Orin, Qualcomm Snapdragon) combine a CPU with a powerful GPU or neural-network accelerator on the same board. Running deep-learning perception models (object detection, depth estimation) in real time on a robot generally requires this class of hardware. Power draw runs 15–60 W, which makes these boards practical mainly in larger robots with a generous power supply.

Heterogeneous Compute Is the Norm

Production robots rarely use a single processor. A typical architecture has a microcontroller running hard-real-time motor control at 1 kHz, an SBC running ROS 2 navigation and perception, and a GPU accelerator running a deep learning model for object detection — all communicating over internal buses and shared memory.
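
As a rough worked example of what such a three-processor architecture costs in power, the sketch below sums the power ranges quoted above for each processor class and converts the total into energy over a runtime. The MCU bounds (0.001 W to 0.5 W, standing in for "milliwatts to hundreds of milliwatts"), the 8-hour runtime, and every name in the code are illustrative assumptions, not measurements of any particular board.

```python
# Rough compute power budget for an MCU + SBC + GPU accelerator robot.
# Ranges follow the figures quoted in the text; the MCU bounds and the
# runtime are assumptions chosen for illustration only.

POWER_RANGES_W = {
    "mcu": (0.001, 0.5),            # assumed bounds for "mW to hundreds of mW"
    "sbc": (5.0, 30.0),
    "gpu_accelerator": (15.0, 60.0),
}


def total_power_w(ranges):
    """Return (low, high) total power in watts across all processors."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


def energy_wh(power_w, hours):
    """Energy in watt-hours consumed at a constant power draw."""
    return power_w * hours


RUNTIME_H = 8.0  # assumed runtime between charges

low_w, high_w = total_power_w(POWER_RANGES_W)
print(f"Compute power: {low_w:.1f} to {high_w:.1f} W")
print(
    f"Energy over {RUNTIME_H:.0f} h: "
    f"{energy_wh(low_w, RUNTIME_H):.0f} to {energy_wh(high_w, RUNTIME_H):.0f} Wh"
)
```

This covers compute only; motors, sensors, and power-conversion losses add to the total.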

Real-Time Operating Systems

A standard operating system like Linux schedules processes to maximize throughput and fairness. It will pause a running process to give CPU time to another according to priority policies that place no strict bound on when a given task runs. This is fine for a web server, where 10 ms of scheduling jitter is invisible. It is catastrophic for a motor controller, where a missed 1 ms update deadline can cause a joint to exceed its safe velocity.

A real-time operating system (RTOS) provides guaranteed scheduling: the highest-priority ready task receives CPU time within a bounded worst-case latency. The variation in that latency from one activation to the next is the scheduling jitter, and an RTOS keeps it small and predictable. FreeRTOS, Zephyr, and VxWorks are widely used RTOSes in robotics. PREEMPT_RT is a Linux kernel patch set that turns standard Linux into a soft real-time OS, which is sufficient for many robotics applications that need near-real-time guarantees without the full complexity of a hard RTOS.

Real-time systems are categorized as hard or soft. A hard real-time system fails if any deadline is missed: automotive airbag controllers, fly-by-wire aircraft systems, and safety monitors on industrial robots are hard real-time. A soft real-time system degrades gracefully when deadlines are occasionally missed: a navigation planner that sometimes takes 12 ms instead of 10 ms produces a slightly jerky trajectory but does not crash the robot.
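
To make latency, jitter, and missed deadlines concrete, the following sketch runs a periodic loop on whatever OS it is started on and compares the actual wake-up times with the ideal schedule. The function names, the 200-iteration run, and the 0.5 ms lateness tolerance are assumptions made for illustration. Python is not suitable for real control loops, so the numbers only show how a general-purpose scheduler behaves and will differ from machine to machine.

```python
import time


def measure_wakeups(period_s, iterations):
    """Run a periodic loop and record when each iteration actually starts."""
    start = time.perf_counter()
    wake_times = []
    for i in range(1, iterations + 1):
        target = start + i * period_s
        remaining = target - time.perf_counter()
        if remaining > 0:
            time.sleep(remaining)  # the OS decides when we really wake up
        wake_times.append(time.perf_counter() - start)
    return wake_times


def lateness_stats(wake_times, period_s, tolerance_s):
    """Compare actual wake-up times against the ideal schedule.

    Returns the worst-case lateness, the jitter (spread between the
    largest and smallest lateness), and the number of wake-ups that
    were later than the tolerance.
    """
    lateness = [t - (i + 1) * period_s for i, t in enumerate(wake_times)]
    worst = max(lateness)
    jitter = worst - min(lateness)
    misses = sum(1 for late in lateness if late > tolerance_s)
    return worst, jitter, misses


# A 1 kHz loop, as in the motor-control example; tolerance is an assumption.
times = measure_wakeups(period_s=0.001, iterations=200)
worst, jitter, misses = lateness_stats(times, period_s=0.001, tolerance_s=0.0005)
print(f"worst lateness: {worst * 1e3:.3f} ms")
print(f"jitter:         {jitter * 1e3:.3f} ms")
print(f"late wake-ups:  {misses} of {len(times)}")
```

In a hard real-time system a single late wake-up counts as a failure, while a soft real-time system would treat the miss count as a quality metric to keep low.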

Match each computational component to its defining role in the robot software stack.

Terms

Microcontroller (MCU)
FPGA
GPU edge accelerator
Hard real-time system
ROS 2 node

Definitions

Guarantees that every deadline is met or the system declares failure
A software process that publishes or subscribes to typed data topics over a middleware bus
Implements custom digital logic for nanosecond-latency sensor stream processing
Runs deep-learning perception models in real time with high parallel compute throughput
Runs bare-metal or RTOS code for deterministic low-level motor control


Robotics Middleware: ROS 2

Writing every communication pathway between a robot's software components from scratch would be a massive engineering burden. Robotics middleware provides a standardized communication infrastructure. ROS 2 (Robot Operating System 2) has become the dominant open-source middleware in research and commercial robotics.

In ROS 2, a robot's software is decomposed into nodes: individual processes, each performing a focused task. A lidar driver node reads from the lidar hardware and publishes sensor_msgs/LaserScan messages. A localization node subscribes to those messages and publishes a nav_msgs/Odometry estimate. A path planner subscribes to the odometry and publishes nav_msgs/Path messages. A motor controller node subscribes to path commands and drives the actuators. Each node can be developed, tested, and restarted independently.

Nodes communicate through three mechanisms:

Topics are asynchronous publish/subscribe channels (like a news feed).
Services are synchronous request/response calls (like a remote function call).
Actions are long-running tasks with feedback and cancellation support (like 'navigate to waypoint X and report progress').

ROS 2 is built on top of DDS (Data Distribution Service), an industry-standard middleware protocol also used in autonomous vehicles, submarines, and medical devices. This gives ROS 2 configurable quality of service (QoS): a safety-critical message can be delivered reliably and in order, while a camera image stream may tolerate some dropped frames.
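
The difference between the three mechanisms can be sketched with a small in-process stand-in. This is plain Python, not the ROS 2 client library: ToyBus, navigate_action, and the names "/scan" and "/get_battery_level" are hypothetical, and delivery here happens inline rather than asynchronously over DDS. It only illustrates who waits for whom in each pattern.

```python
from collections import defaultdict


class ToyBus:
    """In-process stand-in for a middleware bus (not the ROS 2 API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic name -> callbacks
        self._services = {}                    # service name -> handler

    # Topic: the publisher neither knows its subscribers nor gets a reply.
    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)

    # Service: the caller sends one request and receives one response.
    def advertise_service(self, name, handler):
        self._services[name] = handler

    def call_service(self, name, request):
        return self._services[name](request)


def navigate_action(start, goal, step=1.0):
    """Action: a long-running task that yields feedback until it finishes.

    The caller cancels simply by no longer iterating.
    """
    position = start
    while position < goal:
        position = min(position + step, goal)
        yield {"position": position, "remaining": goal - position}


bus = ToyBus()

# Topic usage: a hypothetical lidar driver publishes, a subscriber collects.
received_scans = []
bus.subscribe("/scan", received_scans.append)
bus.publish("/scan", {"ranges": [1.2, 1.3, 1.1]})

# Service usage: request/response with a hypothetical battery monitor.
bus.advertise_service("/get_battery_level", lambda request: {"percent": 87})
battery = bus.call_service("/get_battery_level", {})

# Action usage: consume feedback until the goal is reached.
feedback = list(navigate_action(start=0.0, goal=3.0))
print(received_scans, battery, feedback[-1])
```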

A motor controller that runs at 1 kHz (one update every 1 ms) is deployed on a standard Linux computer running other processes. The control loop occasionally misses its 1 ms deadline by 5–10 ms when the OS scheduler runs other processes. What is the most appropriate technical solution?

In a ROS 2 system, a navigation planner node needs to ask the localization node for the robot's current pose and wait for the answer before computing the next path. Which ROS 2 communication mechanism is most appropriate?

Design a Compute Architecture

You are selecting the computational subsystem for a warehouse sorting robot. The robot must: (1) drive autonomously through a warehouse using lidar-based navigation at 10 Hz, (2) recognize packages using a camera-based object detector running a small neural network at 15 frames per second, (3) control four drive motors with hard-real-time updates at 500 Hz, and (4) communicate with a fleet management server over Wi-Fi.

Step 1: List the compute tasks and categorize each as hard real-time, soft real-time, or non-real-time.
Step 2: Propose a hardware architecture. How many processors? What types? Which tasks run on which hardware?
Step 3: Which communication mechanism in ROS 2 (topic, service, or action) would you use to: (a) stream lidar data from the driver to the navigation node, (b) ask the fleet manager to assign the next delivery, (c) command the robot to navigate to a waypoint and report when it arrives?
Step 4: Estimate the total power draw of your compute hardware. Is it within a reasonable power budget for a robot that must run for an 8-hour warehouse shift?
Step 5: Identify the single biggest risk in your architecture and describe how you would mitigate it.
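
A helpful starting point for this exercise is to convert each task's rate into the time available per cycle. The sketch below does only that arithmetic for the rates given in the scenario; the task labels and the function name are illustrative, and it does not decide which tasks are hard or soft real-time.

```python
def period_ms(rate_hz):
    """Time available per cycle, in milliseconds, for a task at rate_hz."""
    return 1000.0 / rate_hz


# Rates taken from the scenario; the dictionary keys are illustrative labels.
TASK_RATES_HZ = {
    "lidar_navigation": 10,
    "object_detection": 15,
    "motor_control": 500,
}

for task, rate in TASK_RATES_HZ.items():
    print(f"{task}: {rate} Hz -> {period_ms(rate):.1f} ms per cycle")
```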