Robotics & Embodied AI

Manipulation and Grasping

Picking up a cup seems trivial. You do it dozens of times a day without conscious thought. But consider what is actually happening: your visual system estimates the cup's pose and geometry; your brain plans a grasp that accounts for the cup's weight, fragility, and likely contents; your hand moves to a pre-grasp position; your fingers close with just enough force to secure the cup without crushing it; and throughout the lift, tactile sensors in your fingertips monitor slip and adjust grip force continuously. Reproducing this capability in a robot is one of the hardest open problems in robotics. Manipulation — physically interacting with and reshaping the arrangement of objects in the world — is where robotics meets its deepest challenges.

Contact Mechanics and Grasp Quality

A grasp is a set of contacts between robot fingers and an object that constrains the object's motion. Contact mechanics describes how forces are transmitted through each contact point. The key concept is friction: a contact can push along the contact normal and, up to a limit, resist sliding along the contact tangent. Under the Coulomb friction model, that limit is the friction coefficient mu times the normal force. The friction cone expresses this limit geometrically: it is the set of all force directions a contact can sustain without slipping, and its half-angle is arctan(mu). If the applied force direction lies inside the friction cone, the contact holds. Outside the cone, the finger slips.

A grasp achieves force closure if its contacts can together apply an arbitrary wrench (force plus torque) to the object, and can therefore balance any external wrench. A force-closed grasp can resist disturbances from any direction, provided the fingers can squeeze hard enough: the object cannot be pulled, pushed, or twisted out of the hand. Force closure is the gold standard for reliable manipulation.

Grasp quality metrics quantify how well a grasp can resist disturbances. The epsilon quality metric is the radius of the largest ball, centered at the origin of wrench space, that fits inside the set of wrenches the grasp can produce with bounded contact forces. Intuitively, it measures how large a disturbance the grasp can resist in its weakest direction before slipping. High-quality grasps have contacts distributed around the object with large friction cones in critical directions.

Grasp synthesis algorithms take object geometry as input and output a set of contact points (or finger configurations) that achieve a high-quality grasp. Analytic approaches compute force closure exactly from contact geometry. Learning-based approaches train on large datasets of grasp attempts to predict grasp success probability directly from sensor data, typically point clouds from depth cameras.
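The friction cone and force closure ideas above can be made concrete with a small planar sketch. It assumes point contacts with Coulomb friction in 2D and uses the classical two-contact condition: a two-finger planar grasp is force-closed when the line joining the contacts lies strictly inside both friction cones. The function names and the sign convention (normals point into the object) are illustrative choices, not part of any particular library.

```python
import math

def _unit(v):
    """Normalize a 2D vector given as an (x, y) tuple."""
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def in_friction_cone(force, inward_normal, mu):
    """True if a finger force lies inside the Coulomb friction cone.

    `force` is the force the finger applies to the object and
    `inward_normal` points from the contact into the object (assumed convention).
    """
    n = _unit(inward_normal)
    f_normal = _dot(force, n)                      # pushing component
    f_tangent = abs(_dot(force, (-n[1], n[0])))    # sliding component
    return f_normal > 0.0 and f_tangent <= mu * f_normal

def two_contact_force_closure(p1, n1, p2, n2, mu):
    """Planar two-contact test: the segment p1-p2 must lie inside both cones."""
    d = _unit((p2[0] - p1[0], p2[1] - p1[1]))      # direction from contact 1 to contact 2
    half_angle = math.atan(mu)                     # friction cone half-angle
    a1 = math.acos(max(-1.0, min(1.0, _dot(_unit(n1), d))))
    a2 = math.acos(max(-1.0, min(1.0, _dot(_unit(n2), (-d[0], -d[1])))))
    return a1 < half_angle and a2 < half_angle

# Jaws on the left and right faces of a box that is 10 cm wide, mu = 0.5.
print(two_contact_force_closure((-5.0, 0.0), (1.0, 0.0), (5.0, 0.0), (-1.0, 0.0), 0.5))  # True
```

A real grasp planner would work in 3D and evaluate a metric such as epsilon quality over the full wrench space; this sketch only illustrates the geometric test.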

The Sim-to-Real Gap in Manipulation

Manipulation policies trained in simulation frequently fail on real hardware because contact physics — friction, deformation, surface texture, object compliance — is difficult to simulate accurately. Small errors in simulated friction coefficients produce very different grasp outcomes in reality. Bridging this gap requires domain randomization (randomizing physics parameters in simulation), careful sim calibration, and often some amount of real-world fine-tuning.
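Domain randomization can be sketched in a few lines. The example below assumes a hypothetical `PhysicsParams` container and a hypothetical `sample_physics` helper; a real simulator exposes its own parameter names, and the 30% spread is an arbitrary illustrative value rather than a recommended setting.

```python
import random
from dataclasses import dataclass

@dataclass
class PhysicsParams:
    """Hypothetical bundle of contact-related simulation parameters."""
    friction: float
    mass_kg: float
    contact_stiffness: float

def sample_physics(rng, nominal, spread=0.3):
    """Scale each nominal parameter by an independent factor in [1 - spread, 1 + spread]."""
    def jitter(value):
        return value * rng.uniform(1.0 - spread, 1.0 + spread)
    return PhysicsParams(
        friction=jitter(nominal.friction),
        mass_kg=jitter(nominal.mass_kg),
        contact_stiffness=jitter(nominal.contact_stiffness),
    )

# Draw a fresh set of physics parameters at the start of every training episode.
rng = random.Random(0)
nominal = PhysicsParams(friction=0.5, mass_kg=0.3, contact_stiffness=1.0e4)
episode_params = [sample_physics(rng, nominal) for _ in range(5)]
```

A policy trained across many such samples never sees one fixed friction value, which is the property that makes it less sensitive to the mismatch between simulated and real contact physics.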

End Effectors

End effectors, the robot's 'hands', vary enormously in design, and the right choice depends heavily on the task.

Parallel-jaw grippers have two flat jaws that close symmetrically. They are simple, reliable, and straightforward to force-control. They work well on objects with parallel flat surfaces or simple convex shapes (boxes, cylinders) but struggle with irregular shapes. They are the dominant gripper in industrial pick-and-place.

Three-finger grippers (like the Robotiq 3-Finger Adaptive Gripper or the BarrettHand) add a third finger, enabling both power grasps (wrapping around objects) and precision pinch grasps (fingertip contact for delicate objects). They cover a wider range of object geometries.

Dexterous hands, like the Shadow Dexterous Hand with 24 joints (20 of them independently actuated), replicate human hand anatomy. They can manipulate objects within the hand, such as rolling a pen or turning a key, without setting them down. But they are mechanically complex, expensive, fragile, and extremely difficult to control. In-hand manipulation remains a frontier research problem.

Soft grippers, made from compliant silicone or pneumatic chambers, use their physical compliance to conform passively to irregular object shapes without complex grasp planning. They are robust to uncertainty in object pose and shape. Their weakness is that they provide less precise force control and lower grasping strength than rigid grippers.

Sensing and Perception for Manipulation

Grasping requires perceiving the object: its position, orientation, geometry, and surface properties. Modern manipulation pipelines typically combine multiple sensing modalities.

Depth cameras (structured-light or time-of-flight) produce point clouds, which are 3D maps of the visible surfaces. Algorithms like PointNet or GraspNet process point clouds directly to predict grasp poses or contact points. Pose estimation algorithms localize known objects by estimating their 6-DOF pose (3D position plus 3D orientation) from the point cloud, which enables more precise grasp planning.

Tactile sensing is the sense robots most conspicuously lack relative to humans. Human fingertips have about 2,500 tactile receptors per square centimeter, providing rich information about contact force, texture, slip, and object hardness. Most commercial grippers have no tactile sensing at all. Research tactile sensors use arrays of pressure-sensitive elements (capacitive or piezoelectric), embedded cameras filming deforming gel pads (GelSight), or barometric chambers. With tactile feedback, robots can detect incipient slip (small, localized micro-slips at the edge of the contact patch that precede full sliding) and increase grip force just enough to prevent a drop, a reflex humans perform unconsciously.

The closed-loop grasp control pipeline is: (1) visual estimation of object pose, (2) grasp synthesis to compute finger placement, (3) arm motion to the pre-grasp configuration, (4) finger closing with force control, (5) tactile monitoring during the lift, and (6) grip adjustment in response to detected slip or load changes.
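Steps (5) and (6) of the pipeline above can be sketched as a simple slip-reactive controller. Everything here is an assumption made for illustration: the function names, the step sizes, the 1 N and 20 N force limits, and the toy slip model (a two-jaw grasp slips whenever total friction, 2 * mu * force, is less than the object's weight). A real system would read slip from a tactile sensor instead of computing it.

```python
def update_grip_force(current_n, slip_detected,
                      step_n=0.5, relax_n=0.05, min_n=1.0, max_n=20.0):
    """One control tick: tighten on slip, otherwise relax slightly.

    Relaxing when no slip is sensed keeps the force near the minimum
    needed, which matters for fragile objects. All gains are illustrative.
    """
    target = current_n + step_n if slip_detected else current_n - relax_n
    return min(max(target, min_n), max_n)

def simulate_lift(weight_n, mu, start_force_n, steps):
    """Toy lift loop with an idealized slip signal for a two-jaw grasp."""
    force = start_force_n
    for _ in range(steps):
        slip = 2.0 * mu * force < weight_n   # stand-in for a tactile slip detector
        force = update_grip_force(force, slip)
    return force

# Start with a light 5 N grip on a 15 N object with mu = 0.5.
final_force = simulate_lift(weight_n=15.0, mu=0.5, start_force_n=5.0, steps=200)
print(f"grip force after lift: {final_force:.2f} N")
```

In this toy model the force climbs until slip stops and then hovers just around the no-slip threshold, which mirrors the "just enough force" behavior described in the text.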

A robot grasps a cylindrical bottle using a parallel-jaw gripper. During the lift, the bottle begins to slowly slide downward in the gripper even though the jaws are applying constant force. What is happening physically, and what is the correct control response?

A robotics team wants to grasp a large variety of irregularly shaped produce items (apples, peppers, pears) in a food-processing facility. Which end effector is most appropriate and why?

Grasp Quality Analysis

In this activity you will analyze grasp quality for simple 2D objects by reasoning about contact forces and friction cones.

  1. Part 1 (friction cone concept): Draw a contact point on a flat surface. The contact normal points upward (perpendicular to the surface). The coefficient of friction is mu = 0.5. The friction cone half-angle is arctan(mu), approximately 26.6 degrees. Draw the friction cone by drawing lines at +26.6 and -26.6 degrees from the normal. Any force direction within this cone can be applied without slip.
  2. Part 2 (parallel-jaw grasp on a box): Draw a rectangle representing a box (10 cm x 6 cm). Draw a parallel-jaw gripper grasping the 6 cm sides (jaws on the left and right). For each contact, draw the friction cone (use mu = 0.5). Is this grasp force-closed? Can it resist a downward pull (gravity) on the box? Can it resist a torque trying to rotate the box clockwise? Explain your reasoning.
  3. Part 3 (compare grasp positions): Now draw the same box but with the gripper grasping off-center (one jaw 1 cm above center, the other 1 cm below). Draw the friction cones. Does moving the contact points off-center change the grasp quality? Why?
  4. Part 4 (fragile object): Suppose the box is a fragile glass container that will crack if the normal contact force at either jaw exceeds 20 N, and gravity pulls it down with 15 N. Given mu = 0.5 and friction acting at both jaws, what is the minimum normal force each jaw must apply to prevent vertical slip? Is this below the fracture limit? What does this tell you about the required friction coefficient for grasping fragile heavy objects?
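To check your hand calculations for Part 1 and Part 4, a few helper functions are enough. This sketch assumes static Coulomb friction, a symmetric grasp in which every contact carries the same normal force, and friction as the only thing holding the object up. The function names are illustrative.

```python
import math

def friction_cone_half_angle_deg(mu):
    """Half-angle of the friction cone, in degrees."""
    return math.degrees(math.atan(mu))

def min_normal_force_per_contact(weight_n, mu, n_contacts=2):
    """Smallest normal force per contact so that total friction balances the weight.

    Total friction available is n_contacts * mu * N, so N >= weight / (mu * n_contacts).
    """
    return weight_n / (mu * n_contacts)

def min_friction_coefficient(weight_n, max_normal_n, n_contacts=2):
    """Smallest mu that holds the weight without exceeding the allowed normal force."""
    return weight_n / (n_contacts * max_normal_n)

# Try your own numbers here, then compare against your drawing and reasoning.
print(friction_cone_half_angle_deg(0.5))
print(min_normal_force_per_contact(weight_n=15.0, mu=0.5))
print(min_friction_coefficient(weight_n=15.0, max_normal_n=20.0))
```

Comparing the second result with the fracture limit shows how much safety margin the grasp has, and the third shows how slippery the jaws could get before a safe grasp becomes impossible.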