AI Agents & Automation


Parallel vs. Sequential Agents

Every multi-agent system must answer a fundamental scheduling question: should these agents run one at a time, or simultaneously? The answer is not a preference — it flows directly from the logical structure of the task. Some subtasks can only begin once another subtask has finished; those must be sequential. Other subtasks have no dependency on each other; those are candidates for parallel execution. Getting this analysis right is the difference between a system that finishes in minutes and one that takes hours for no good reason, and between a system that produces coherent results and one that produces contradictory ones.

Reading the Dependency Graph

The fundamental tool for deciding parallel vs. sequential execution is the task dependency graph, which computer science models as a directed acyclic graph (DAG). Nodes are subtasks. A directed edge from node A to node B means 'B cannot start until A is complete.' Tasks with no incoming edges can begin immediately. Tasks with multiple incoming edges must wait for all predecessors.

Consider a system that produces a research-backed business plan. The dependency structure might look like this:

(1) Identify market size (no dependencies; starts immediately).
(2) Research competitors (no dependencies; starts immediately, in parallel with task 1).
(3) Analyze financial assumptions (depends on tasks 1 and 2; must wait for both).
(4) Draft the executive summary (depends on task 3).

This graph prescribes a clear execution schedule: tasks 1 and 2 run in parallel; task 3 starts only when both complete; task 4 starts only when task 3 completes. If every task takes one time unit, a sequential-only system takes four units, while the optimal schedule takes three, because tasks 1 and 2 overlap. For any multi-agent task, drawing this graph before writing any code reveals the theoretical minimum runtime (the total duration of the longest dependency chain, called the critical path) and shows which agents can be parallelized without affecting correctness.
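The business-plan schedule can be derived mechanically from the graph. The sketch below is one possible way to do it with Python's standard-library `graphlib` module; the task names are illustrative labels chosen here, and it assumes every task takes one time unit, so the number of "waves" equals the critical path length.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (its predecessors).
# Task names are illustrative labels for the four business-plan subtasks.
DEPENDENCIES = {
    "market_size": set(),
    "competitors": set(),
    "financials": {"market_size", "competitors"},
    "executive_summary": {"financials"},
}


def execution_waves(dependencies):
    """Group tasks into waves; all tasks in one wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose predecessors are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave complete, unlocking successors
    return waves


waves = execution_waves(DEPENDENCIES)
sequential_time = len(DEPENDENCIES)  # one unit per task, run one at a time
critical_path_length = len(waves)    # one unit per wave (assumes equal durations)

print(waves)
print(sequential_time, critical_path_length)
```

With real agents the durations differ, so the critical path would be the chain with the largest total duration rather than simply the one with the most tasks.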

Amdahl's Law Applied to Agents

Amdahl's Law from parallel computing states that the speedup from parallelism is limited by the fraction of the task that must remain sequential. If 50% of your agent pipeline is on the critical path and cannot be parallelized, adding more parallel agents can at most double your speed — no matter how many you add. Maximize parallelism by designing for minimal sequential dependencies.
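To make the ceiling concrete, Amdahl's Law is usually written as speedup = 1 / (s + (1 - s) / n), where s is the sequential fraction and n is the number of parallel workers. The helper below is a small illustrative sketch; the function name and the choice to count agents as "workers" are assumptions made for this example.

```python
def amdahl_speedup(sequential_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only the non-sequential part is parallelized."""
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / workers)


# With half the pipeline sequential, the speedup approaches 2 but never reaches it.
for n in (1, 2, 10, 1000):
    print(n, round(amdahl_speedup(0.5, n), 3))
```

Going from 10 to 1000 agents barely moves the result, which is why shortening the sequential portion tends to pay off more than adding agents.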

When Sequential Is the Right Choice

Sequential execution is not a concession; it is often the correct design. Sequential agents are appropriate in three main scenarios.

First, genuine data dependency: the output of step N is literally required as input for step N+1. There is no parallelism available because the logic demands ordering. A pipeline that (a) extracts data from a document, then (b) validates the extracted data, then (c) stores valid data to a database is inherently sequential and should be designed that way.

Second, context refinement: each agent in the sequence enriches or narrows the context so that the next agent can operate more precisely. Consider a pipeline that first classifies a customer intent, then retrieves relevant policy documents based on that classification, then drafts a response using those documents. Each step is more focused because of the preceding step. Running these in parallel would require each agent to operate on raw, unclassified input.

Third, safety and auditability: in high-stakes domains (financial transactions, medical recommendations, legal filings), sequential pipelines with explicit checkpoints between steps allow human review at defined moments. Parallel pipelines are harder to checkpoint because the state of the system is spread across simultaneously executing agents.
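The extract, validate, store pipeline and the checkpoint idea can be sketched in a few lines. Everything here is hypothetical: the three stage functions stand in for real agents, the `key=value` document format is invented for the example, and `approve` represents whatever review step (human or automated) a real system would use.

```python
from typing import Callable, Optional


def extract(document: str) -> dict:
    """Hypothetical extraction agent: parse 'key=value' pairs separated by ';'."""
    pairs = (item.split("=", 1) for item in document.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def validate(record: dict) -> dict:
    """Hypothetical validation agent: require a numeric 'amount' field."""
    if not record.get("amount", "").isdigit():
        raise ValueError("record needs a numeric 'amount' field")
    return record


def store(record: dict, database: list) -> int:
    """Hypothetical storage step: append to an in-memory list, return row count."""
    database.append(record)
    return len(database)


def run_pipeline(
    document: str, database: list, approve: Callable[[dict], bool]
) -> Optional[int]:
    # Each stage consumes the previous stage's output, so the order is fixed.
    record = validate(extract(document))
    # Checkpoint: the state is fully known here, so a reviewer can inspect it.
    if not approve(record):
        return None
    return store(record, database)


db: list = []
print(run_pipeline("name=Acme; amount=500", db, approve=lambda r: True))
```

Because only one stage runs at a time, the checkpoint sees the complete state of the pipeline, which is the property the auditability argument relies on.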

When Parallel Is the Right Choice

Parallel execution is appropriate when subtasks are genuinely independent (each can be completed without any result from the others) and when the latency savings justify the additional coordination complexity. Common scenarios include:

Fan-out information gathering: a supervisor dispatches research tasks to multiple agents simultaneously (e.g., look up this company in five different databases at once) and aggregates results once all return. The tasks are fully independent; parallel execution is almost always the right choice.

Redundant computation for reliability: running the same task on two or more agents simultaneously and selecting the result that the majority agree on (quorum voting) or the result that passes a validation check. This trades additional compute cost for higher reliability and is used in high-stakes pipelines.

Multi-perspective analysis: generating multiple independent analyses of the same input (e.g., a legal agent, a financial agent, and a risk agent all analyze the same merger proposal simultaneously) and then presenting all perspectives to the integration stage. Each agent's reasoning is uncontaminated by the others' conclusions, which is often desirable.
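Two of these patterns are short enough to sketch with Python's `asyncio`. The `lookup` coroutine is a hypothetical stand-in for a research agent (the sleep simulates a network or model call), and `majority_vote` is one simple interpretation of quorum voting that requires a strict majority; real systems may use other rules.

```python
import asyncio
from collections import Counter


async def lookup(source: str, company: str) -> str:
    """Hypothetical research agent; the sleep stands in for real I/O."""
    await asyncio.sleep(0.01)
    return f"{source}: record for {company}"


async def fan_out(company: str, sources: list) -> list:
    # All lookups start together; gather returns results in input order.
    return await asyncio.gather(*(lookup(s, company) for s in sources))


def majority_vote(answers: list):
    """Return the answer given by a strict majority, or None if there is none."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count > len(answers) / 2 else None


results = asyncio.run(fan_out("Acme", ["registry", "filings", "news"]))
print(results)
print(majority_vote(["approve", "approve", "reject"]))
```

Returning `None` when no strict majority exists forces the caller to decide what to do on disagreement, for example escalating to a human or running an additional agent.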

Match each scenario to the correct execution pattern and the reason why.

Terms

Extracting, then validating, then storing a document
Checking a company's financial, legal, and technical status simultaneously
Running two agents on the same medical diagnosis for quorum voting
Classifying intent, then retrieving policy, then drafting a response
Translating a document into ten languages at once

Definitions

Parallel — the three analyses are fully independent and can proceed concurrently
Sequential — each agent narrows context for the next, making it more precise
Parallel redundant — simultaneous independent results improve reliability
Sequential — each step requires the prior step's output as its input
Parallel — each translation is independent and has no dependency on the others


Parallelism Creates Coordination Problems

Every parallel branch must eventually converge. The aggregation step — waiting for all parallel agents, handling stragglers, reconciling conflicting outputs — is genuinely complex. Do not parallelize tasks whose outputs are hard to reconcile. The coordination cost can exceed the latency benefit.
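Straggler handling is one part of this coordination cost that is easy to illustrate. The sketch below waits for parallel agents up to a deadline, keeps whatever finished, and cancels the rest. The `agent` coroutine, the job names, and the deadline value are all hypothetical, and the sketch assumes agents do not raise exceptions.

```python
import asyncio


async def agent(name: str, delay: float) -> str:
    """Hypothetical agent; the delay simulates how long its work takes."""
    await asyncio.sleep(delay)
    return f"{name} finished"


async def gather_with_deadline(jobs: dict, deadline: float):
    # Map each running task back to its job name for reporting.
    tasks = {
        asyncio.create_task(agent(name, delay)): name
        for name, delay in jobs.items()
    }
    done, pending = await asyncio.wait(list(tasks), timeout=deadline)
    for task in pending:
        task.cancel()  # stop stragglers instead of waiting indefinitely
    if pending:
        # Let the cancellations settle before returning.
        await asyncio.gather(*pending, return_exceptions=True)
    results = {tasks[t]: t.result() for t in done}
    stragglers = sorted(tasks[t] for t in pending)
    return results, stragglers


results, stragglers = asyncio.run(
    gather_with_deadline({"fast": 0.01, "slow": 5.0}, deadline=0.2)
)
print(results, stragglers)
```

Whether a partial result set is acceptable depends on the task: a news summary may tolerate a missing source, while a quorum vote may not.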

A directed ______ graph shows the dependency relationships among subtasks and determines which can run in parallel. The ______ path is the longest dependency chain in the graph and sets the minimum possible runtime. Running the same task on multiple agents simultaneously to select the most reliable result is called ______ computation. A task that cannot start until two other tasks are both complete is said to have ______ dependencies.

A data pipeline has four steps: (A) download raw data, (B) clean the data, (C) run statistical analysis, (D) generate a visualization. Each step depends on the previous. A developer proposes running A, B, C, and D in parallel to speed up the pipeline. What is wrong with this proposal?

An AI news aggregation system must (1) fetch headlines from 50 different news sites, (2) de-duplicate the headlines, (3) rank them by relevance. What is the optimal execution strategy?

Draw the Execution DAG

You are designing an automated grant application review system for a foundation. The system must evaluate each application across four independent dimensions (financial health of the applicant, alignment with foundation priorities, feasibility of the proposed project, and strength of the team) and then produce a final holistic recommendation.

  1. List all the agent tasks in the system.
  2. Draw the dependency graph (boxes and arrows). Which tasks have no dependencies and can start immediately? Which must wait?
  3. Identify the critical path. What is the minimum number of sequential steps this pipeline requires?
  4. For each agent, decide: specialist with narrow focus, or generalist? Why?
  5. The foundation adds a requirement: a human program officer must review the holistic recommendation before it is sent to the applicant. Where does this checkpoint fit in your DAG, and does it change the critical path?
  6. Present your DAG to the class and defend your parallelism decisions.