🧡 Most People Overcomplicate This System Design Question

“Design a pipeline: Process 1 → Process 2 → Final Output”

I got asked this. Here’s everything I learned, including where I went wrong, what the interviewer was actually probing, and the mental models that finally made it click.


First: What the Interviewer Is Really Testing

Before you touch queues or workers, the interviewer expects you to ask:

  • Is Process 2 dependent on Process 1 for the same item?
  • Does order matter?
  • Can Processes 1 and 2 run in parallel on different items?
  • What happens on failure or retry?

Most people skip this. That’s the first mistake.

Good system design starts with clarifying the problem, not announcing a solution.


My Initial Answer: What Was Right

My instinct was: “Use a queue-based, write-heavy architecture. Assign workers to each process.”

That’s not wrong.

Queue-based architecture is the correct category here. Users initiate requests, messages are stored and queued, and workers pull messages from the queue, process them, and store results. The core benefits:

  • Backpressure handling
  • Retry and failure isolation
  • Scalability
  • Write-heavy friendliness

So the instinct was right. The problem was the next part.
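As a sketch, that pull, process, store loop looks like the following. This is a toy in-process version using Python's `queue` module; the `process` function and the message names are illustrative stand-ins, not from the interview.

```python
import queue

def process(message: str) -> str:
    # Stand-in for the real work each worker performs.
    return message.upper()

def run_worker(q: queue.Queue, results: list) -> None:
    # Drain the queue; a real worker would block on q.get() indefinitely.
    while not q.empty():
        msg = q.get()
        results.append(process(msg))  # process, then store the result
        q.task_done()

# Users initiate requests -> messages are queued
q: queue.Queue = queue.Queue()
for msg in ["order-1", "order-2"]:
    q.put(msg)

results: list = []
run_worker(q, results)
print(results)  # ['ORDER-1', 'ORDER-2']
```

Backpressure falls out of this shape naturally: producers enqueue at their own rate, and workers drain at theirs.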


Where It Got Weak: Single Queue

I said: “Single queue, with workers assigned to the processes.”

The interviewer pushed: “Single or multiple queues?”

I said single queue. And that’s where I under-explained.

A single queue without explicit state management implies:

Queue
 ├── Job A (P1? P2? Who knows?)
 ├── Job B
 └── Job C

Workers must:
  - Check which stage the job is in
  - Branch their logic
  - Manage ordering carefully
  - Risk running P2 before P1

This works, but it pushes complexity into code, not architecture.


What the Interviewer Wanted to Hear

Option A (The Expected Answer): Multi-Queue Pipeline

Queue_P1 → Workers_P1 → Queue_P2 → Workers_P2 → Output

This is the textbook answer, and it’s textbook for good reason.

Pipeline architecture breaks work into stages, like an assembly line. Each stage is independent: it reads from an input queue, transforms data, and writes to an output queue. While Stage 1 processes new incoming events, Stage 2 is processing the previous batch simultaneously, improving throughput.

Why interviewers love multi-queue:

Where is the job?     → Which queue it's in tells you
What stage is it in?  → The queue IS the stage
How do you retry?     → Retry within that queue only
How do you scale?     → Add workers per queue independently

The insight that finally clicked:

Multi-queue = workflow in architecture. Single queue = workflow in code.
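A minimal sketch of that multi-queue shape, assuming an in-process `queue.Queue` stands in for a real broker (SQS, RabbitMQ, and so on). The stage transforms here are invented purely for illustration.

```python
import queue

# The queue an item sits in *is* its stage: no stage field, no branching.
queue_p1: queue.Queue = queue.Queue()
queue_p2: queue.Queue = queue.Queue()
output: list = []

def worker_p1() -> None:
    # Does exactly one thing: Process 1, then hand off to the next queue.
    while not queue_p1.empty():
        item = queue_p1.get()
        queue_p2.put(item + ":p1")

def worker_p2() -> None:
    # Does exactly one thing: Process 2, then emit the final output.
    while not queue_p2.empty():
        item = queue_p2.get()
        output.append(item + ":p2")

for job in ["a", "b"]:
    queue_p1.put(job)

worker_p1()
worker_p2()
print(output)  # ['a:p1:p2', 'b:p1:p2']
```

Note that each worker function knows nothing about the other stage; that separation is exactly what makes per-stage scaling and retries straightforward.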

Option B: Single Queue, But Only with Explicit State

If you say single queue, you must immediately add the state model:

{
  "job_id": "abc123",
  "stage": "PROCESS_1"
}

Worker logic:

Pull message
Check stage
→ If PROCESS_1: do P1 → re-enqueue with PROCESS_2
→ If PROCESS_2: do P2 → finalize

This works. But now:

  • Workers are smarter (more complex)
  • Failures are harder to isolate
  • Observability requires extra tooling
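For contrast, here is a sketch of that single-queue worker, branching on the stage field carried in each message; the handler body and re-enqueue logic are my own illustrative filling.

```python
import queue

q: queue.Queue = queue.Queue()
finalized: list = []

def handle(msg: dict) -> None:
    # The worker must branch on stage: workflow lives in code, not architecture.
    if msg["stage"] == "PROCESS_1":
        # Do P1, then re-enqueue the same job for Process 2.
        q.put({"job_id": msg["job_id"], "stage": "PROCESS_2"})
    elif msg["stage"] == "PROCESS_2":
        # Do P2, then finalize.
        finalized.append(msg["job_id"])

q.put({"job_id": "abc123", "stage": "PROCESS_1"})
while not q.empty():
    handle(q.get())

print(finalized)  # ['abc123']
```

The branching is harmless at two stages, but every new stage grows this `if/elif` chain, which is the complexity-in-code trade-off described above.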

The Queue vs. Scheduler Question

Midway through, I asked: “Can we use a scheduler for this?”

Good question to ask. Wrong tool for this use case.

In an event-driven pipeline, the pipeline processes events immediately as they occur. A scheduler is designed for time-based execution β€” running code at fixed times or intervals.

Your pipeline is event-driven, not time-driven:

Queue     → reacts to events  ✅ for Process 1 → Process 2
Scheduler → reacts to time    ✅ for retries, batch, nightly jobs

Where a scheduler makes sense alongside the pipeline:

Queue_P1 → Workers → Queue_P2 → Workers → Output
                 ↑
        Scheduler (retries + batch jobs only)

When to use the scheduler:

  • Re-run failed jobs after N minutes
  • Batch expensive LLM calls at off-peak hours
  • Nightly reprocessing jobs

Pipelines use queues. Schedulers are for time-based edge cases.
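To make the time-based role concrete, here is a toy retry scheduler using a heap of due times, with simulated minutes instead of a real clock. `RETRY_DELAY`, the job ID, and the timestamps are all hypothetical.

```python
import heapq

RETRY_DELAY = 5  # minutes; assumed value for "re-run failed jobs after N minutes"

schedule: list = []  # heap of (run_at_minute, job_id)

def schedule_retry(job_id: str, failed_at: int) -> None:
    # A scheduler reacts to time: record *when* the job should run again.
    heapq.heappush(schedule, (failed_at + RETRY_DELAY, job_id))

def due_jobs(now: int) -> list:
    # Pop every job whose retry time has arrived; these get re-enqueued.
    due = []
    while schedule and schedule[0][0] <= now:
        due.append(heapq.heappop(schedule)[1])
    return due

schedule_retry("abc123", failed_at=0)
print(due_jobs(now=3))  # [] -- too early, nothing is due yet
print(due_jobs(now=5))  # ['abc123']
```

The output of `due_jobs` would feed back into the relevant stage queue, which keeps the scheduler in its lane: it decides *when*, the queue decides *what happens next*.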


The Scaling Question: Where I Was Right

Later I said: “If Process 1 needs more computation, I’d give more workers to that process.”

That’s correct. But it answered a different layer of the question.

The interviewer was asking about user visibility: how does the user know it’s happening?

I was answering infrastructure.

Here’s how to connect both:

"Because each stage can scale independently, I'd also expose
 per-stage progress to the user, for example showing that
 Process 1 is complete and Process 2 is still running."

During a big sale, if thousands of orders come in per minute, they just queue up and all the downstream services process at whatever rate they can. The user immediately gets an order confirmation page because the front-end isn’t waiting on all those processes to finish.

That’s the user experience model to follow.


User Visibility: The Follow-Up Question

After the pipeline design, they asked:

“How would you tell the user this is being done?”

This is about state exposure, not queues.

Step 1: Create a Job record when user triggers the request
        POST /generate-file → returns job_id

Step 2: Update job status as workers complete stages

        QUEUED
          ↓
        PROCESS_1_RUNNING (30%)
          ↓
        PROCESS_1_DONE
          ↓
        PROCESS_2_RUNNING (70%)
          ↓
        COMPLETED (100%)

Step 3: User polls or receives push updates

        GET /jobs/{job_id}/status
        → { "status": "PROCESS_2_RUNNING", "progress": 70 }
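A sketch of the state-exposure side: workers write job status to a store, and the status endpoint only ever reads it. An in-memory dict stands in for the database here, and the function names are hypothetical.

```python
jobs: dict = {}  # stand-in for a jobs table in a real database

def create_job(job_id: str) -> None:
    # Step 1: record the job as soon as the user triggers the request.
    jobs[job_id] = {"status": "QUEUED", "progress": 0}

def update_job(job_id: str, status: str, progress: int) -> None:
    # Step 2: workers call this as stages complete.
    jobs[job_id] = {"status": status, "progress": progress}

def get_status(job_id: str) -> dict:
    # Step 3: what GET /jobs/{job_id}/status would return.
    return jobs[job_id]

create_job("abc123")
update_job("abc123", "PROCESS_2_RUNNING", 70)
print(get_status("abc123"))  # {'status': 'PROCESS_2_RUNNING', 'progress': 70}
```

The important property: the frontend only ever talks to this state, never to the queues themselves.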

Three valid approaches:

Approach                     When to use
Polling                      Most common, simplest, always acceptable
WebSockets / SSE             Long-running tasks, real-time UX matters
Notification on completion   Very long jobs (email, push notification)

What candidates miss is failure. Always surface it in the status response:

{
  "status": "FAILED",
  "error": "Process 2 timed out",
  "retry_available": true
}

You never expose queues to users. You expose state.


Why Multi-Queue Simplifies Everything

The key insight, visually:

Single queue:

  GroceryOrder ──┐
  FoodOrder ─────┼─▶ Workers check stage → branch logic → manage state
  MedicineOrder ─┘    (workflow lives in code)


Multi-queue:

  Queue_P1 → Workers_P1 (do one thing) → Queue_P2 → Workers_P2
             (workflow lives in architecture)

Design your pipeline to handle failures gracefully without compromising data integrity: implement retries, dead-letter queues, and circuit breakers.

With multi-queue, all of this becomes stage-isolated:

  • Failure in P2 → retry within Queue_P2 only
  • P1 is untouched
  • Dead-letter queue per stage
  • Scale each stage independently
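A sketch of stage-isolated retries with a per-stage dead-letter queue; the retry budget (`MAX_ATTEMPTS`) and the simulated permanent failure are assumptions for illustration.

```python
import queue

MAX_ATTEMPTS = 3  # assumed retry budget per message

queue_p2: queue.Queue = queue.Queue()
dlq_p2: queue.Queue = queue.Queue()  # dead-letter queue for stage 2 only

def process_p2(msg: dict) -> None:
    # Simulate a message that can never be processed successfully.
    raise RuntimeError("simulated permanent failure")

def drain_with_retries() -> None:
    while not queue_p2.empty():
        msg = queue_p2.get()
        try:
            process_p2(msg)
        except RuntimeError:
            msg["attempts"] += 1
            if msg["attempts"] >= MAX_ATTEMPTS:
                dlq_p2.put(msg)    # park it for inspection; stop retrying
            else:
                queue_p2.put(msg)  # retry within Queue_P2 only

queue_p2.put({"job_id": "abc123", "attempts": 0})
drain_with_retries()
print(dlq_p2.qsize())  # 1
```

Queue_P1 never appears in this code at all, which is the point: a poison message in stage 2 cannot stall or pollute stage 1.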

The Trade-Off Table (What Interviewers Want)

                 Multi-Queue              Single Queue
Complexity       In architecture          In code
Retries          Stage-isolated           Manual state management
Scaling          Per-queue, obvious       Filtered by worker type
Observability    Queue depth = progress   Needs external state
Infra overhead   Higher                   Lower
Good for         Complex pipelines        Simple, 2-stage flows

Start with multi-queue. Move to single queue only when infra cost matters more than clarity.


The Interview-Ready Answer (Say This Next Time)

“I’d default to multiple queues forming a pipeline: one for Process 1 and one for Process 2. This keeps stages isolated, simplifies retries, and lets me scale each step independently based on its resource profile.

A single queue could work if each message carries explicit stage metadata, but that moves complexity into worker logic and makes failure handling harder.

For user visibility, I’d track job state in a DB, expose a status endpoint, and let the frontend poll or listen via WebSocket.”

That answer covers infrastructure, trade-offs, and UX in one go.


Key Takeaways

1. Queues move work
   State tracks progress
   Users see state, not queues

2. Multi-queue = workflow in architecture
   Single queue = workflow in code

3. Pipelines use queues (event-driven)
   Schedulers are for time-based work

4. Scale workers per stage, not globally

5. Always mention:
   → Retries
   → Dead-letter queues
   → Idempotency
   → User-facing status

Common Mistakes to Avoid

❌ Using a scheduler for real-time pipeline steps
❌ Single queue without explaining stage/state
❌ Not mentioning failure handling
❌ Answering infra without connecting to user experience
❌ Over-engineering before clarifying the problem


If you’re preparing for system design interviews, remember that the gap is rarely knowledge. It’s connecting infrastructure decisions to user experience. That’s the layer interviewers are always probing.