Gall's Law in Software Development: Why Simple Systems Win

Understanding why complex systems that work always evolved from simple systems that worked—and why you can't skip that evolution.

I've watched countless projects fail because developers tried to build the "perfect" system from day one. Then I discovered Gall's Law, and everything clicked.

What is Gall's Law?

John Gall, in his 1975 book "Systemantics," stated:

"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system."

Let me show you why this matters more than any design pattern you'll ever learn.

The Microservices Trap: A Real Story

A startup I consulted for wanted to "do things right from the start." Their plan for a basic task management app:

Planned Architecture (Day One):
├── User Service (Go)
├── Task Service (Node.js)
├── Notification Service (Python)
├── Analytics Service (Rust)
├── API Gateway (Kong)
├── Message Queue (RabbitMQ)
├── Service Mesh (Istio)
├── Kubernetes Cluster
└── Distributed Tracing (Jaeger)

Six months later: They had zero users, mounting AWS bills, and a team burned out from dealing with distributed systems complexity.

What actually worked:

// Week one: Simple Next.js app
// app/api/tasks/route.ts
import { prisma } from "@/lib/db";
 
export async function GET(request: Request) {
  const tasks = await prisma.task.findMany();
  return Response.json(tasks);
}
 
export async function POST(request: Request) {
  const data = await request.json();
  const task = await prisma.task.create({ data });
  return Response.json(task);
}

They shipped in two weeks. Got real users. Found actual problems. Then evolved.

Why We Overengineer

Before diving into solutions, let's understand why we fall into this trap:

Resume-Driven Development:

  • "I need Kafka on my CV"
  • "Everyone's using Kubernetes"
  • "This will make me look senior"

The Future-Proofing Fallacy:

  • "We'll need to scale to millions"
  • "What if we need multiple databases?"
  • "Better build it right the first time"

The Hacker News Effect:

  • Seeing Netflix's architecture and thinking you need the same
  • Reading about problems you don't have
  • Solutions looking for problems

I've been guilty of all three. Here's what I learned.

The Simple System Ladder

Instead of building complexity upfront, I now follow this progression:

Level One: Monolith with SQLite

Start here. Yes, really.

// lib/db.ts
import Database from "better-sqlite3";
 
const db = new Database("app.db");
 
export function getTasks(userId: string) {
  return db.prepare("SELECT * FROM tasks WHERE user_id = ?").all(userId);
}
 
export function createTask(userId: string, title: string) {
  return db
    .prepare("INSERT INTO tasks (user_id, title) VALUES (?, ?)")
    .run(userId, title);
}

When to stay here:

  • Fewer than 1,000 users
  • Fewer than 100,000 database rows
  • Single developer or small team
  • MVP or proof of concept

Real talk: SQLite can handle millions of rows. Your startup probably won't get there.

Level Two: Monolith with Postgres

Migrate when you actually need it:

// lib/db.ts
import { Pool } from "pg";
 
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
});
 
export async function getTasks(userId: string) {
  const result = await pool.query("SELECT * FROM tasks WHERE user_id = $1", [
    userId,
  ]);
  return result.rows;
}

When to migrate:

  • Multiple servers need database access
  • You need better tooling or ecosystem
  • You're actually experiencing SQLite limitations
  • Team is comfortable with Postgres

Don't migrate because:

  • Someone said "SQLite isn't production-ready"
  • You read a blog post
  • It feels more "professional"
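
One detail makes this migration easier than it looks: the Level One functions are synchronous (better-sqlite3), while the Level Two versions return promises (pg). Here's a minimal sketch, assuming you're willing to declare the SQLite-backed functions `async` from day one so callers never have to change. The `Task` type, the `rows` array, and `getTasksSync` are hypothetical stand-ins for the real database call.

```typescript
// Hypothetical row shape for illustration only.
type Task = { id: number; userId: string; title: string };

// Stand-in for the synchronous better-sqlite3 query from Level One.
const rows: Task[] = [{ id: 1, userId: "u1", title: "Ship MVP" }];

function getTasksSync(userId: string): Task[] {
  return rows.filter((row) => row.userId === userId);
}

// Declared async even though the work is synchronous, so the signature
// already matches the Level Two version: (userId) => Promise<Task[]>.
async function getTasks(userId: string): Promise<Task[]> {
  return getTasksSync(userId);
}
```

Callers write `await getTasks(userId)` on both levels, so the swap to Postgres stays inside `lib/db.ts`.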

Level Three: Modular Monolith

Break internal concerns into modules, not services:

// features/tasks/service.ts
import type { PrismaClient } from "@prisma/client";
import { prisma } from "@/lib/db";

// Assumed shape; adjust to your schema
export interface CreateTaskDto {
  title: string;
}

export class TaskService {
  constructor(private db: PrismaClient) {}

  async getTasks(userId: string) {
    return this.db.task.findMany({ where: { userId } });
  }

  async createTask(userId: string, data: CreateTaskDto) {
    // Validation
    // Business logic
    // Database operation
    return this.db.task.create({ data: { ...data, userId } });
  }
}

// Shared instance used by the rest of the app
export const taskService = new TaskService(prisma);

// features/notifications/service.ts
export class NotificationService {
  async notify(userId: string, message: string) {
    // Send notification
  }
}

export const notificationService = new NotificationService();

// features/tasks/api.ts
import { taskService, type CreateTaskDto } from "./service";
import { notificationService } from "@/features/notifications/service";

export async function createTask(userId: string, data: CreateTaskDto) {
  const task = await taskService.createTask(userId, data);
  await notificationService.notify(userId, "Task created!");
  return task;
}

Benefits:

  • Clear boundaries
  • Easy to test
  • Simple to refactor
  • One deployment
  • No network calls

When to use:

  • Multiple developers
  • Growing codebase
  • Need clear separation
  • Not ready for microservices
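
To make "easy to test" concrete, here's a rough sketch of a service exercised against an in-memory fake instead of a real database. Everything here is an assumption for illustration: `TaskDb` is a hypothetical narrow interface (the service above talks to Prisma directly), and the trimming rule is just an example of business logic worth testing.

```typescript
// Hypothetical shapes, not part of the article's codebase.
interface Task {
  id: number;
  userId: string;
  title: string;
}

interface TaskDb {
  findMany(userId: string): Promise<Task[]>;
  create(task: Omit<Task, "id">): Promise<Task>;
}

class TaskService {
  private readonly db: TaskDb;

  constructor(db: TaskDb) {
    this.db = db;
  }

  async createTask(userId: string, title: string): Promise<Task> {
    // Example business rule: titles are trimmed and must not be empty.
    const trimmed = title.trim();
    if (trimmed.length === 0) throw new Error("Title is required");
    return this.db.create({ userId, title: trimmed });
  }

  getTasks(userId: string): Promise<Task[]> {
    return this.db.findMany(userId);
  }
}

// In-memory fake: enough to exercise the service without any infrastructure.
function createFakeDb(): TaskDb {
  const tasks: Task[] = [];
  return {
    async findMany(userId) {
      return tasks.filter((task) => task.userId === userId);
    },
    async create(task) {
      const created = { id: tasks.length + 1, ...task };
      tasks.push(created);
      return created;
    },
  };
}
```

A test then builds `new TaskService(createFakeDb())` and asserts on the results. No containers, no network, no second service to boot.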

Level Four: Extract Critical Services

Only when you have evidence:

// When notifications are slowing down task creation...
 
// tasks-service/src/api/tasks.ts
export async function createTask(userId: string, data: CreateTaskDto) {
  const task = await db.task.create({ data: { ...data, userId } });
 
  // Queue notification instead of blocking
  await queue.publish("notifications", {
    type: "task.created",
    userId,
    taskId: task.id,
  });
 
  return task;
}
 
// notification-service/src/worker.ts
queue.subscribe("notifications", async (message) => {
  await sendEmail(message.userId, message.taskId);
});

When to extract:

  • Service has different scaling needs
  • Different deployment cycles needed
  • Performance bottleneck identified
  • Team size justifies complexity
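
Before reaching for a broker, you can often get the same decoupling inside the monolith. Below is a minimal sketch of an in-process queue with the same `publish`/`subscribe` shape as the example above. `InProcessQueue`, `drain`, and the `TaskCreated` message type are hypothetical names, and this version has no persistence or retries, so treat it as a stepping stone rather than a replacement for RabbitMQ.

```typescript
// Hypothetical message shape matching the "task.created" event above.
type TaskCreated = { type: "task.created"; userId: string; taskId: number };

type Handler<T> = (message: T) => Promise<void> | void;

class InProcessQueue<T> {
  private handlers = new Map<string, Handler<T>[]>();
  private pending: Promise<void> = Promise.resolve();

  subscribe(topic: string, handler: Handler<T>): void {
    const existing = this.handlers.get(topic) ?? [];
    this.handlers.set(topic, [...existing, handler]);
  }

  // Returns immediately; handlers run afterwards, one message at a time.
  publish(topic: string, message: T): void {
    this.pending = this.pending.then(async () => {
      for (const handler of this.handlers.get(topic) ?? []) {
        try {
          await handler(message);
        } catch (error) {
          // A failing handler must not block delivery of later messages.
          console.error(`Handler for "${topic}" failed`, error);
        }
      }
    });
  }

  // Resolves once everything published so far has been handled
  // (useful in tests and during graceful shutdown).
  drain(): Promise<void> {
    return this.pending;
  }
}
```

If notifications later need their own deployment, the call sites already speak `publish` and `subscribe`, so only the queue implementation has to change.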

Real-World Evolution: How Stripe Did It

Stripe didn't start with microservices. Here's a simplified sketch of that kind of evolution (the code is illustrative, not Stripe's actual source):

2010 - Simple Rails App:

# One file did everything
class ChargesController < ApplicationController
  def create
    charge = Stripe::Charge.create(
      amount: params[:amount],
      currency: 'usd',
      source: params[:token]
    )
    render json: charge
  end
end

2012 - Modular Monolith:

# Separated concerns, same codebase
module Billing
  class ChargeService
    def create_charge(params)
      # Logic here
    end
  end
end
 
module Fraud
  class DetectionService
    def check_transaction(charge)
      # Logic here
    end
  end
end

2014 - Strategic Service Extraction: Only after identifying bottlenecks, they extracted specific services like fraud detection and webhooks.

Key insight: They waited until they had real data about what needed to scale independently.

The Cost of Complexity

Let me quantify what premature complexity costs:

Development Speed

Simple system:

// Change takes 5 minutes
export async function createTask(data: CreateTaskDto) {
  return db.task.create({ data });
}

Complex system:

// Same change takes 2 hours
// 1. Update task service
// 2. Update API gateway config
// 3. Update message schemas
// 4. Update consumer services
// 5. Deploy five services
// 6. Monitor distributed traces
// 7. Debug cross-service issues

Cognitive Load

Simple system:

  • One codebase to understand
  • Straightforward debugging
  • Clear error messages
  • Local testing works

Complex system:

  • Multiple repos to navigate
  • Distributed debugging
  • Network errors vs. logic errors
  • Requires full infrastructure to test

Infrastructure Costs

My SaaS app comparison:

Simple (monolith on managed hosting):

Vercel Hobby: $0/month
Postgres: $25/month
Total: $25/month for 10,000 users

Complex (Microservices):

Kubernetes cluster: $200/month
RabbitMQ: $50/month
Redis: $30/month
Monitoring: $50/month
Service mesh: $100/month
Total: $430/month for 10,000 users

ROI of complexity: Negative until you hit scale.
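
Worked out per user, the gap is easier to see. This small calculation uses only the line items listed above; the variable names are mine.

```typescript
// Monthly line items in USD, copied from the two lists above.
const simple = { hosting: 0, postgres: 25 };
const complex = { kubernetes: 200, rabbitmq: 50, redis: 30, monitoring: 50, serviceMesh: 100 };
const users = 10000;

function monthlyTotal(items: Record<string, number>): number {
  return Object.values(items).reduce((sum, cost) => sum + cost, 0);
}

const simpleTotal = monthlyTotal(simple); // 25
const complexTotal = monthlyTotal(complex); // 430

const simplePerUser = simpleTotal / users; // 0.0025 USD per user per month
const complexPerUser = complexTotal / users; // 0.043 USD per user per month
const ratio = complexTotal / simpleTotal; // 17.2 times more expensive
const extraPerYear = (complexTotal - simpleTotal) * 12; // 4860 USD per year
```

That is roughly 17 times the spend for the same 10,000 users, before counting the engineering hours needed to operate it.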

How to Apply Gall's Law Practically

Here's my decision framework:

Starting a New Project

// ✅ Do this first
const app = {
  framework: "Next.js",
  database: "SQLite or Postgres",
  hosting: "Vercel or single VPS",
  architecture: "Monolith",
  queue: "None (add when needed)",
  cache: "None (add when needed)",
};
 
// ❌ Don't do this first
const overengineered = {
  services: ["user", "auth", "api", "worker", "analytics"],
  databases: ["Postgres", "Redis", "MongoDB"],
  infrastructure: ["Kubernetes", "Istio", "Kafka"],
  monitoring: ["Prometheus", "Grafana", "Jaeger"],
};

When You Actually Need Complexity

Ask these questions before adding complexity:

Is this a real problem or a hypothetical one?

// ❌ Hypothetical
"We might need to scale to millions of users";
 
// ✅ Real
"We have 500,000 users and the database is at 80% CPU";

Have you measured it?

// ❌ No data
"This endpoint feels slow";
 
// ✅ Has data
"P95 latency is 3s, target is 500ms, profiler shows DB query takes 2.8s";

Can you solve it simply first?

// ❌ Jump to microservices
"Let's extract this to a separate service";
 
// ✅ Try simple solutions
"Let's add an index / cache this query / optimize the query";

Will it reduce complexity elsewhere?

// ✅ Good reason
"Extracting email service lets us deploy notifications without touching payments";
 
// ❌ Bad reason
"Microservices are best practice";

Patterns for Simple Systems

Here are patterns I use to keep systems simple:

The Feature Flag Pattern

Add complexity behind flags:

// config/features.ts
export const features = {
  useCache: process.env.USE_CACHE === "true",
  useQueue: process.env.USE_QUEUE === "true",
  useEmailService: process.env.USE_EMAIL_SERVICE === "true",
};
 
// lib/email.ts
import { features } from "@/config/features";
 
export async function sendEmail(to: string, subject: string, body: string) {
  if (features.useEmailService) {
    // Call external service
    return await emailService.send({ to, subject, body });
  }
 
  // Simple implementation
  return await smtp.send({ to, subject, body });
}

Benefits:

  • Test complex systems in production
  • Easy rollback
  • Gradual migration
  • A/B test architectures
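
For the "gradual migration" case, a boolean flag is often too coarse. Here's one possible sketch of a percentage rollout, assuming you're fine with a simple deterministic hash of the user ID. `bucketFor` and `isEnabledFor` are hypothetical helpers, and the hash is not cryptographic; it only needs to be stable.

```typescript
// Maps a user ID to a stable bucket in the range [0, 100).
function bucketFor(userId: string): number {
  let hash = 0;
  for (let i = 0; i < userId.length; i++) {
    // Keep the running value small so it never loses integer precision.
    hash = (hash * 31 + userId.charCodeAt(i)) % 1000003;
  }
  return hash % 100;
}

// True when the user falls inside the rollout percentage
// (0 disables the feature, 100 enables it for everyone).
function isEnabledFor(userId: string, rolloutPercent: number): boolean {
  return bucketFor(userId) < rolloutPercent;
}
```

Because the bucket depends only on the user ID, a given user stays on the same side of the flag as you raise the percentage from 5 to 25 to 100, and setting it back to 0 is the rollback.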

The Adapter Pattern

Abstract complexity behind simple interfaces:

// lib/storage/interface.ts
export interface Storage {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
}
 
// lib/storage/memory.ts
export class MemoryStorage implements Storage {
  private store = new Map<string, string>();
 
  async get(key: string) {
    return this.store.get(key) ?? null;
  }
 
  async set(key: string, value: string) {
    this.store.set(key, value);
  }
 
  async delete(key: string) {
    this.store.delete(key);
  }
}
 
// lib/storage/redis.ts
export class RedisStorage implements Storage {
  async get(key: string) {
    return await redis.get(key);
  }
 
  async set(key: string, value: string) {
    await redis.set(key, value);
  }
 
  async delete(key: string) {
    await redis.del(key);
  }
}
 
// lib/storage/index.ts
export const storage: Storage =
  process.env.USE_REDIS === "true" ? new RedisStorage() : new MemoryStorage();

Start simple, swap when needed.
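
The payoff shows up in the calling code, which never learns which backend it got. As a rough illustration, here's a hypothetical `getOrCompute` helper written only against the interface. I've renamed the interface to `KeyValueStore` in this standalone snippet so it doesn't clash with the browser's global `Storage` type; inside a module, the original name works fine.

```typescript
// Same three methods as the Storage interface above.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
}

class MemoryStore implements KeyValueStore {
  private store = new Map<string, string>();

  async get(key: string) {
    return this.store.get(key) ?? null;
  }

  async set(key: string, value: string) {
    this.store.set(key, value);
  }

  async delete(key: string) {
    this.store.delete(key);
  }
}

// Hypothetical helper: return the stored value, or compute and store it.
async function getOrCompute(
  storage: KeyValueStore,
  key: string,
  compute: () => Promise<string>,
): Promise<string> {
  const cached = await storage.get(key);
  if (cached !== null) return cached;

  const fresh = await compute();
  await storage.set(key, fresh);
  return fresh;
}
```

Passing a Redis-backed implementation later requires no change to `getOrCompute` or to anything that calls it.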

The Inline-Then-Extract Pattern

Start with everything inline:

// Version 1: Inline
export async function createUser(email: string, password: string) {
  // Validate
  if (!email.includes("@")) throw new Error("Invalid email");
  if (password.length < 8) throw new Error("Password too short");
 
  // Hash password
  const hashedPassword = await bcrypt.hash(password, 10);
 
  // Create user
  const user = await db.user.create({
    data: { email, password: hashedPassword },
  });
 
  // Send welcome email
  await smtp.send({
    to: email,
    subject: "Welcome!",
    body: "Thanks for signing up",
  });
 
  return user;
}

Extract only when you need reuse:

// Version 2: Extract when needed
import { validateEmail, validatePassword } from "./validators";
import { hashPassword } from "./crypto";
import { sendWelcomeEmail } from "./emails";
 
export async function createUser(email: string, password: string) {
  validateEmail(email);
  validatePassword(password);
 
  const hashedPassword = await hashPassword(password);
  const user = await db.user.create({
    data: { email, password: hashedPassword },
  });
 
  await sendWelcomeEmail(user.email);
  return user;
}

Don't extract prematurely. Extract when you feel the pain.
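
For reference, the extracted validators don't need to be clever. A minimal sketch of what `./validators` might contain, assuming it simply carries over the two inline checks from Version 1 unchanged:

```typescript
// Hypothetical contents of ./validators; the rules mirror the inline checks.
function validateEmail(email: string): void {
  if (!email.includes("@")) throw new Error("Invalid email");
}

function validatePassword(password: string): void {
  if (password.length < 8) throw new Error("Password too short");
}
```

Moving the checks without changing them keeps the extraction a pure refactor. Tightening the rules can happen later as its own change.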

Warning Signs You're Overcomplicating

Watch for these red flags:

The "We Might Need" Trap

// ❌ Bad
"We might need to support multiple payment providers";
// So you build a complex abstraction layer on day one
 
// ✅ Good
"We use Stripe now, we'll abstract if we add a second provider";

The "Industry Standard" Trap

// ❌ Bad
"All serious companies use microservices";
 
// ✅ Good
"Shopify still runs mostly on a monolith at billions in revenue. We're at $100K.";

The "Can't Change Later" Trap

// ❌ Bad
"If we don't use microservices now, we'll never be able to scale";
 
// ✅ Good
"We can migrate to microservices when we have evidence we need to";

The "Learning Exercise" Trap

// ❌ Bad (on production code)
"Let's use this new framework so I can learn it";
 
// ✅ Good
"Let's use boring, proven tech for the product and experiment on side projects";

Companies That Followed Gall's Law

Instagram

Started: Python/Django monolith
Scale: 30M users before significant architecture changes
Lesson: Simple systems can scale surprisingly far

Shopify

Started: Rails monolith
Today: Still mostly a monolith at billions in revenue
Lesson: Modular monoliths scale

Stack Overflow

Started: ASP.NET monolith
Today: Still a monolith serving billions of requests
Lesson: Vertical scaling works

Basecamp (37signals)

Started: Rails monolith
Today: Still a monolith, powers multiple products
Lesson: Simple architectures let small teams build big products

The Evolution Checklist

Here's how I evolve systems:

Phase One: Proof of Concept

  • Single file if possible
  • SQLite or in-memory data
  • No abstractions
  • Ship in days

Phase Two: Working Product

  • Organized folders
  • Postgres or production database
  • Basic error handling
  • Ship in weeks

Phase Three: Growing Product

  • Feature modules
  • Service layer
  • Comprehensive tests
  • Monitoring basics

Phase Four: Scaling Product

  • Extract services with evidence
  • Add queue for async work
  • Add cache for hot paths
  • Advanced monitoring

Each phase builds on the previous one. Never skip.

My Current Project: A Case Study

I'm building a SaaS for developer analytics. Here's my roadmap:

Month One (Current):

// Single Next.js app
// Postgres database
// No cache
// No queue
// Vercel deployment
// 100 users

Month Three (If growth happens):

// Still monolith
// Add Redis for sessions
// Add simple background jobs
// 1,000 users

Month Six (If still growing):

// Modular monolith
// Dedicated worker processes
// CDN for assets
// 10,000 users

Month Twelve (If hitting limits):

// Consider extracting heavy services
// Only if metrics show need
// 50,000+ users

Notice: Each step is triggered by actual need, not speculation.

Practical Exercises

Try this on your next project:

Exercise One: The Inline Challenge

Force yourself to write inline code for the first version:

// Do this first
export async function processOrder(orderId: string) {
  // Get order
  const order = await db.order.findUnique({ where: { id: orderId } });
  if (!order) throw new Error("Order not found");
 
  // Charge customer
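  const charge = await stripe.charges.create({
    amount: order.total, // in the smallest currency unit (e.g. cents)
    currency: "usd", // required by Stripe; assumes USD orders
    customer: order.customerId,
  });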
 
  // Update order
  await db.order.update({
    where: { id: orderId },
    data: { status: "paid", chargeId: charge.id },
  });
 
  // Send confirmation
  await sendEmail(order.customerEmail, "Order confirmed!");
 
  return order;
}
 
// Resist the urge to create OrderService, PaymentService, EmailService
// Extract only when you have multiple callers

Exercise Two: The Deletion Challenge

Go through your codebase and delete abstractions you added "just in case":

// Can you delete this?
interface PaymentProvider {
  charge(amount: number): Promise<Charge>;
}
 
class StripeProvider implements PaymentProvider { ... }
class PayPalProvider implements PaymentProvider { ... }
 
// If you only use Stripe, delete the abstraction
// Add it back when you add a second provider

Exercise Three: The Measurement Challenge

Before adding complexity, measure:

// Add this first
import { performance } from "perf_hooks";
 
const start = performance.now();
await expensiveOperation();
const end = performance.now();
 
console.log(`Operation took ${end - start}ms`);
 
// Only optimize if it's actually slow
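
A single timing is noisy, and the P95 figure I used earlier needs many samples. Here's a small sketch of a nearest-rank percentile you could run over collected timings. The `percentile` helper is hypothetical and the sample latencies are made up for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("No samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");

  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Made-up request latencies in milliseconds.
const latenciesMs = [120, 95, 180, 210, 3000, 150, 130, 110, 2800, 160];

const p50 = percentile(latenciesMs, 50); // 150
const p95 = percentile(latenciesMs, 95); // 3000
const targetMs = 500;
const meetsTarget = p95 <= targetMs; // false
```

The median here looks healthy while the P95 is six times over target, which is exactly the kind of evidence that justifies a closer look.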

Conclusion: Embrace Simplicity

Gall's Law isn't just about architecture. It's a mindset:

  • Start small: Ship the simplest thing that could work
  • Measure: Use data, not assumptions
  • Evolve: Add complexity only when you feel the pain
  • Trust the process: Simple systems become complex naturally

The best developers I know aren't the ones who build the most complex systems. They're the ones who build the simplest system that solves the problem.

Remember: A working simple system beats a theoretical complex one every time.

Your users don't care about your architecture. They care that your product works.

Start simple. Stay simple as long as possible. Evolve when you must.


Further Reading:

  • "Systemantics" by John Gall
  • "A Philosophy of Software Design" by John Ousterhout
  • "The Pragmatic Programmer" by Hunt and Thomas