Gall's Law in Software Development: Why Simple Systems Win
Understanding why complex systems that work always evolved from simple systems that worked—and why you can't skip that evolution.
I've watched countless projects fail because developers tried to build the "perfect" system from day one. Then I discovered Gall's Law, and everything clicked.
What is Gall's Law?
John Gall, in his 1975 book "Systemantics," stated:
"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system."
Let me show you why this matters more than any design pattern you'll ever learn.
The Microservices Trap: A Real Story
A startup I consulted for wanted to "do things right from the start." Their plan for a basic task management app:
Planned Architecture (Day One):
├── User Service (Go)
├── Task Service (Node.js)
├── Notification Service (Python)
├── Analytics Service (Rust)
├── API Gateway (Kong)
├── Message Queue (RabbitMQ)
├── Service Mesh (Istio)
├── Kubernetes Cluster
└── Distributed Tracing (Jaeger)
Six months later: They had zero users, mounting AWS bills, and a team burned out from dealing with distributed systems complexity.
What actually worked:
// Week one: Simple Next.js app
// app/api/tasks/route.ts
import { prisma } from "@/lib/db";
export async function GET(request: Request) {
  const tasks = await prisma.task.findMany();
  return Response.json(tasks);
}
export async function POST(request: Request) {
  const data = await request.json();
  const task = await prisma.task.create({ data });
  return Response.json(task);
}
They shipped in two weeks. Got real users. Found actual problems. Then evolved.
Why We Overengineer
Before diving into solutions, let's understand why we fall into this trap:
Resume-Driven Development:
- "I need Kafka on my CV"
- "Everyone's using Kubernetes"
- "This will make me look senior"
The Future-Proofing Fallacy:
- "We'll need to scale to millions"
- "What if we need multiple databases?"
- "Better build it right the first time"
The Hacker News Effect:
- Seeing Netflix's architecture and thinking you need the same
- Reading about problems you don't have
- Solutions looking for problems
I've been guilty of all three. Here's what I learned.
The Simple System Ladder
Instead of building complexity upfront, I now follow this progression:
Level One: Monolith with SQLite
Start here. Yes, really.
// lib/db.ts
import Database from "better-sqlite3";
const db = new Database("app.db");
export function getTasks(userId: string) {
  return db.prepare("SELECT * FROM tasks WHERE user_id = ?").all(userId);
}
export function createTask(userId: string, title: string) {
  return db
    .prepare("INSERT INTO tasks (user_id, title) VALUES (?, ?)")
    .run(userId, title);
}
When to stay here:
- Fewer than 1,000 users
- Fewer than 100,000 database rows
- Single developer or small team
- MVP or proof of concept
Real talk: SQLite can handle millions of rows. Your startup probably won't get there.
Level Two: Monolith with Postgres
Migrate when you actually need it:
// lib/db.ts
import { Pool } from "pg";
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
});
export async function getTasks(userId: string) {
  const result = await pool.query("SELECT * FROM tasks WHERE user_id = $1", [
    userId,
  ]);
  return result.rows;
}
When to migrate:
- Multiple servers need database access
- You need better tooling or ecosystem
- You're actually experiencing SQLite limitations
- Team is comfortable with Postgres
Don't migrate because:
- Someone said "SQLite isn't production-ready"
- You read a blog post
- It feels more "professional"
Level Three: Modular Monolith
Break internal concerns into modules, not services:
// features/tasks/service.ts
export class TaskService {
  constructor(private db: Database) {}
  async getTasks(userId: string) {
    return this.db.task.findMany({ where: { userId } });
  }
  async createTask(userId: string, data: CreateTaskDto) {
    // Validation
    // Business logic
    // Database operation
    return this.db.task.create({ data: { ...data, userId } });
  }
}
// features/notifications/service.ts
export class NotificationService {
  async notify(userId: string, message: string) {
    // Send notification
  }
}
// features/tasks/api.ts
import { TaskService } from "./service";
import { NotificationService } from "@/features/notifications/service";
// Assumes the Prisma client exported from lib/db satisfies the Database type
import { prisma } from "@/lib/db";
// The service files export classes, so the instances are created here
const taskService = new TaskService(prisma);
const notificationService = new NotificationService();
export async function createTask(userId: string, data: CreateTaskDto) {
  const task = await taskService.createTask(userId, data);
  await notificationService.notify(userId, "Task created!");
  return task;
}
Benefits:
- Clear boundaries
- Easy to test
- Simple to refactor
- One deployment
- No network calls
When to use:
- Multiple developers
- Growing codebase
- Need clear separation
- Not ready for microservices
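If you want the modules to know even less about each other, a small in-process event bus is one option. This is a minimal sketch, not something from the project above: `TaskEvents`, `TaskCreated`, and the `sent` array (standing in for a real notification sender) are hypothetical names. The tasks module emits an event, the notifications module subscribes, and everything still runs in one process with no network calls.

```typescript
// Hypothetical event type shared between the two modules.
type TaskCreated = { userId: string; taskId: string };
type Handler = (event: TaskCreated) => void;

// A tiny in-process publisher: handlers run synchronously, in order.
class TaskEvents {
  private handlers: Handler[] = [];

  onTaskCreated(handler: Handler): void {
    this.handlers.push(handler);
  }

  emitTaskCreated(event: TaskCreated): void {
    for (const handler of this.handlers) {
      handler(event);
    }
  }
}

const taskEvents = new TaskEvents();

// "notifications" module: subscribes, never imports the tasks module.
// `sent` stands in for an actual email or push sender.
const sent: string[] = [];
taskEvents.onTaskCreated((event) => {
  sent.push(`notify ${event.userId}: task ${event.taskId} created`);
});

// "tasks" module: emits, never imports the notifications module.
let nextId = 1;
function createTask(userId: string, title: string) {
  const task = { id: `task-${nextId++}`, userId, title };
  taskEvents.emitTaskCreated({ userId, taskId: task.id });
  return task;
}
```

Calling `createTask("u1", "Write docs")` adds one entry to `sent`. If notifications ever do need to become a separate service, the subscriber is the only piece that has to move.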
Level Four: Extract Critical Services
Only when you have evidence:
// When notifications are slowing down task creation...
// tasks-service/src/api/tasks.ts
export async function createTask(userId: string, data: CreateTaskDto) {
  // Attach the owner, as in the Level Three version
  const task = await db.task.create({ data: { ...data, userId } });
  // Queue notification instead of blocking
  await queue.publish("notifications", {
    type: "task.created",
    userId,
    taskId: task.id,
  });
  return task;
}
// notification-service/src/worker.ts
queue.subscribe("notifications", async (message) => {
  await sendEmail(message.userId, message.taskId);
});
When to extract:
- Service has different scaling needs
- Different deployment cycles needed
- Performance bottleneck identified
- Team size justifies complexity
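"Performance bottleneck identified" should mean you have numbers. Here is a minimal sketch of collecting them inside the monolith, assuming nothing beyond the standard library: `timed`, `timings`, and `percentile` are hypothetical helpers, and the percentile uses the simple nearest-rank method. It uses `Date.now()` to stay dependency-free; `performance.now()` would give finer resolution.

```typescript
// Recorded durations in milliseconds, keyed by operation name.
const timings = new Map<string, number[]>();

// Run a synchronous function and record how long it took.
function timed<T>(name: string, fn: () => T): T {
  const start = Date.now();
  try {
    return fn();
  } finally {
    const samples = timings.get(name) ?? [];
    samples.push(Date.now() - start);
    timings.set(name, samples);
  }
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples recorded");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank - 1, 0)];
}
```

With a few hundred samples per operation, `percentile(timings.get("createTask") ?? [], 95)` gives a P95 you can compare against a target before deciding anything needs to be extracted.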
Real-World Evolution: How Stripe Did It
Stripe didn't start with microservices. The snippets below are my own illustration of that kind of evolution, not Stripe's actual code:
2010 - Simple Rails App:
# One file did everything
class ChargesController < ApplicationController
  def create
    charge = Stripe::Charge.create(
      amount: params[:amount],
      currency: 'usd',
      source: params[:token]
    )
    render json: charge
  end
end
2012 - Modular Monolith:
# Separated concerns, same codebase
module Billing
  class ChargeService
    def create_charge(params)
      # Logic here
    end
  end
end
module Fraud
  class DetectionService
    def check_transaction(charge)
      # Logic here
    end
  end
end
2014 - Strategic Service Extraction: Only after identifying bottlenecks, they extracted specific services like fraud detection and webhooks.
Key insight: They waited until they had real data about what needed to scale independently.
The Cost of Complexity
Let me quantify what premature complexity costs:
Development Speed
Simple system:
// Change takes 5 minutes
export async function createTask(data: CreateTaskDto) {
  return db.task.create({ data });
}
Complex system:
// Same change takes 2 hours
// 1. Update task service
// 2. Update API gateway config
// 3. Update message schemas
// 4. Update consumer services
// 5. Deploy five services
// 6. Monitor distributed traces
// 7. Debug cross-service issues
Cognitive Load
Simple system:
- One codebase to understand
- Straightforward debugging
- Clear error messages
- Local testing works
Complex system:
- Multiple repos to navigate
- Distributed debugging
- Network errors vs. logic errors
- Requires full infrastructure to test
Infrastructure Costs
My SaaS app comparison:
Simple (Monolith, single deployment):
Vercel Hobby: $0/month
Postgres: $25/month
Total: $25/month for 10,000 users
Complex (Microservices):
Kubernetes cluster: $200/month
RabbitMQ: $50/month
Redis: $30/month
Monitoring: $50/month
Service mesh: $100/month
Total: $430/month for 10,000 users
ROI of complexity: Negative until you hit scale.
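To make the gap concrete, here is the same arithmetic as a small sketch. The line items are the figures listed above; `monthlyTotal` and the object names are just illustrative.

```typescript
// Monthly line items in USD, copied from the comparison above.
const simpleStack: Record<string, number> = {
  vercelHobby: 0,
  postgres: 25,
};
const complexStack: Record<string, number> = {
  kubernetes: 200,
  rabbitmq: 50,
  redis: 30,
  monitoring: 50,
  serviceMesh: 100,
};

function monthlyTotal(items: Record<string, number>): number {
  return Object.values(items).reduce((sum, cost) => sum + cost, 0);
}

const users = 10_000;
const simpleTotal = monthlyTotal(simpleStack); // 25
const complexTotal = monthlyTotal(complexStack); // 430

// Cost per user per month at the same user count.
const simplePerUser = simpleTotal / users; // 0.0025
const complexPerUser = complexTotal / users; // 0.043

// Extra spend per year for the same 10,000 users.
const extraPerYear = (complexTotal - simpleTotal) * 12; // 4860
```

That is roughly 17x the monthly bill, or $4,860 more per year, before counting the engineering time spent operating the extra pieces.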
How to Apply Gall's Law Practically
Here's my decision framework:
Starting a New Project
// ✅ Do this first
const app = {
  framework: "Next.js",
  database: "SQLite or Postgres",
  hosting: "Vercel or single VPS",
  architecture: "Monolith",
  queue: "None (add when needed)",
  cache: "None (add when needed)",
};
// ❌ Don't do this first
const overengineered = {
  services: ["user", "auth", "api", "worker", "analytics"],
  databases: ["Postgres", "Redis", "MongoDB"],
  infrastructure: ["Kubernetes", "Istio", "Kafka"],
  monitoring: ["Prometheus", "Grafana", "Jaeger"],
};
When You Actually Need Complexity
Ask these questions before adding complexity:
Is this a real problem or a hypothetical one?
// ❌ Hypothetical
"We might need to scale to millions of users";
// ✅ Real
"We have 500,000 users and the database is at 80% CPU";
Have you measured it?
// ❌ No data
"This endpoint feels slow";
// ✅ Has data
"P95 latency is 3s, target is 500ms, profiler shows DB query takes 2.8s";
Can you solve it simply first?
// ❌ Jump to microservices
"Let's extract this to a separate service";
// ✅ Try simple solutions
"Let's add an index / cache this query / optimize the query";
Will it reduce complexity elsewhere?
// ✅ Good reason
"Extracting email service lets us deploy notifications without touching payments";
// ❌ Bad reason
"Microservices are best practice";
Patterns for Simple Systems
Here are patterns I use to keep systems simple:
The Feature Flag Pattern
Add complexity behind flags:
// config/features.ts
export const features = {
  useCache: process.env.USE_CACHE === "true",
  useQueue: process.env.USE_QUEUE === "true",
  useEmailService: process.env.USE_EMAIL_SERVICE === "true",
};
// lib/email.ts
import { features } from "@/config/features";
export async function sendEmail(to: string, subject: string, body: string) {
  if (features.useEmailService) {
    // Call external service
    return await emailService.send({ to, subject, body });
  }
  // Simple implementation
  return await smtp.send({ to, subject, body });
}
Benefits:
- Test complex systems in production
- Easy rollback
- Gradual migration
- A/B test architectures
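Boolean env flags are all-or-nothing. For the gradual migration case, one common approach is a percentage rollout keyed on a stable hash of the user ID, so each user consistently sees the same path. This is a sketch under assumptions: `isEnabled` and `hashString` are hypothetical helpers, and the hash is an FNV-1a-style function over UTF-16 code units, chosen only because it is short and deterministic.

```typescript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function hashString(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// True if this user falls inside the rollout percentage for the flag.
// The flag name is part of the hash input so that different flags
// pick different subsets of users.
function isEnabled(flag: string, userId: string, rolloutPercent: number): boolean {
  const bucket = hashString(`${flag}:${userId}`) % 100; // 0..99
  return bucket < rolloutPercent;
}
```

A call such as `isEnabled("useEmailService", user.id, 10)` sends roughly a tenth of users down the new path. Raising the percentage only adds users, because each user's bucket never changes.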
The Adapter Pattern
Abstract complexity behind simple interfaces:
// lib/storage/interface.ts
export interface Storage {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
}
// lib/storage/memory.ts
export class MemoryStorage implements Storage {
  private store = new Map<string, string>();
  async get(key: string) {
    return this.store.get(key) ?? null;
  }
  async set(key: string, value: string) {
    this.store.set(key, value);
  }
  async delete(key: string) {
    this.store.delete(key);
  }
}
// lib/storage/redis.ts
export class RedisStorage implements Storage {
  async get(key: string) {
    return await redis.get(key);
  }
  async set(key: string, value: string) {
    await redis.set(key, value);
  }
  async delete(key: string) {
    await redis.del(key);
  }
}
// lib/storage/index.ts
export const storage: Storage =
  process.env.USE_REDIS === "true" ? new RedisStorage() : new MemoryStorage();
Start simple, swap when needed.
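The payoff of the interface is that callers never learn which backend they are talking to. As a sketch of what a caller might look like, here is a cache-aside helper written against the same three methods. `KeyValueStore` mirrors the `Storage` interface above (renamed here only to avoid clashing with the DOM's global `Storage` type in a single-file example), and `getOrCompute` is a hypothetical helper.

```typescript
// Same shape as the Storage interface above.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
}

// In-memory implementation, equivalent to MemoryStorage above.
class MemoryStore implements KeyValueStore {
  private store = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }
  async set(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }
  async delete(key: string): Promise<void> {
    this.store.delete(key);
  }
}

// Cache-aside: return the stored value if present, otherwise compute it,
// store it, and return it. Works with any KeyValueStore implementation.
async function getOrCompute(
  store: KeyValueStore,
  key: string,
  compute: () => Promise<string>,
): Promise<string> {
  const cached = await store.get(key);
  if (cached !== null) return cached;
  const fresh = await compute();
  await store.set(key, fresh);
  return fresh;
}
```

Swapping in a Redis-backed store later changes one constructor call; `getOrCompute` and everything that uses it stay untouched.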
The Inline-Then-Extract Pattern
Start with everything inline:
// Version 1: Inline
export async function createUser(email: string, password: string) {
  // Validate
  if (!email.includes("@")) throw new Error("Invalid email");
  if (password.length < 8) throw new Error("Password too short");
  // Hash password
  const hashedPassword = await bcrypt.hash(password, 10);
  // Create user
  const user = await db.user.create({
    data: { email, password: hashedPassword },
  });
  // Send welcome email
  await smtp.send({
    to: email,
    subject: "Welcome!",
    body: "Thanks for signing up",
  });
  return user;
}
Extract only when you need reuse:
// Version 2: Extract when needed
import { validateEmail, validatePassword } from "./validators";
import { hashPassword } from "./crypto";
import { sendWelcomeEmail } from "./emails";
export async function createUser(email: string, password: string) {
  validateEmail(email);
  validatePassword(password);
  const hashedPassword = await hashPassword(password);
  const user = await db.user.create({
    data: { email, password: hashedPassword },
  });
  await sendWelcomeEmail(user.email);
  return user;
}
Don't extract prematurely. Extract when you feel the pain.
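For completeness, here is roughly what the extracted validators might contain. This is a sketch assuming they keep exactly the checks from the inline version; in the real `./validators` module each function would be exported.

```typescript
// Same check as the inline version: deliberately naive, just "has an @".
function validateEmail(email: string): void {
  if (!email.includes("@")) throw new Error("Invalid email");
}

// Same check as the inline version: minimum length of 8 characters.
function validatePassword(password: string): void {
  if (password.length < 8) throw new Error("Password too short");
}
```

Nothing about the behavior changed. The extraction only moved two lines so that a second caller can reuse them, which is the only reason worth doing it.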
Warning Signs You're Overcomplicating
Watch for these red flags:
The "We Might Need" Trap
// ❌ Bad
"We might need to support multiple payment providers";
// So you build a complex abstraction layer on day one
// ✅ Good
"We use Stripe now, we'll abstract if we add a second provider";
The "Industry Standard" Trap
// ❌ Bad
"All serious companies use microservices";
// ✅ Good
"Shopify still runs mostly on a monolith at billions in revenue. We're at $100K.";
The "Can't Change Later" Trap
// ❌ Bad
"If we don't use microservices now, we'll never be able to scale";
// ✅ Good
"We can migrate to microservices when we have evidence we need to";
The "Learning Exercise" Trap
// ❌ Bad (on production code)
"Let's use this new framework so I can learn it";
// ✅ Good
"Let's use boring, proven tech for the product and experiment on side projects";
Companies That Followed Gall's Law
Started: Python/Django monolith
Scale: 30M users before significant architecture changes
Lesson: Simple systems can scale surprisingly far
Shopify
Started: Rails monolith
Today: Still mostly a monolith at billions in revenue
Lesson: Modular monoliths scale
Stack Overflow
Started: ASP.NET monolith
Today: Still a monolith serving billions of requests
Lesson: Vertical scaling works
Basecamp (37signals)
Started: Rails monolith
Today: Still a monolith, powers multiple products
Lesson: Simple architectures let small teams build big products
The Evolution Checklist
Here's how I evolve systems:
Phase One: Proof of Concept
- Single file if possible
- SQLite or in-memory data
- No abstractions
- Ship in days
Phase Two: Working Product
- Organized folders
- Postgres or production database
- Basic error handling
- Ship in weeks
Phase Three: Growing Product
- Feature modules
- Service layer
- Comprehensive tests
- Monitoring basics
Phase Four: Scaling Product
- Extract services with evidence
- Add queue for async work
- Add cache for hot paths
- Advanced monitoring
Each phase builds on the previous one. Never skip.
My Current Project: A Case Study
I'm building a SaaS for developer analytics. Here's my roadmap:
Month One (Current):
// Single Next.js app
// Postgres database
// No cache
// No queue
// Vercel deployment
// 100 users
Month Three (If growth happens):
// Still monolith
// Add Redis for sessions
// Add simple background jobs
// 1,000 users
Month Six (If still growing):
// Modular monolith
// Dedicated worker processes
// CDN for assets
// 10,000 users
Month Twelve (If hitting limits):
// Consider extracting heavy services
// Only if metrics show need
// 50,000+ users
Notice: Each step is triggered by actual need, not speculation.
Practical Exercises
Try this on your next project:
Exercise One: The Inline Challenge
Force yourself to write inline code for the first version:
// Do this first
export async function processOrder(orderId: string) {
  // Get order
  const order = await db.order.findUnique({ where: { id: orderId } });
  if (!order) throw new Error("Order not found");
  // Charge customer (the Charges API requires a currency; "usd" is assumed here)
  const charge = await stripe.charges.create({
    amount: order.total,
    currency: "usd",
    customer: order.customerId,
  });
  // Update order
  await db.order.update({
    where: { id: orderId },
    data: { status: "paid", chargeId: charge.id },
  });
  // Send confirmation
  await sendEmail(order.customerEmail, "Order confirmed!");
  return order;
}
// Resist the urge to create OrderService, PaymentService, EmailService
// Extract only when you have multiple callers
Exercise Two: The Deletion Challenge
Go through your codebase and delete abstractions you added "just in case":
// Can you delete this?
interface PaymentProvider {
  charge(amount: number): Promise<Charge>;
}
class StripeProvider implements PaymentProvider { ... }
class PayPalProvider implements PaymentProvider { ... }
// If you only use Stripe, delete the abstraction
// Add it back when you add a second provider
Exercise Three: The Measurement Challenge
Before adding complexity, measure:
// Add this first
import { performance } from "perf_hooks";
const start = performance.now();
await expensiveOperation();
const end = performance.now();
console.log(`Operation took ${end - start}ms`);
// Only optimize if it's actually slow
Conclusion: Embrace Simplicity
Gall's Law isn't just about architecture. It's a mindset:
- Start small: Ship the simplest thing that could work
- Measure: Use data, not assumptions
- Evolve: Add complexity only when you feel the pain
- Trust the process: Simple systems become complex naturally
The best developers I know aren't the ones who build the most complex systems. They're the ones who build the simplest system that solves the problem.
Remember: A working simple system beats a theoretical complex one every time.
Your users don't care about your architecture. They care that your product works.
Start simple. Stay simple as long as possible. Evolve when you must.
Further Reading:
- "Systemantics" by John Gall
- "A Philosophy of Software Design" by John Ousterhout
- "The Pragmatic Programmer" by Hunt and Thomas