Published Oct 5, 2024

Debugging and Mental Models

Most hard bugs aren't hard because of the code — they're hard because your mental model of the system is wrong. Here's how to fix that first.

Slate blue abstract pattern representing debugging

Most bugs worth worrying about aren’t discovered by reading the error message. The error message tells you where the system diverged from your expectations. The bug is usually somewhere in the gap between your mental model and what the system actually does.

Fixing the code without fixing the mental model is how you get the same bug twice.

The Two Kinds of Hard Bugs

Hard bugs come in two flavors:

  1. Novel bugs — the code does something you genuinely didn’t expect, often due to a language quirk, a library behavior, or an edge case in the data.
  2. Model bugs — the code does exactly what you wrote, but what you wrote was based on a wrong assumption about the system.

Model bugs are more common and more dangerous. They often look like novel bugs until you slow down and ask: what did I think was true here, and is it actually true?

A Common Example

Consider this JavaScript:

const user = await getUser(id);
if (user.role === "admin") {
  await grantAccess(resource);
}

If grantAccess is sometimes called when it shouldn’t be, the first instinct is to check the role field, the getUser logic, the ACL table. But the bug might be something simpler: getUser returns null for anonymous users, and null.role throws before the condition is evaluated. Your mental model said “getUser always returns a user.” The system disagreed.

A Debugging Process

When a bug resists the first two fixes, this process helps:

  1. Write down what you believe to be true. In plain text. “I believe getUser always returns a non-null value for any valid session.” Writing forces precision.
  2. Find the cheapest way to test each belief. A console.log, an assertion, a temporary check. Not a rewrite.
  3. Test beliefs in order of how wrong they’d feel. Start with the assumptions that would surprise you most. That’s where the bug usually is.
  4. Update your model before touching the code. Once you know which belief was wrong, write down the corrected version. Then fix the code to match.

The order matters. Many debugging sessions waste time because the code changes before the model is corrected, which introduces new assumptions on top of old ones.

Using Assertions as Documentation

One of the most underused tools in a debugging session is the assertion:

def process_payment(order):
    assert order is not None, "Expected a non-null order"
    assert order.status == "pending", f"Expected pending, got {order.status}"
    # ... rest of function

Assertions do two things at once: they check a belief at runtime, and they document what you assumed. If you remove the assertion after the bug is fixed, you’ve lost the documentation. Consider leaving them in — or converting them to proper error handling that carries the same message.

if order is None:
    raise ValueError("process_payment requires a non-null order")
if order.status != "pending":
    raise ValueError(f"Cannot process order in status: {order.status!r}")

This version gives future debuggers the same context, but it survives production.

Reading Stack Traces

A stack trace tells you the call sequence that led to an error, not the root cause. The error often appears at the bottom of a call chain, far from where the bad data entered the system.

Develop the habit of reading stack traces bottom-to-top:

  • The bottom is the proximate error — where the crash happened
  • The top is the origin — where the call chain started, often where the assumption was made

The bug usually lives closer to the top. The error message lives at the bottom. Most people start at the bottom and stay there.

The Mental Model Is the Bug

“A bug is not a code problem. It’s a communication failure between what you intended and what the computer did.”

When you find a model bug, resist the urge to just fix the code and move on. Ask:

  • How did this wrong assumption enter my model?
  • Was it undocumented behavior? A stale comment? An API I misread?
  • What would have caught this earlier? A type, an assertion, a test?

The goal isn’t just to fix this bug — it’s to build a process that catches the same class of bug before it reaches production. That usually means better types, better tests, or better documentation of the surprising parts of the system.

Debugging is ultimately a practice in epistemology. You’re constantly asking: what do I know, how do I know it, and how might I be wrong?