Logging, metrics, and tracing

You can't debug what you can't see, and once code is in production you can't step through it with a debugger. Observability is building in the means to understand a running system from the outside. The debugging discipline lesson said to make systems explain themselves — this is how.

The three pillars

Each answers a different question:

Logs — discrete, timestamped events. "What happened, and when?" The detailed narrative of individual events.
Metrics — numbers aggregated over time. "How much, how often, how fast?" Request rate, error rate, latency, memory. Cheap to store, great for trends and dashboards.
Traces — the path of a single request as it flows across functions and services. "Where did the time go in this one request?" Essential once a system spans multiple services.

Logs tell the story, metrics show the trends, traces follow one request through the whole system. You want all three.

Structured logs

A log you can't search is nearly useless at scale. Prefer structured logs — key/value or JSON — over free-form prose:

{"level":"error","event":"payment_failed","user_id":42,"amount":19.99,"reason":"card_declined"}

Now you can query "all payment_failed events for user 42." Log the context that makes an event findable and actionable; skip noise. And never log secrets or personal data you wouldn't want leaked.

Observability is a design choice

The key shift: observability isn't bolted on after an outage — it's designed in before you need it. Instrument the important flows (the systems-thinking inputs, outputs, and side effects) as you build them. A system that explains itself turns a 2am incident from guesswork into reading the evidence.

A practical signal of good observability: when something breaks, can you tell what and where from your dashboards and logs alone, without adding new logging and redeploying? If you have to redeploy to debug, you instrumented too little.

Where to go next

Collecting signals is step one; being told when they go wrong is step two. Next: monitoring and alerting.

Finished reading? Mark it complete to track your progress.

Logging, metrics, and tracing

The three pillars

Structured logs

Observability is a design choice

Where to go next

On this page