Think twice before logging

Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedIn

balloon_irrelevant

One property of legacy code is inflation by irrelevant logging statements. Not only does this increase the size of a bulging code base, I’d also argue that it’s dead wrong.

Quite recently I’ve had the honor of making acquaintance with a piece of code that looked roughly like this:

if (LOGGER.isLoggable(Level.FINEST)) {
    LOGGER.log(Level.FINEST, "foo is now", foo.getValue());
}
boolean result = doSomeActualWork(foo);
if (LOGGER.isLoggable(Level.FINER)) {
    LOGGER.exiting(this.getClass().getName(), "bar", result);
}


This was the end of a method. The beginning looked quite similar, and so did thousands of other methods in that code base. It was most definitively not my first encounter with this “pattern”, and it certainly won’t be the last, but this time I decided to spend some time explaining why I think this is a bad thing to do.

The issue

Why is this problematic? Logging things should indicate a considerate programmer who tries to be defensive and friendly by helping you to get a bunch of randomly selected values in your face, should you decide to turn on/up the logging.

The issue has to do with noise, relevance, and cyclomatic complexity. Look at the seven lines of code above. Out of seven lines, only one is important. That’s the line you’ll read when changing the code. That’s the line you’ll need to understand … And that’s the line you’ll need to filter out from six totally irrelevant lines. The illustration at the beginning of this post shows what the code looks like if you squint when reading it. Legacy code bases already suffer from misused and overused design patterns, superfluous comments, and learning on the job code, so you don’t need to add more lines of code. Your predecessors have already done it for you.

Then there’s the increase in cyclomatic complexity. See the two if statements? They turn a harmless piece of code into something that has two conditional statements that your brain needs to process.

–Is the log level set to finest? Jolly good! Let’s log the value of foo just in case somebody will need it.

–Are we logging at the finer level now? Exciting! Let’s tell everybody that we’re leaving the method.

Now you have introduced conditional accidental complexity.

The reasons

Why is code like this created? I’m just guessing, but here are my theories:

A genuine will to help – People want to help. The programmer who wrote this code wanted to help future readers by logging things that were important to him.

Lack of unit tests – The inability to verify assumptions about the code during development often translates to the second best thingverification at runtime.

Poor controllability and observability – Some systems are so (accidently) complex to set up and monitor that even debugging won’t help. This is yet another reason for sprinkling log statements throughout the code.

Ignorance – Then there’s the issue of actually understanding how a logging framework actually works. It so happens that logging frameworks check the log level themselves in the output methods, so you don’t have to duplicate the effort. The if statements in the example above are only necessary if the cost of printing the object is so high that it would impact performance.

Groupthink – It’s really safe to repeat patterns seen frequently enough throughout the code base.

The exceptions

At this point, I wish I could just say: “Don’t ever use logging. Write your unit tests and start trusting your code!” But there are actual reasons in favor of logging. I can think of two, and I invite people to comment on this post and give me some more!

  1. Logging used with a purpose – This kind of logging is part of someone’s requirements. Either the customer or another stakeholder wants certain kind of information, which is conveniently available through logging. It may be some kind of transaction history used for auditing or performance figures for further analysis. In such cases, the logged information should be part of the requirements, and the developers should write tests to verify that it works.
  2. Logging required by operations – This is actually a variant of the reason above. If you’ve worked in operations you know that there are cases in which you want something logged. You may want to get a feeling for how often something’s used, or you may suspect that something’s broken, but you can’t reproduce it. In such cases, having things logged is quite helpful. The difference from the first case is that the logging may be temporary and won’t require its own tests.

Summary

Don’t fly on auto pilot and add tons of log statements to your applications because everyone else does. It just makes the code unreadable and harder to maintain. If you want to be certain about your code, write tests! Save the logging for when it matters: If it’s an actual requirement or if it will help operations.

3 Comments

  • 1
    Mattias
    2016-02-02 - 17:01 | Permalink

    Nice post about logging. I admit I have made similar logging mistakes myself 🙂

    Hope in the future we can see a post about recommended logging patterns.

  • 2
    Mason Deaver
    2016-11-25 - 04:05 | Permalink

    I know another exception to this advice, in a real-time system I’ve worked with for years: We can dynamically turn on logging to help debug complex problems that may involve multiple processes in our real-time system.

    Our inserted log statements look something like this (C code):

    if( bDebug == TRUE )
    {
    // Call logging function
    }

    These are sprinkled throughout our code. The logged information is written to a file on our target computer. “bDebug” is a global variable. Most of the time this variable is set to FALSE and so the impact on performance is minimal. When we enable debug on the fly, while a process is running, we collect what amounts to a programmer’s trace through the code.

    Why not verify the working code ahead of time via automated tests? We admittedly don’t have enough automated tests in place, but for a real-time system like ours it has seemed difficult to get enough code coverage via automated tests. It seems a lot of automated test frameworks are built for Web-based applications that are not using the C language. [Or am I wrong here? What automated test systems and techniques are available for real-time systems that are not running on popular OSs like Linux or Windows? (Ours runs on QNX)]

    Why not monitor a process in an IDE’s debugger? This is sometimes helpful but one can get mired in the code. Our debug trace in effect provides a story of how an application is running in real-time. And if we need debugging information from someone in the field, they cannot run an IDE, instead they can capture a debug trace and send that back to us.

    We understand enabling debug trace like this can sometimes affect performance or hide a problem. Nevertheless, we have found this capability to be extremely helpful in understanding how our system’s processes work in real-time, and for help in debugging problems in the field.

    • 3
      2016-11-26 - 14:21 | Permalink

      Dear Mason,
      Thanks for a well-written and informative reply. I appreciate the time you have taken to write down your thoughts. Please remember, context is always king.
      Without knowing the specific challenges involved in developing something for QNX and what your definition of “realtime” is, I’d still encourage you to challenge your team’s assumptions about testability.
      If the code is written in C, you will come a long way using unit testing framework. Again, I do most of my C programming on Raspberry Pis, so I can use all unit testing frameworks available in Linux (Check being my current favorite), but there are options for QNX, or so Google claims.

      I’d say that your challenge seems to fall in the category of generic testability: inadequate contrallability and observability, which is why log statements work for you. Is there a way to challenge that? Can you refactor your code to make better use of unit tests? Or maybe your realtime also involves parallelism? In that case, maybe decreasing the number of things happening simultaneously can help. Maybe…

      Anyway, I know that you face

  • Leave a Reply

    Your email address will not be published. Required fields are marked *