As applications become more complex, having good logs can be very useful, not only when debugging but also to provide insight for application issues/performance.
The Python standard library comes with a logging module that provides most of the basic logging features. By setting it up correctly, a log message can bring a lot of useful information about when and where the log is fired as well as the log context, such as the running process/thread.
Despite the advantages, the logging module is often overlooked as it takes some time to set up properly and, although complete, in my opinion, the official logging doc at https://docs.python.org/3/library/logging.html does not really give logging best practices or highlight some logging surprises.
This Python logging tutorial is not meant to be a complete document on the logging module but rather a “getting started” guide that introduces some logging concepts as well as some “gotchas” to watch out for. The post will end with best practices and contain some pointers to more advanced logging topics.
Please note that all code snippets in the post suppose that you have already imported the logging module:
Concepts for Python Logging
This section gives an overview on some concepts that are often encountered in the logging module.
Python Logging Levels
The log level corresponds to the “importance” a log is given: an “error” log should be more urgent then than the “warn” log, whereas a “debug” log should be useful only when debugging the application.
There are six log levels in Python; each level is associated with an integer that indicates the log severity: NOTSET=0, DEBUG=10, INFO=20, WARN=30, ERROR=40, and CRITICAL=50.
All the levels are rather straightforward (DEBUG < INFO < WARN ) except NOTSET, whose particularity will be addressed next.
Python Logging Formatting
The log formatter basically enriches a log message by adding context information to it. It can be useful to know when the log is sent, where (Python file, line number, method, etc.), and additional context such as the thread and process (can be extremely useful when debugging a multithreaded application).
For example, when a log “hello world” is sent through a log formatter:
"%(asctime)s — %(name)s — %(levelname)s — %(funcName)s:%(lineno)d — %(message)s"
it will become
2018-02-07 19:47:41,864 - a.b.c - WARNING - <module>:1 - hello world
Python Logging Handler
The log handler is the component that effectively writes/displays a log: Display it in the console (via StreamHandler), in a file (via FileHandler), or even by sending you an email via SMTPHandler, etc.
Each log handler has 2 important fields:
- A formatter which adds context information to a log.
- A log level that filters out logs whose levels are inferior. So a log handler with the INFO level will not handle DEBUG logs.
The standard library provides a handful of handlers that should be enough for common use cases: https://docs.python.org/3/library/logging.handlers.html#module-logging.handlers. The most common ones are StreamHandler and FileHandler:
console_handler = logging.StreamHandler() file_handler = logging.FileHandler("filename")
Logger is probably the one that will be used directly the most often in the code and which is also the most complicated. A new logger can be obtained by:
toto_logger = logging.getLogger("toto")
A logger has three main fields:
- Propagate: Decides whether a log should be propagated to the logger’s parent. By default, its value is True.
- A level: Like the log handler level, the logger level is used to filter out “less important” logs. Except, unlike the log handler, the level is only checked at the “child” logger; once the log is propagated to its parents, the level will not be checked. This is rather an un-intuitive behavior.
- Handlers: The list of handlers that a log will be sent to when it arrives to a logger. This allows a flexible log handling—for example, you can have a file log handler that logs all DEBUG logs and an email log handler that will only be used for CRITICAL logs. In this regard, the logger-handler relationship is similar to a publisher-consumer one: A log will be broadcast to all handlers once it passes the logger level check.
A logger is unique by name, meaning that if a logger with the name “toto” has been created, the consequent calls of
logging.getLogger("toto") will return the same object:
assert id(logging.getLogger("toto")) == id(logging.getLogger("toto"))
As you might have guessed, loggers have a hierarchy. On top of the hierarchy is the root logger, which can be accessed via logging.root. This logger is called when methods like
logging.debug() is used. By default, the root log level is WARN, so every log with lower level (for example via
logging.info("info")) will be ignored. Another particularity of the root logger is that its default handler will be created the first time a log with a level greater than WARN is logged. Using the root logger directly or indirectly via methods like
logging.debug() is generally not recommended.
By default, when a new logger is created, its parent will be set to the root logger:
lab = logging.getLogger("a.b") assert lab.parent == logging.root # lab's parent is indeed the root logger
However, the logger uses the “dot notation,” meaning that a logger with the name “a.b” will be the child of the logger “a.” However, this is only true if the logger “a” has been created, otherwise “ab” parent is still the root.
la = logging.getLogger("a") assert lab.parent == la # lab's parent is now la instead of root
When a logger decides whether a log should pass according to the level check (e.g., if the log level is lower than logger level, the log will be ignored), it uses its “effective level” instead of the actual level. The effective level is the same as logger level if the level is not NOTSET, i.e., all the values from DEBUG up to CRITICAL; however, if the logger level is NOTSET, then the effective level will be the first ancestor level that has a non-NOTSET level.
By default, a new logger has the NOTSET level, and as the root logger has a WARN level, the logger’s effective level will be WARN. So even if a new logger has some handlers attached, these handlers will not be called unless the log level exceeds WARN:
toto_logger = logging.getLogger("toto") assert toto_logger.level == logging.NOTSET # new logger has NOTSET level assert toto_logger.getEffectiveLevel() == logging.WARN # and its effective level is the root logger level, i.e. WARN # attach a console handler to toto_logger console_handler = logging.StreamHandler() toto_logger.addHandler(console_handler) toto_logger.debug("debug") # nothing is displayed as the log level DEBUG is smaller than toto effective level toto_logger.setLevel(logging.DEBUG) toto_logger.debug("debug message") # now you should see "debug message" on screen
By default, the logger level will be used to decide of the a log passes: If the log level is lower than logger level, the log will be ignored.
Python Logging Best Practices
The logging module is indeed very handy, but it contains some quirks that can cause long hours of headache for even the best Python developers. Here are the best practices for using this module in my opinion:
- Configure the root logger but never use it in your code—e.g., never call a function like
logging.info(), which actually calls the root logger behind the scene. If you want to catch error messages from libraries you use, make sure to configure the root logger to write to a file, for example, to make the debugging easier. By default, the root logger only outputs to
stderr, so the log can get lost easily.
- To use the logging, make sure to create a new logger by using
logging.getLogger(logger name). I usually use
__name__as the logger name, but anything can be used, as long as it is consistent. To add more handlers, I usually have a method that returns a logger (you can find the gist on https://gist.github.com/nguyenkims/e92df0f8bd49973f0c94bddf36ed7fd0).
import logging import sys from logging.handlers import TimedRotatingFileHandler FORMATTER = logging.Formatter("%(asctime)s — %(name)s — %(levelname)s — %(message)s") LOG_FILE = "my_app.log" def get_console_handler(): console_handler = logging.StreamHandler(sys.stdout) console_handler.setFormatter(FORMATTER) return console_handler def get_file_handler(): file_handler = TimedRotatingFileHandler(LOG_FILE, when='midnight') file_handler.setFormatter(FORMATTER) return file_handler def get_logger(logger_name): logger = logging.getLogger(logger_name) logger.setLevel(logging.DEBUG) # better to have too much log than not enough logger.addHandler(get_console_handler()) logger.addHandler(get_file_handler()) # with this pattern, it's rarely necessary to propagate the error up to parent logger.propagate = False return logger
After you can create a new logger and use it:
my_logger = get_logger("my module name") my_logger.debug("a debug message")
- Use RotatingFileHandler classes, such as the TimedRotatingFileHandler used in the example instead of FileHandler, as it will rotate the file for you automatically when the file reaches a size limit or do it everyday.
- Use tools like Sentry, Airbrake, Raygun, etc., to catch error logs automatically for you. This is especially useful in the context of a web app, where the log can be very verbose and error logs can get lost easily. Another advantage of using these tools is that you can get details about variable values in the error so you can know what URL triggers the error, which user is concerned, etc.
If you’re interested in more best practices, read The 10 Most Common Mistakes That Python Developers Make by fellow Toptaler Martin Chikilian.
Understanding the basics
A debugging tool is a tool that allows developers to spot errors and investigate problems. It can be a command line tool such as gdb, pdb (for Python) or can be embedded in an IDE (Visual Studio, idea suites, etc)
This is simply the output of the program, a better version of “print” in layman's terms. In the context of a web application, this log usually contains the incoming request information such as request path, request time, HTTP status, etc.
Logging is a module in the Python standard library that provides a richly-formatted log with a flexible filter and the possibility to redirect logs to other sources such as syslog or email.
The most popular python debugger is pdb. There are currently some projects working to improve the usability of pdb by providing tab-completion, color syntax, code browsing or remote debugging. These projects include ipdb, pudb, and wdb. There are also some IDE-specific debuggers such as the pydev engine or PTVS.