
Using structlog and journald

I was surprised to find practically nothing online regarding using structlog with journald. This seemed odd, given that both are strong proponents of structured logging.

Update 2023-12-08: After writing this up, I did find this structlog issue, which seems to indicate there is interest in journald integration.

To skip straight to my example implementation, jump to the last section of these notes.

If you want to see how I got there, carry on reading. In the following notes I experiment with connecting structlog up to journald before converging onto a good solution. The end result is a Python application that uses structlog, and can be configured to send logs either to the console or journald. This allows the application to be run as a command line tool, or as a service under systemd.

Table of contents §

The three attempts are:

  • 🤔 First attempt: logging to journald from structlog from inside the processor chain (works, but not ideal).
  • 😐 Second attempt: fixing the double logging issue from the first attempt (still not ideal).
  • 🙂 Third attempt: implementing a custom journald logger (much better).

First attempt: logging to journald from structlog from inside the processor chain §

Implementation §

Since I’m using systemd-python, somewhere I need to call journal.send() to send the logs to journald. The ideal place for that looks like it would be in a logger, which is created by a LoggerFactory. However I don’t quite have my head around how it all fits together, so let’s start by writing to journald from within the processor chain, since this looks like an easy way to get our hands on the log event.
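For reference, journal.send() takes the message text as its first argument and turns any extra keyword arguments into journald fields (which are conventionally uppercase). A minimal sketch of calling it directly, with a made-up field name:

from systemd import journal

# The first argument becomes the MESSAGE field; extra keyword arguments
# become additional journald fields. MY_FIELD is just an illustrative name.
journal.send("hello from python", MY_FIELD="some value")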

So first let’s create a JournaldRenderer. All it does is take the incoming event_dict, convert it into the right format for journald, then call journal.send() to write to journald. It then passes the event_dict on to the next processor in the chain unmodified. It’s a little bit sneaky, but it should work.

py_sysd_pkg/journald_renderer.py
from structlog.typing import WrappedLogger, EventDict
from systemd import journal

class JournaldRenderer:
    def __call__(self, logger: WrappedLogger, name: str, event_dict: EventDict):
        event_dict_upper = {
            key.upper(): value
            for key, value in event_dict.items()
            if key != "event"
        }
        journal.send(event_dict["event"], **event_dict_upper)
        return event_dict

Next we need to call structlog.configure() somewhere and add our JournaldRenderer into the processors chain.

This is a very helpful example of a fast, production-ready use of structlog.configure(). We’ll use it as the starting point, though instead of JSONRenderer we’ll use our JournaldRenderer, and instead of structlog.BytesLoggerFactory() as the logger factory we’ll use structlog.PrintLoggerFactory(), which means our output will get printed to the console.

py_sysd_pkg/app_logging.py
import structlog
import logging

from py_sysd_pkg.journald_renderer import JournaldRenderer

def logging_setup():
    structlog.configure(
        cache_logger_on_first_use=True,
        wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
        processors=[
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso", utc=True),
            structlog.processors.StackInfoRenderer(),
            JournaldRenderer(),
            structlog.dev.ConsoleRenderer(),
        ],
        logger_factory=structlog.PrintLoggerFactory(),
    )

Then, in the very basic main loop for this application (created using these steps), we’ll call logging_setup() and do a bit of logging.

py_sysd_pkg/service.py
import time
import structlog

from py_sysd_pkg.app_logging import logging_setup

def main():
    logging_setup()

    logger = structlog.get_logger()

    logger.info("starting up")
    try:
        while True:
            t = time.time()
            logger.info("example message", my_counter=t)
            time.sleep(1)
    except SystemExit:
        logger.info("shutting down")

Testing §

When run from the CLI, we can see the logs are formatted nicely by the logger which our PrintLoggerFactory() has created:

$ poetry run service-run
2023-11-30T13:32:38.498161Z [info     ] starting up
2023-11-30T13:32:38.498610Z [info     ] example message                my_counter=1701351158.4986005
2023-11-30T13:32:39.501073Z [info     ] example message                my_counter=1701351159.5009449
2023-11-30T13:32:40.503285Z [info     ] example message                my_counter=1701351160.5032387
2023-11-30T13:32:41.505230Z [info     ] example message                my_counter=1701351161.505109
2023-11-30T13:32:42.507554Z [info     ] example message                my_counter=1701351162.5074105
^CTraceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/sam/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 17, in main
    time.sleep(1)
KeyboardInterrupt

Then if we use journalctl, we can see it is logging to journald too:

Nov 30 13:33:03 sam-virtual-machine systemd[1]: Started Foo.
Nov 30 13:33:03 sam-virtual-machine python[50822]: starting up
Nov 30 13:33:03 sam-virtual-machine python[50822]: example message
Nov 30 13:33:03 sam-virtual-machine service-run[50822]: 2023-11-30T13:33:03.441396Z [info     ] starting up
Nov 30 13:33:03 sam-virtual-machine service-run[50822]: 2023-11-30T13:33:03.441794Z [info     ] example message                my_counter=1701351183.4417818
Nov 30 13:33:04 sam-virtual-machine python[50822]: example message
Nov 30 13:33:04 sam-virtual-machine service-run[50822]: 2023-11-30T13:33:04.443537Z [info     ] example message                my_counter=1701351184.443479
Nov 30 13:33:05 sam-virtual-machine python[50822]: example message
Nov 30 13:33:05 sam-virtual-machine service-run[50822]: 2023-11-30T13:33:05.445379Z [info     ] example message                my_counter=1701351185.4453325
Nov 30 13:33:06 sam-virtual-machine python[50822]: example message
Nov 30 13:33:06 sam-virtual-machine service-run[50822]: 2023-11-30T13:33:06.447454Z [info     ] example message                my_counter=1701351186.447395
Nov 30 13:33:07 sam-virtual-machine python[50822]: example message
Nov 30 13:33:07 sam-virtual-machine service-run[50822]: 2023-11-30T13:33:07.450095Z [info     ] example message                my_counter=1701351187.4500322

Notice anything wrong?

This is a good first pass, but it has a significant issue: everything gets logged to journald twice, and the second copy is unstructured. This is because we are explicitly writing to journald with journal.send(), and then journald is also helpfully collecting the output of the logger created by PrintLoggerFactory(), which is unstructured.

Second attempt: fixing the double logging §

Let’s build upon the previous work, but fix the double logging.

Implementation §

First, we’ll update JournaldRenderer to add a journald_only option. If true, this will raise structlog.DropEvent after sending the log to journald, which will silently drop that log entry.

This way the log entry will always go to journald, but we can toggle whether logs also make it to the console. This is not very efficient, but it gets us a step closer (structlog has a clever solution for efficient dropping based on log level, and its documentation explains why filtering in this manner is not especially efficient).
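To illustrate the difference: a filtering bound logger discards calls below the configured level before any processor runs, whereas an event dropped with DropEvent has already travelled through every processor that sits before the renderer. A small sketch:

import logging
import structlog

# Calls below INFO are discarded up front; no processor in the chain runs.
structlog.configure(
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)
structlog.get_logger().debug("dropped immediately, no processing cost")

# By contrast, an event dropped by raising structlog.DropEvent (as
# JournaldRenderer does when journald_only=True) has already been processed
# by everything earlier in the chain before it is thrown away.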

py_sysd_pkg/journald_renderer.py
import structlog
from structlog.typing import WrappedLogger, EventDict
from systemd import journal

class JournaldRenderer:
    def __init__(self, journald_only=False):
        self.journald_only = journald_only

    def __call__(self, logger: WrappedLogger, name: str, event_dict: EventDict):
        event_dict_upper = {
            key.upper(): value
            for key, value in event_dict.items()
            if key != "event"
        }
        journal.send(event_dict["event"], **event_dict_upper)

        if self.journald_only:
            raise structlog.DropEvent

        return event_dict

Next we need to update our configure code to pass the journald_only option through. This is just plumbing:

py_sysd_pkg/app_logging.py
import structlog
import logging

from py_sysd_pkg.journald_renderer import JournaldRenderer

def logging_setup(journald_only=False):
    structlog.configure(
        cache_logger_on_first_use=True,
        wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
        processors=[
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso", utc=True),
            structlog.processors.StackInfoRenderer(),
            JournaldRenderer(journald_only=journald_only),
            structlog.dev.ConsoleRenderer(),
        ],
        logger_factory=structlog.PrintLoggerFactory(),
    )

Then in our main application we can hard-code journald_only to True or False as needed. (We’ll make it configurable later):

py_sysd_pkg/service.py
import time
import structlog

from py_sysd_pkg.app_logging import logging_setup

def main():
    logging_setup(journald_only=True)

    logger = structlog.get_logger()

    logger.info("starting up")
    try:
        while True:
            t = time.time()
            logger.info("example message", my_counter=t)
            time.sleep(1)
    except SystemExit:
        logger.info("shutting down")

We’re now ready to test again.

Testing §

With journald_only=True, now nothing gets logged to the console:

$ poetry run service-run
^CTraceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/sam/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 17, in main
    time.sleep(1)
KeyboardInterrupt

This is good.

If we use journalctl to view the journald logs, we do see them:

Nov 30 13:50:32 sam-virtual-machine systemd[1]: Started Foo.
Nov 30 13:50:32 sam-virtual-machine sudo[52445]: pam_unix(sudo:session): session closed for user root
Nov 30 13:50:32 sam-virtual-machine python[52449]: starting up
Nov 30 13:50:32 sam-virtual-machine python[52449]: example message
Nov 30 13:50:33 sam-virtual-machine python[52449]: example message
Nov 30 13:50:34 sam-virtual-machine python[52449]: example message
Nov 30 13:50:35 sam-virtual-machine python[52449]: example message
Nov 30 13:50:36 sam-virtual-machine python[52449]: example message
Nov 30 13:50:37 sam-virtual-machine python[52449]: example message
Nov 30 13:50:38 sam-virtual-machine python[52449]: example message
Nov 30 13:50:39 sam-virtual-machine python[52449]: example message
Nov 30 13:50:40 sam-virtual-machine python[52449]: example message
Nov 30 13:50:41 sam-virtual-machine python[52449]: example message
Nov 30 13:50:41 sam-virtual-machine systemd[1]: Stopping Foo...
Nov 30 13:50:41 sam-virtual-machine systemd[1]: foo.service: Deactivated successfully.
Nov 30 13:50:41 sam-virtual-machine systemd[1]: Stopped Foo.

Also good.

We have fixed the double logging issue, but the implementation is not ideal:

  • The logging is done from within the processor chain, which still seems wrong.
  • We can only toggle whether or not logs go to the console. Logs always go to journald. It would be nice to make this more flexible.

Third attempt: implementing a custom journald logger §

Let’s try and do things properly this time.

Implementation §

Putting the calls to journal.send() inside JournaldRenderer within the list of processors feels like the wrong approach. The processors are intended to manipulate the log event, not perform I/O. That should be done by the logger.

It wasn’t clear to me at first, but after taking a closer look at this example of creating loggers manually, and reading that Logger Factories just need to be a callable, I felt more comfortable with the idea of creating my own logger factory, and putting the journal.send() in the resulting logger.

In short, the value for logger_factory is just a callable, so you can supply a logger factory instance with a __call__ method. This is what LoggerFactory does: it just calls logging.getLogger(name), and automatically populates the name for you. Alternatively, you can supply a logger class itself (i.e. pass in the class without the trailing parentheses). For example, you can set logger_factory=structlog.PrintLogger and the “factory” will create a new PrintLogger each time it is invoked.
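As a tiny (hypothetical) sketch of that idea, a logger factory is nothing more than a callable that returns a logger:

import structlog

class MyLoggerFactory:
    """A minimal logger factory: any callable returning a logger will do."""

    def __call__(self, *args):
        # structlog passes through any positional arguments given to
        # get_logger(); this toy factory just ignores them.
        return structlog.PrintLogger()

# Either of these works:
# structlog.configure(logger_factory=MyLoggerFactory())      # instance with __call__
# structlog.configure(logger_factory=structlog.PrintLogger)  # the class itself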

Create a custom journald logger §

Taking a closer look at PrintLogger, all we really need to do is inherit from that class and override the msg() method with one that calls journal.send() instead of print(). After doing so, we must also alias the other level methods (log, debug, info, warning, error and so on) to our new msg method, because they need overriding too. This mirrors what PrintLogger itself does:

py_sysd_pkg/journald_logger.py
from systemd import journal
import structlog

class JournaldLogger(structlog.PrintLogger):
    def msg(self, msg_text, msg_dict):
        journal.send(msg_text, **msg_dict)

    log = debug = info = warn = warning = msg
    fatal = failure = err = error = critical = exception = msg

Note how our msg() method expects two arguments:

  • msg_text: The “message” that gets logged to journald.
  • msg_dict: A dictionary of uppercase keys (i.e. custom journald fields) to values.
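These arguments are filled in by structlog itself: when the final processor returns a dict, structlog unpacks it as keyword arguments on the logger method. So once the renderer (shown next) returns {"msg_text": ..., "msg_dict": ...}, the resulting call is roughly equivalent to this (field values are illustrative):

from py_sysd_pkg.journald_logger import JournaldLogger

# Roughly what structlog ends up doing with the renderer's return value.
JournaldLogger().info(
    msg_text="example message",
    msg_dict={"LEVEL": "info", "MY_COUNTER": 1701356576.833088},
)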

Use a custom journald renderer §

We need to modify our JournaldRenderer to pass the log event in this format (and remove the calls to journal.send() from the renderer):

py_sysd_pkg/journald_renderer.py
from structlog.typing import WrappedLogger, EventDict

class JournaldRenderer:
    def __call__(self, logger: WrappedLogger, name: str, event_dict: EventDict):
        event_dict_upper = {
            key.upper(): value
            for key, value in event_dict.items()
            if key != "event"
        }

        return {"msg_text": event_dict["event"], "msg_dict": event_dict_upper}

The renderer is now much simpler. All it does is rename the event key to msg_text, convert the rest of the keys to uppercase (as expected by journald), and pack them into the msg_dict dictionary.
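As a concrete illustration (using values from the test run further down), an event passes through the renderer like this:

# Incoming event_dict, after the earlier processors have run:
event_dict = {
    "event": "example message",
    "level": "info",
    "timestamp": "2023-11-30T15:02:56.833203Z",
    "my_counter": 1701356576.833088,
}

# What JournaldRenderer returns for it:
rendered = {
    "msg_text": "example message",
    "msg_dict": {
        "LEVEL": "info",
        "TIMESTAMP": "2023-11-30T15:02:56.833203Z",
        "MY_COUNTER": 1701356576.833088,
    },
}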

Configure logging to use custom logger and renderer §

Now we need to change the logging setup to support journald logging or console logging:

  • If journald, use JournaldRenderer() as the last processor (to output {"msg_text": "...", "msg_dict": {...}}), and JournaldLogger as the logger factory to send the log entries to journald.
  • If console, use ConsoleRenderer() as the last processor (to output a string suitable for printing in the console), and structlog.PrintLoggerFactory() as the logger factory, which uses print to output the log entries.

Let’s replace the journald_only flag with the renderer option, which takes a string. Then we set the processors and logger factory as desired:

py_sysd_pkg/app_logging.py
import structlog
import logging

from py_sysd_pkg.journald_renderer import JournaldRenderer
from py_sysd_pkg.journald_logger import JournaldLogger

def logging_setup(renderer):
    processors = [
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso", utc=True),
        structlog.processors.StackInfoRenderer(),
    ]

    if renderer == "journald":
        processors.append(JournaldRenderer())
        logger_factory = JournaldLogger
    elif renderer == "console":
        processors.append(structlog.dev.ConsoleRenderer())
        logger_factory = structlog.PrintLoggerFactory()
    else:
        raise ValueError("Unknown renderer")

    structlog.configure(
        cache_logger_on_first_use=True,
        wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
        processors=processors,
        logger_factory=logger_factory,
    )

Allow configuring log renderer at launch §

Finally, let’s change the main function to set the renderer using the first command-line argument. This way we can do poetry run service-run console to log to the console, and poetry run service-run journald to log to journald:

py_sysd_pkg/service.py
import time
import structlog
import sys

from py_sysd_pkg.app_logging import logging_setup

def main():
    logging_setup(renderer=sys.argv[1])

    logger = structlog.get_logger()

    logger.info("starting up")
    try:
        while True:
            t = time.time()
            logger.info("example message", my_counter=t)
            time.sleep(1)
    except SystemExit:
        logger.info("shutting down")

This use of sys.argv is brittle as it does no error handling, but it’s good enough for testing purposes.
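If this grew beyond a test harness, something like argparse would be a more robust way to select the renderer. A hypothetical sketch, not part of the project as written:

import argparse

def parse_args():
    # Hypothetical replacement for the bare sys.argv[1] lookup.
    parser = argparse.ArgumentParser(description="Run the example service.")
    parser.add_argument(
        "renderer",
        choices=["console", "journald"],
        help="where to send log output",
    )
    return parser.parse_args()

# In main(): logging_setup(renderer=parse_args().renderer)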

Testing §

Let’s run with console as the argument, and check that we only log to the console:

$ poetry run service-run console
2023-11-30T15:02:27.495584Z [info     ] starting up
2023-11-30T15:02:27.495672Z [info     ] example message                my_counter=1701356547.4956632
2023-11-30T15:02:28.497819Z [info     ] example message                my_counter=1701356548.4976702
2023-11-30T15:02:29.499876Z [info     ] example message                my_counter=1701356549.4998329
2023-11-30T15:02:30.501982Z [info     ] example message                my_counter=1701356550.5019195
2023-11-30T15:02:31.504781Z [info     ] example message                my_counter=1701356551.504495
^CTraceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/sam/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 18, in main
    time.sleep(1)
KeyboardInterrupt

Running journalctl shows that nothing is being logged to journald, so we are only logging to the console. Great!

Now let’s try running with journald as the argument:

$ poetry run service-run journald
^CTraceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/sam/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 18, in main
    time.sleep(1)
KeyboardInterrupt

Nothing coming out on the console. So far, so good. Let’s check journalctl:

$ journalctl
...
Nov 30 15:02:53 sam-virtual-machine python[57948]: starting up
Nov 30 15:02:53 sam-virtual-machine python[57948]: example message
Nov 30 15:02:54 sam-virtual-machine python[57948]: example message
Nov 30 15:02:55 sam-virtual-machine python[57948]: example message
Nov 30 15:02:56 sam-virtual-machine python[57948]: example message

Great, it’s logging to journald. Let’s check the JSON output to verify our custom fields are coming through:

$ journalctl -o json-pretty
...
{
        "SYSLOG_IDENTIFIER" : "python",
        "CODE_FILE" : "/home/sam/projects/py_sysd_pkg/py_sysd_pkg/journald_logger.py",
        "MESSAGE" : "example message",
        "TIMESTAMP" : "2023-11-30T15:02:56.833203Z",
        "LEVEL" : "info",
        "MY_COUNTER" : "1701356576.833088",
        # ...other fields removed for brevity...
}

Looks like it all works!

Summary §

In summary, the best approach for logging to journald with structlog is: