Using structlog and journald
I was surprised to find practically nothing online regarding using structlog with journald. This seemed odd given how both are strong proponents of structured logging.
Update 2023-12-08: After writing this up, I did find this structlog issue, which seems to indicate there is interest in journald integration.
To skip straight to my example implementation, jump to the last section of these notes.
If you want to see how I got there, carry on reading. In the following notes I experiment with connecting structlog up to journald before converging onto a good solution. The end result is a Python application that uses structlog, and can be configured to send logs either to the console or journald. This allows the application to be run as a command line tool, or as a service under systemd.
Table of contents §
The three attempts are:
- 🤔 First attempt: logging to journald from structlog from inside the processor chain (works, but not ideal).
- 😐 Second attempt: fixing the double logging issue from the first attempt (still not ideal).
- 🙂 Third attempt: implementing a custom journald logger (much better).
First attempt: logging to journald from structlog from inside the processor chain §
Implementation §
Since I’m using systemd-python
, somewhere I need to call journal.send()
to send the logs to journald. The ideal place for that looks like it would be in a logger, which is created by a LoggerFactory. However I don’t quite have my head around how it all fits together, so let’s start by writing to journald from within the processor chain, since this looks like an easy way to get our hands on the log event.
So first let’s create a JournaldRenderer
. All it does is takes the incoming event_dict
, converts it into the right format for journald, then calls journal.send()
to write to journald. Then it passes the event_dict
unmodified on to the next processor in the chain. It’s a little bit sneaky, but it should work.
from structlog.typing import WrappedLogger, EventDict
from systemd import journal
class JournaldRenderer:
def __call__(self, logger: WrappedLogger, name: str, event_dict: EventDict):
event_dict_upper = {
key.upper(): value
for key, value in event_dict.items()
if key != "event"
}
journal.send(event_dict["event"], **event_dict_upper)
return event_dict
Next we need to call structlog.configure()
somewhere and add our JournaldRenderer
into the processors chain.
This is a very help example of a fast, production-ready use of structlog.configure()
. We’ll use it as the starting point. Though instead of JSONRenderer
we’ll use our JournaldRenderer
, and instead of using structlog.BytesLoggerFactory()
as the logger factory we’ll use structlog.PrintLoggerFactory()
, which means our output will get printed to the console.
import structlog
import logging
from py_sysd_pkg.journald_renderer import JournaldRenderer
def logging_setup():
structlog.configure(
cache_logger_on_first_use=True,
wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
processors=[
structlog.processors.add_log_level,
structlog.processors.TimeStamper(fmt="iso", utc=True),
structlog.processors.StackInfoRenderer(),
JournaldRenderer(),
structlog.dev.ConsoleRenderer(),
],
logger_factory=structlog.PrintLoggerFactory(),
)
Then, in the very basic main loop for this application (created using these steps), we’ll call logging_setup()
and do a bit of logging.
import time
import structlog
from py_sysd_pkg.app_logging import logging_setup
def main():
logging_setup()
logger = structlog.get_logger()
logger.info("starting up")
try:
while True:
t = time.time()
logger.info("example message", my_counter=t)
time.sleep(1)
except SystemExit:
logger.info("shutting down")
Testing §
When run from the CLI, we can see the logs are formatted nicely by the logger which our PrintLoggerFactory()
has created:
$ poetry run service-run
2023-11-30T13:32:38.498161Z [info ] starting up
2023-11-30T13:32:38.498610Z [info ] example message my_counter=1701351158.4986005
2023-11-30T13:32:39.501073Z [info ] example message my_counter=1701351159.5009449
2023-11-30T13:32:40.503285Z [info ] example message my_counter=1701351160.5032387
2023-11-30T13:32:41.505230Z [info ] example message my_counter=1701351161.505109
2023-11-30T13:32:42.507554Z [info ] example message my_counter=1701351162.5074105
^CTraceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/squirrel/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 17, in main
time.sleep(1)
KeyboardInterrupt
Then if we use journalctl
, we can see it is logging to journald too:
Nov 30 13:33:03 squirrel-virtual-machine systemd[1]: Started Foo.
Nov 30 13:33:03 squirrel-virtual-machine python[50822]: starting up
Nov 30 13:33:03 squirrel-virtual-machine python[50822]: example message
Nov 30 13:33:03 squirrel-virtual-machine service-run[50822]: 2023-11-30T13:33:03.441396Z [info ] starting up
Nov 30 13:33:03 squirrel-virtual-machine service-run[50822]: 2023-11-30T13:33:03.441794Z [info ] example message my_counter=1701351183.4417818
Nov 30 13:33:04 squirrel-virtual-machine python[50822]: example message
Nov 30 13:33:04 squirrel-virtual-machine service-run[50822]: 2023-11-30T13:33:04.443537Z [info ] example message my_counter=1701351184.443479
Nov 30 13:33:05 squirrel-virtual-machine python[50822]: example message
Nov 30 13:33:05 squirrel-virtual-machine service-run[50822]: 2023-11-30T13:33:05.445379Z [info ] example message my_counter=1701351185.4453325
Nov 30 13:33:06 squirrel-virtual-machine python[50822]: example message
Nov 30 13:33:06 squirrel-virtual-machine service-run[50822]: 2023-11-30T13:33:06.447454Z [info ] example message my_counter=1701351186.447395
Nov 30 13:33:07 squirrel-virtual-machine python[50822]: example message
Nov 30 13:33:07 squirrel-virtual-machine service-run[50822]: 2023-11-30T13:33:07.450095Z [info ] example message my_counter=1701351187.4500322
Notice anything wrong?
This is a good first pass, but has a signficant issue: everything gets logged to journald twice, the latter of which is unstructured. This is because we are explicitly writing to journald with journal.send()
, and then journald is helpfully collecting the logs from our logger created by PrintLoggerFactory()
, which are unstructured.
Second attempt: fixing the double logging §
Let’s build upon the previous work, but fix the double logging.
Implementaton §
First, we’ll update JournaldRenderer
to add a journald_only
option. If true, this will raise structlog.DropEvent
after sending the log to journald, which will silently drop that log entry.
This way the log entry will always go to journald, but we can toggle whether logs make it to the console. This is not very efficient, but gets us a step closer (structlog has a clever solution for efficient dropping based on log level, which explains more about why filtering in this manner is not super efficient).
import structlog
from structlog.typing import WrappedLogger, EventDict
from systemd import journal
class JournaldRenderer:
def __init__(self, journald_only=False):
self.journald_only = journald_only
def __call__(self, logger: WrappedLogger, name: str, event_dict: EventDict):
event_dict_upper = {
key.upper(): value
for key, value in event_dict.items()
if key != "event"
}
journal.send(event_dict["event"], **event_dict_upper)
if self.journald_only:
raise structlog.DropEvent
return event_dict
Next we need to update our configure code to pass the journald_only
option through. This is just plumbing:
import structlog
import logging
from py_sysd_pkg.journald_renderer import JournaldRenderer
def logging_setup(journald_only=False):
structlog.configure(
cache_logger_on_first_use=True,
wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
processors=[
structlog.processors.add_log_level,
structlog.processors.TimeStamper(fmt="iso", utc=True),
structlog.processors.StackInfoRenderer(),
JournaldRenderer(journald_only=journald_only),
structlog.dev.ConsoleRenderer(),
],
logger_factory=structlog.PrintLoggerFactory(),
)
Then in our main application we can hard-code the journald_only
to true or false as needed. (We’ll move it out to be configurable later):
import time
import structlog
from py_sysd_pkg.app_logging import logging_setup
def main():
logging_setup(journald_only=True)
logger = structlog.get_logger()
logger.info("starting up")
try:
while True:
t = time.time()
logger.info("example message", my_counter=t)
time.sleep(1)
except SystemExit:
logger.info("shutting down")
We’re now ready to test again.
Testing §
With journald_only=True
, now nothing gets logged to the console:
$ poetry run service-run
^CTraceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/squirrel/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 17, in main
time.sleep(1)
KeyboardInterrupt
This is good.
If we use journalctl
to view journald logs, we do see logs.
Nov 30 13:50:32 squirrel-virtual-machine systemd[1]: Started Foo.
Nov 30 13:50:32 squirrel-virtual-machine sudo[52445]: pam_unix(sudo:session): session closed for user root
Nov 30 13:50:32 squirrel-virtual-machine python[52449]: starting up
Nov 30 13:50:32 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:33 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:34 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:35 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:36 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:37 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:38 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:39 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:40 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:41 squirrel-virtual-machine python[52449]: example message
Nov 30 13:50:41 squirrel-virtual-machine systemd[1]: Stopping Foo...
Nov 30 13:50:41 squirrel-virtual-machine systemd[1]: foo.service: Deactivated successfully.
Nov 30 13:50:41 squirrel-virtual-machine systemd[1]: Stopped Foo.
Also good.
We have fixed the double logging issue, but the implementation is not ideal:
- The logging is done from within the processor chain, which still seems wrong.
- We can only toggle whether or not logs go to the console. Logs always go to journald. It would be nice to make this more flexible.
Third attempt: implementing a custom journald logger §
Let’s try and do things properly this time.
Implementation §
Putting the calls to journal.send()
inside JournaldRenderer
within the list of processors feels like the wrong approach. The processors are intended to manipulate the log event, not perform I/O. That should be done by the logger.
It wasn’t clear to me at first, but after taking a closer look at this example of creating loggers manually, and reading that Logger Factories just need to be a callable, I felt more comfortable with the idea of creating my own logger factory, and putting the journal.send()
in the resulting logger.
In short, the value for logger_factory
is just a callable. So you can supply a logger factory instance with a __call__
method. This is what LoggerFactory
does: it just calls logger.getLogger(name)
, and automatically populates the name
field for you. Alternatively, you can supply a logger class (i.e. pass in the class without the parentheses at the end ()
). For example, you can do logger_factory=structlog.PrintLogger
and the “factory” will create a new PrintLogger
each time it is invoked.
Create a custom journald logger §
Taking a closer look at PrintLogger
, all we really need to do is inherit from that class and override the msg()
method with one that calls journal.send()
instead of print
. Though after doing so, we must also replace log
, debug
, info
, warn
and warning
with our new msg
method, because they need overriding too. This is also what PrintLogger
does:
from systemd import journal
import structlog
class JournaldLogger(structlog.PrintLogger):
def msg(self, msg_text, msg_dict):
journal.send(msg_text, **msg_dict)
log = debug = info = warn = warning = msg
fatal = failure = err = error = critical = exception = msg
Note how our msg()
method expects two arguments:
msg_text
: The “message” that gets logged to journald.msg_dict
: A dictionary of uppercase keys (i.e. custom journald fields) to values.
Use a custom journald renderer §
We need to modify our JournaldRenderer
to pass the log event in this format (and remove the calls to journal.send()
from the renderer):
from structlog.typing import WrappedLogger, EventDict
class JournaldRenderer:
def __call__(self, logger: WrappedLogger, name: str, event_dict: EventDict):
event_dict_upper = {
key.upper(): value
for key, value in event_dict.items()
if key != "event"
}
return {"msg_text": event_dict["event"], "msg_dict": event_dict_upper}
The renderer is now much simpler. All it does is renames the event
key to msg_text
, and converts the rest of the keys to uppercase (as expected by journald) and packs them in msg_dict
as a dictionary.
Configure logging to use custom logger and renderer §
Now we need to change the logging setup to support journald logging or console logging:
- If journald, use
JournaldRenderer()
as the last processor (to output{"msg_text": "...", "msg_dict": {...}}
), andJournaldLogger
as the logger factory to send the logs entries to journald. - If console, use
ConsoleRenderer()
as the last processor (to output a string suitable for printing in the console), andstructlog.PrintLoggerFactory()
as the logger factory to useprint
to output the log entries.
Let’s replace the journald_only
flag with the renderer
option, which takes a string. Then we set the processors and logger factory as desired:
import structlog
import logging
from py_sysd_pkg.journald_renderer import JournaldRenderer
from py_sysd_pkg.journald_logger import JournaldLogger
def logging_setup(renderer):
processors = [
structlog.processors.add_log_level,
structlog.processors.TimeStamper(fmt="iso", utc=True),
structlog.processors.StackInfoRenderer(),
]
if renderer == "journald":
processors.append(JournaldRenderer())
logger_factory = JournaldLogger
elif renderer == "console":
processors.append(structlog.dev.ConsoleRenderer())
logger_factory = structlog.PrintLoggerFactory()
else:
raise ValueError("Unknown renderer")
structlog.configure(
cache_logger_on_first_use=True,
wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
processors=processors,
logger_factory=logger_factory,
)
Allow configuring log renderer at launch §
Finally, let’s change main function to set renderer
using the first command line argument. This way we can do poetry run service-run console
to log to console, and poetry run service-run journald
to log to journald:
import time
import structlog
import sys
from py_sysd_pkg.app_logging import logging_setup
def main():
logging_setup(renderer=sys.argv[1])
logger = structlog.get_logger()
logger.info("starting up")
try:
while True:
t = time.time()
logger.info("example message", my_counter=t)
time.sleep(1)
except SystemExit:
logger.info("shutting down")
This use of sys.argv
is brittle as it does no error handling, but it’s good enough for testing purposes.
Testing §
Let’s run with console
as the argument, and check that we only log to the console:
$ poetry run service-run console
2023-11-30T15:02:27.495584Z [info ] starting up
2023-11-30T15:02:27.495672Z [info ] example message my_counter=1701356547.4956632
2023-11-30T15:02:28.497819Z [info ] example message my_counter=1701356548.4976702
2023-11-30T15:02:29.499876Z [info ] example message my_counter=1701356549.4998329
2023-11-30T15:02:30.501982Z [info ] example message my_counter=1701356550.5019195
2023-11-30T15:02:31.504781Z [info ] example message my_counter=1701356551.504495
^CTraceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/squirrel/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 18, in main
time.sleep(1)
KeyboardInterrupt
Running journalctl
shows that nothing is being logged to journald, so we are only logging to the console. Great!
Now let’s try running with journald
as the argument:
$ poetry run service-run journald
^CTraceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/squirrel/projects/py_sysd_pkg/py_sysd_pkg/service.py", line 18, in main
time.sleep(1)
KeyboardInterrupt
Nothing coming out on the console. So far, so good. Let’s check journalctl
:
$ journalctl
...
Nov 30 15:02:53 squirrel-virtual-machine python[57948]: starting up
Nov 30 15:02:53 squirrel-virtual-machine python[57948]: example message
Nov 30 15:02:54 squirrel-virtual-machine python[57948]: example message
Nov 30 15:02:55 squirrel-virtual-machine python[57948]: example message
Nov 30 15:02:56 squirrel-virtual-machine python[57948]: example message
Great, it’s logging to journald. Let’s check the JSON output to verify our custom fields are coming through:
$ journalctl -o json-pretty
...
{
"SYSLOG_IDENTIFIER" : "python",
"CODE_FILE" : "/home/squirrel/projects/py_sysd_pkg/py_sysd_pkg/journald_logger.py",
"MESSAGE" : "example message",
"TIMESTAMP" : "2023-11-30T15:02:56.833203Z",
"LEVEL" : "info",
"MY_COUNTER" : "1701356576.833088",
# ...other fields remove for brevity...
}
Looks like it all works!
Summary §
In summary, the best approach for logging to journald with structlog is:
- Create a custom journald logger, which calls
journal.send()
. - Create a custom journald renderer, which prepares the log entries for
journal.send()
. - Configure the logging to use our custom logger and renderer.
- (Optionally) If you want to be able to switch between logging to journald or the console, add a command line option to switch between the two.