Python: how (not) to reset a dataclass
In Python, it can be useful to reset a dataclass back to its initial values, but this can be easy to get subtly wrong.
Resetting a dataclass (the wrong way)
Can you spot the mistake?
Given this simple dataclass:
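Here is a minimal sketch of such a dataclass. The field types and default values are assumptions for illustration; the key assumption is that issues is declared without a type annotation, which is what the later section on typing implies.

```python
from dataclasses import dataclass


@dataclass
class MyData:
    widgets: int = 0  # annotated, so the dataclass decorator treats it as a field
    issues = []       # assumed: no annotation, so this is a plain class attribute
```

Note that no error is raised here: the decorator only looks at annotated names, so it never sees issues at all.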
We follow these steps to modify and then (try) to reset it:
1. Create an instance of the MyData dataclass.
2. Modify both the widgets and issues fields.
3. Check the fields have changed.
4. (Attempt to) reset the dataclass by assigning a new instance.
5. Check the fields have been reset.
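The five steps above can be sketched as follows. The class body and the concrete values (3, "it broke") are assumptions chosen for illustration, with issues again left unannotated.

```python
from dataclasses import dataclass


@dataclass
class MyData:
    widgets: int = 0
    issues = []  # assumed: declared without a type annotation


# 1. Create an instance.
data = MyData()

# 2. Modify both widgets and issues.
data.widgets = 3
data.issues.append("it broke")

# 3. Check the values have changed.
changed = (data.widgets, list(data.issues))
print(changed)  # (3, ['it broke'])

# 4. (Attempt to) reset by assigning a new instance.
data = MyData()

# 5. Check the values have been reset.
after_reset = (data.widgets, list(data.issues))
print(after_reset)  # (0, ['it broke'])
```

The second print shows widgets back at 0 while issues still holds the old entry.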
What just happened? The widgets field reset but the issues field did not!
I’ll give you a moment to think about it.
🕛…
🕐…
🕑…
🕒…
🕓…
🕔…
🕕…
🕖…
🕗…
🕘…
🕙…
🕚…
🕛…
Okay!
The problem
The problem is that the default value for issues (a list) is mutable.
The Python docs on dataclasses have a whole section about this, with some great examples.
The key piece of information is:
Python stores default member variable values in class attributes.
The default value is a class attribute, which is shared between all instances of the class. Appending to issues mutates that one shared list in place, so creating a new MyData instance does not reset it: every instance of the MyData class sees the same issues attribute. [1]
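To make the sharing visible, here is a small sketch that compares object identity across two instances and the class itself. It assumes the unannotated issues = [] declaration; the names first and second are just for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class MyData:
    widgets: int = 0
    issues = []  # assumed: declared without a type annotation


first = MyData()
second = MyData()

# Only annotated names become dataclass fields.
field_names = [f.name for f in fields(MyData)]
print(field_names)  # ['widgets']

# Both instances resolve issues to the very same list object on the class.
shared = first.issues is second.issues is MyData.issues
print(shared)  # True

# Mutating through one instance is visible through the other.
first.issues.append("oops")
print(second.issues)  # ['oops']
```

By contrast, data.widgets = 3 rebinds an attribute on one instance only, which is why widgets appears to reset and issues does not.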
Why typing is good
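Did you know that adding a type annotation will catch this issue?
If you annotate the attribute as issues: list = [] then it becomes a real dataclass field, and the dataclass decorator will helpfully raise a ValueError as soon as the class is defined!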
There’s a good reason to continue adding type hints to your Python code!
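A quick way to see this for yourself is to define the annotated version inside a try block. This is only a sketch: the class body is assumed, and the exact wording of the message may vary between Python versions, so it is printed rather than quoted.

```python
from dataclasses import dataclass

try:
    @dataclass
    class MyData:
        widgets: int = 0
        issues: list = []  # annotated, so now a real field with a mutable default
except ValueError as error:
    # Raised by the decorator at class-definition time, before any instance exists.
    message = str(error)
    print(message)
```

The error appears before any instance is created, so the bug cannot slip through to runtime use.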
The solution
As the ValueError tells us, use a default_factory as follows:
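A minimal sketch of the fixed dataclass, using field(default_factory=list) from the standard dataclasses module. The widgets field and its default are assumed for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MyData:
    widgets: int = 0
    # default_factory is called once per new instance, so each gets its own list.
    issues: list = field(default_factory=list)
```

Passing the list type itself as the factory means every MyData() call builds a fresh empty list instead of reusing one shared object.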
Then if we run the same steps as before:
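Here are the same five steps again, sketched against the default_factory version. The concrete values are the same assumed ones as before.

```python
from dataclasses import dataclass, field


@dataclass
class MyData:
    widgets: int = 0
    issues: list = field(default_factory=list)  # fresh list per instance


# 1. Create an instance.
data = MyData()

# 2. Modify both fields.
data.widgets = 3
data.issues.append("it broke")

# 3. Check the fields have changed.
changed = (data.widgets, list(data.issues))
print(changed)  # (3, ['it broke'])

# 4. Reset by assigning a new instance.
data = MyData()

# 5. Check the fields have been reset.
after_reset = (data.widgets, list(data.issues))
print(after_reset)  # (0, [])
```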
All good!
[1] Yes, this is very similar to why you should not use mutable default arguments in Python functions. They get shared too.