Zig: First impressions
I have been working through the excellent Crafting Interpreters by Robert Nystrom. Specifically, “part III: A Bytecode Virtual Machine”. For this part the book uses C, but I decided to use Zig. I’ve never used Zig before. Following are some rough notes from my first experience of using Zig.
Top-level §
🚀 Cool stuff:
- Private by default: Functions are private by default. Must add pub to make them public.
- Explicit casts: All casts are explicit, e.g. use @intFromEnum() to convert an enum to an int. (See this blog post for more details).
- No unused variables: Return values and function arguments must be used (or explicitly unused with _ = value;).
- Only writeable if necessary: Variables are either const (read-only) or var (can be modified). Unnecessarily using var is a compile error.
- Handle all cases: In a switch statement, the compiler requires all cases to be handled. The keyword else can be used to catch all remaining cases, but it is a compile error to use else unnecessarily.
- Unambiguous arithmetic: Something as simple as dividing a signed integer can be ambiguous. Should -5 / 2 equal -2 or -3? Zig disallows / in such cases, and provides @divFloor and @divTrunc instead.
- Better pointers: Directly using pointers is possible and fairly commonplace. Null pointers are not permitted; instead, optionals provide a safe alternative.
⚖️ Neutral:
- Fixed, mandatory arguments: A function has only one definition (no overloading), and the number of arguments is fixed (no variadics). A visible implication is that printing a format string without any arguments still necessitates an empty struct, e.g. std.debug.print("Hello, world!", .{}) where .{} is an empty struct.
- Namespaced imports: It’s easy to use @import to include part or all of another Zig source file, and this respects whether items are marked as pub (public), but importing multiple items from one source file feels a little awkward.
- Occasionally confusing error messages: It’s easy to accidentally write code that triggers confusing error messages from the compiler, but perhaps this becomes easier to understand with more experience? It helps to pay close attention to any NOTE within the error message, but this is not always enough.
- No closures: Zig does not support closures because it would be too easy to accidentally close over a pointer and clobber the stack, and closures would often be half-comptime, which Zig does not have a good way of handling.
- Workaround for inline functions: Zig does not support defining a function inline inside another function, but a workaround is possible, though it requires a bit of verbosity.
💢 Pain points
- No nice way to do designated initialisers: See this issue.
- No particularly nice way to do partial initialisation: See this issue.
- No way to include metadata with an error: Errors are just enums, and can include no additional metadata. So if you have a SyntaxError when parsing a file, the error cannot store the line on which the syntax error occurred.
- Outdated information online: Zig is in beta, and significant backwards-incompatible changes are still par for the course. It’s common to find answers or blog posts online that are outdated. For example, at some point recently @enumToInt() was renamed to @intFromEnum().
Basics §
First let’s run through a “Hello, world!” example and some other basics of Zig.
Hello world §
Without further ado:
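A minimal sketch of such a program, assuming a recent Zig release (0.12 or 0.13) and using std.debug.print, which writes to stderr:

```zig
const std = @import("std");

pub fn main() void {
    // The second argument holds the format arguments; there are none here,
    // so an empty .{} is passed.
    std.debug.print("Hello, world!\n", .{});
}
```

Saved as a file such as hello.zig (a hypothetical name), this can be run with zig run hello.zig.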
Note that the empty struct .{} is necessary due to the fixed, mandatory arguments requirement.
If omitted, Zig will give the following compiler error:
Note that the suggested -freference-trace flag can be quite useful when the call stack is longer.
Importing §
Some more examples of importing:
This could be rewritten as:
This feels a bit awkward coming from Python’s from module import x, y, z syntax, or from Rust’s use syntax, but is still a big improvement over C’s lack of namespacing. See the Zig docs for details.
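As a rough illustration of the one-binding-per-item style, here is a sketch that pulls a few declarations out of the standard library. The local names chosen (print, Allocator, expect) are just a common convention, not something Zig requires:

```zig
const std = @import("std");

// Each imported item gets its own const binding; there is no
// "import x, y, z" form.
const print = std.debug.print;
const Allocator = std.mem.Allocator;
const expect = std.testing.expect;

pub fn main() void {
    print("Allocator is the type {s}\n", .{@typeName(Allocator)});
}
```

The same pattern applies to your own files: @import("some_file.zig") returns a struct-like namespace whose pub declarations can be bound to local names in the same way.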
Fixed, mandatory arguments §
Zig doesn’t have function overloading, and it doesn’t have variadic functions (functions with an arbitrary number of arguments). But it does have a compiler capable of creating specialized functions based on the types passed, including types inferred and created by the compiler itself.
This can be a pain at first, but you soon learn to append .{} when calling print without any arguments.
Setting up ZLS can go a long way to improve the quality of life in this area, since it shows the function arguments in the calling code.
Anonymous struct §
In the absence of variadic arguments, it is common to instead use an anonymous struct to pass a variable number of arguments. This can be seen in the print function.
The anonymous struct is an unnamed struct which is defined with an opening .{ and a closing }. Arbitrary fields can be listed inside.
There was a proposal to simplify the syntax from .{} to just {}, but that seems unlikely to happen due to the complications it would cause with parsing edge cases. Think along the same lines as the Python empty set issue.
Printing §
An example of using print, but this time with some arguments in the anonymous struct:
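A small sketch of what this can look like. The variable names and values are made up for illustration:

```zig
const std = @import("std");

pub fn main() void {
    const name = "Zig";
    const answer: u32 = 42;
    // {s} formats a string and {d} formats a number in decimal; the values
    // are taken from the anonymous struct in order.
    std.debug.print("Hello, {s}! The answer is {d}.\n", .{ name, answer });
}
```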
Logging §
Using std.debug.print is a quick way to get started, but has limitations because it makes certain choices for us, e.g. it prints to stderr, implements a lock, and errors get discarded. In some cases this can result in std.debug.print failing with the following error:
A quick fix is to use std.log.debug instead, though note that a newline is appended automatically. See this gist (Zig v0.12.0+) for more details, or take a look at the Zig log source.
Comments §
There are three types of comments:
- Use // for normal comments.
- Use /// for doc-comments, e.g. for documenting a struct, function or enum.
- Use //! for top-level doc-comments, i.e. for documenting the module.
It is a compiler error to use /// or //! in the wrong place.
There are no multiline comments in Zig (e.g. like /* */ comments in C). This allows Zig to have the property that each line of code can be tokenized out of context.
Maths §
Division §
When dividing a signed integer, the programmer may make assumptions about which direction the result is rounded (e.g. towards zero or towards negative infinity). To avoid this ambiguity, Zig disallows using / for signed integer division:
Instead, Zig requires that you choose explicitly:1
- @divFloor: Rounds towards negative infinity, e.g. @divFloor(-5, 2) = -3
- @divTrunc: Rounds towards zero, e.g. @divTrunc(-5, 2) = -2
- @divExact: Assumes no rounding required, for when there is guaranteed to be no remainder.
So, to fix the above example, it probably makes most sense to always round to negative infinity so that “negative” seconds (those before 1970-01-01) are treated the same as “positive” seconds (those after):
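To make the difference concrete, here is a sketch using a hypothetical dayNumber helper (not from the original example) that converts a Unix timestamp in seconds into a day count:

```zig
const std = @import("std");

const seconds_per_day = 86_400;

// Hypothetical helper: which day, relative to 1970-01-01, a timestamp falls on.
fn dayNumber(seconds: i64) i64 {
    // Round towards negative infinity so that times before the epoch land
    // on the previous day rather than on day 0.
    return @divFloor(seconds, seconds_per_day);
}

pub fn main() void {
    const before_epoch: i64 = -1; // one second before 1970-01-01
    std.debug.print("floor: {d}\n", .{dayNumber(before_epoch)}); // prints "floor: -1"
    std.debug.print("trunc: {d}\n", .{@divTrunc(before_epoch, seconds_per_day)}); // prints "trunc: 0"
}
```

With @divTrunc, the last second of 1969 would be reported as belonging to day 0, the same day as the first second of 1970.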
Float to integer §
Use @intFromFloat to convert a float to an integer (@floatFromInt converts in the opposite direction):
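A short sketch of both directions, assuming Zig 0.11 or later, where these builtins infer their result type from the destination. The helper names toFloat and toInt are made up for illustration:

```zig
const std = @import("std");

fn toFloat(n: i32) f64 {
    // The result type (f64) is inferred from the function's return type.
    return @floatFromInt(n);
}

fn toInt(x: f64) i32 {
    // Keeps only the integer part, i.e. rounds towards zero.
    return @intFromFloat(x);
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ toFloat(3), toInt(-2.9) });
}
```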
Types §
Let’s look at various data types and structures we have at our disposal.
Enums §
Enums are straightforward:
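For instance, a sketch with a hypothetical OpCode enum, loosely in the spirit of a bytecode VM (the variant names are invented for this example):

```zig
const std = @import("std");

// Hypothetical opcode set backed by a u8.
const OpCode = enum(u8) {
    constant,
    add,
    ret,
};

fn opName(op: OpCode) []const u8 {
    // Every variant is covered, so no else branch is needed (or allowed).
    return switch (op) {
        .constant => "OP_CONSTANT",
        .add => "OP_ADD",
        .ret => "OP_RETURN",
    };
}

pub fn main() void {
    const byte: u8 = @intFromEnum(OpCode.add); // explicit enum -> int
    const op: OpCode = @enumFromInt(byte); // explicit int -> enum
    std.debug.print("{d} {s}\n", .{ byte, opName(op) }); // prints "1 OP_ADD"
}
```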
Structs §
Union §
Slices §
Slices can be thought of as a pair of [*]T (the pointer to the data) and a usize (the element count). Their syntax is []T, with T being the child type. Slices are used heavily throughout Zig for when you need to operate on arbitrary amounts of data. Slices have the same attributes as pointers, meaning that there also exists const slices. For loops also operate over slices.
Slices are very useful for passing around a window into an array. It is also very handy that they know their own length, whereas in C you would have to pass that separately.
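A sketch of passing a window of an array to a function. The sum helper is hypothetical:

```zig
const std = @import("std");

// A slice carries its own length, so no separate count parameter is needed.
fn sum(values: []const u32) u32 {
    var total: u32 = 0;
    for (values) |v| total += v;
    return total;
}

pub fn main() void {
    const array = [_]u32{ 1, 2, 3, 4, 5 };
    const window: []const u32 = array[1..4]; // elements 2, 3 and 4
    std.debug.print("len={d} sum={d}\n", .{ window.len, sum(window) }); // prints "len=3 sum=9"
}
```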
Strings §
There is not really any such thing as a string in Zig. String literals in the code coerce to []const u8, i.e. a slice of 8-bit unsigned bytes, and are null terminated (for ease of compatibility with C).
String literals in Zig coerce to []const u8.
Pointers §
While using “raw pointers in Rust is uncommon”, they seem more commonplace in Zig. The syntax is different to C, but similar enough that it soon feels familiar. As a helpful addition, Zig does not permit “0 or null as a value” for pointers; instead, optionals should be used to represent “null” pointers.
Referencing is done with &variable, and dereferencing is done with variable.*.
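A minimal sketch of both operations, using a hypothetical increment function:

```zig
const std = @import("std");

fn increment(counter: *u32) void {
    counter.* += 1; // dereference with .*
}

pub fn main() void {
    var count: u32 = 41;
    increment(&count); // take the address with &
    std.debug.print("{d}\n", .{count}); // prints "42"
}
```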
Optionals §
Use ? for optional variables.
This is how null pointers in Zig work - they must be unwrapped to a non-optional before dereferencing, which stops null pointer dereferences from happening accidentally.
Optionals can be unwrapped using Payload Captures.
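A sketch of unwrapping with a payload capture, plus orelse for a fallback value. The firstEven helper is made up for illustration:

```zig
const std = @import("std");

// Hypothetical helper: the first even value, or null if there is none.
fn firstEven(values: []const u32) ?u32 {
    for (values) |v| {
        if (v % 2 == 0) return v;
    }
    return null;
}

pub fn main() void {
    // Payload capture: inside the if branch, `even` is a plain u32.
    if (firstEven(&[_]u32{ 1, 3, 4 })) |even| {
        std.debug.print("found {d}\n", .{even});
    } else {
        std.debug.print("nothing found\n", .{});
    }

    // orelse supplies a fallback when the optional is null.
    const fallback = firstEven(&[_]u32{ 1, 3 }) orelse 0;
    std.debug.print("fallback {d}\n", .{fallback});
}
```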
If you’re sure the unwrap is safe, you can use .?, e.g.:
However, this will cause a panic if you’re wrong. For example, if we swap around the assignment lines:
Undefined is not the same as null §
Use undefined to leave a variable uninitialised.
undefined can be coerced to any type. Once this happens, it is no longer possible to detect that the value is undefined. undefined means the value could be anything, even something that is nonsense according to the type. Translated into English, undefined means “Not a meaningful value. Using this value would be a bug. The value will be unused, or overwritten before being used.” Zig Language Reference 0.13.0 Documentation - Values - Assignment - undefined.
Memory management §
I need to give this one some more thought, but in short: Zig does not guarantee memory safety, but enables and encourages safe memory management practices.
Unlike Rust, there is no explicit unsafe keyword; but unlike C, the compiler will prevent many common mistakes.
Memory allocators §
Memory allocation in Zig is explicit, i.e. you must create an allocator, and use that to allocate memory. It is often necessary to pass the allocator around.
Zig provides several memory allocators. The appropriate one will depend upon your needs.
Allocator example §
Following is an example of creating a general purpose memory allocator, using it to parse the arguments (e.g. argc and argv in C; sys.argv in Python), and then switching based on the arguments:
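A sketch along those lines, assuming Zig 0.12 or 0.13 (newer releases may have changed parts of this API). The Mode enum and modeFor helper are hypothetical names added to keep the switch easy to test:

```zig
const std = @import("std");

const Mode = enum { repl, run_file, usage_error };

// Hypothetical helper: decide what to do from the argument count, which
// includes the program name itself.
fn modeFor(arg_count: usize) Mode {
    return switch (arg_count) {
        1 => .repl,
        2 => .run_file,
        else => .usage_error,
    };
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit(); // reports leaks in debug builds
    const allocator = gpa.allocator();

    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    switch (modeFor(args.len)) {
        .repl => std.debug.print("no arguments: start a REPL\n", .{}),
        .run_file => std.debug.print("run file: {s}\n", .{args[1]}),
        .usage_error => std.debug.print("usage: {s} [path]\n", .{args[0]}),
    }
}
```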
Passing the allocator §
In general, pass the allocator by value rather than by pointer (i.e. prefer Allocator over *Allocator).2
Flow control §
Unreachable §
The unreachable keyword is pretty neat:
Deferring §
defer §
Use defer to execute a statement when the current block exits:
Defer is useful to ensure that resources are cleaned up when they are no longer needed. Instead of needing to remember to manually free up the resource, you can add a defer statement right next to the statement that allocates the resource.
When there are multiple defers in a single block, they are executed in reverse order.
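The reverse ordering can be seen in a sketch like the following, where a hypothetical recordOrder function writes one letter per step into a caller-supplied buffer:

```zig
const std = @import("std");

fn recordOrder(out: *[3]u8) void {
    var i: usize = 0;
    defer out[i] = 'A'; // registered first, runs last
    defer {
        out[i] = 'B'; // registered second, runs before the defer above
        i += 1;
    }
    out[i] = 'C'; // the body runs before any deferred statement
    i += 1;
}

pub fn main() void {
    var buf: [3]u8 = undefined;
    recordOrder(&buf);
    std.debug.print("{s}\n", .{&buf}); // prints "CBA"
}
```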
errdefer §
Similar to defer is errdefer. The key difference is that errdefer evaluates the deferred expression if and only if an error is subsequently returned from within that block.
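A typical use is freeing a resource only when a later step fails. The sketch below uses a hypothetical makeBuffer function, with a fail flag added purely to trigger the error path:

```zig
const std = @import("std");

fn makeBuffer(allocator: std.mem.Allocator, fail: bool) ![]u8 {
    const buf = try allocator.alloc(u8, 16);
    // Runs only if this function goes on to return an error.
    errdefer allocator.free(buf);

    if (fail) return error.SetupFailed;
    return buf; // success: the caller now owns buf and must free it
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const buf = try makeBuffer(allocator, false);
    defer allocator.free(buf);
    std.debug.print("allocated {d} bytes\n", .{buf.len});
}
```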
Return values §
All return values must be handled. To ignore a return value, assign it to _. For example:
The same goes for unused function arguments. It is a compile error if they go unused:
For loops §
For loops are used to iterate over arrays and slices.
It’s easy to loop over an array, e.g. a string character-by-character:
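A sketch of both the plain form and the form that also captures the index (the latter assumes Zig 0.11 or later). The countChar helper is hypothetical:

```zig
const std = @import("std");

fn countChar(text: []const u8, needle: u8) usize {
    var count: usize = 0;
    for (text) |c| {
        if (c == needle) count += 1;
    }
    return count;
}

pub fn main() void {
    // Iterate over the bytes of a string together with their index.
    for ("Zig", 0..) |c, i| {
        std.debug.print("{d}: {c}\n", .{ i, c });
    }
    std.debug.print("{d}\n", .{countChar("hello", 'l')}); // prints "2"
}
```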
However, if you want a free-form traditional for (int i = 0; i < 10; i++) style loop, you will need to use a while loop.
While loops §
As well as a traditional while (condition) loop, while loops can act like traditional for loops:
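For example, a sketch using the continue expression after the condition, with a hypothetical sumBelow function:

```zig
const std = @import("std");

fn sumBelow(limit: u32) u32 {
    var total: u32 = 0;
    var i: u32 = 0;
    // The expression after the colon runs at the end of every iteration,
    // like the i++ part of a C for loop.
    while (i < limit) : (i += 1) {
        total += i;
    }
    return total;
}

pub fn main() void {
    std.debug.print("{d}\n", .{sumBelow(10)}); // prints "45"
}
```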
Exiting §
Zig handles Ctrl-C by default (i.e. it will kill the program), though these comments (from March 2023) indicate that deferred statements do not run in this case. I’m not sure if that’s still true.
Support for catching signals on Linux should be possible with sigaction, though I have not tried this. Here is an example.
It used to be possible to exit with std.os.exit(64) to return 64 as the exit code. However, that has been removed since it is not cross-platform:
On Plan 9 for example, you return a string as the exit status instead of an integer.
However, today you can use std.process.exit(64) to exit (on posix systems).
Error handling §
In Zig, error-handling is a first-class citizen, but is a little different to languages like Python.
Error overview §
Errors in Zig are essentially a special enum. The enum is shared globally, and all errors are part of that same enum.
As the programmer, you can define error sets, which are useful for specifying which errors a function may return.
Since errors are just an enum, they cannot contain any additional metadata.
Errors are values §
Errors are returned using return, the same as regular values.
There are no exceptions in Zig; errors are values.
Use try xor catch to handle errors §
A common pattern in other languages is try <FUNC> catch <ERR>. That does not happen in Zig. You either:
- Use catch <DEFAULT> to return a default value upon error.
- Use catch |err| to do something with the error.
- Use try <FUNC> to pass the error up.
- Use if / else / switch for more flexible error handling.
Use catch <default> to return a default upon error §
If doSomething() returns an error, x will be set to 23:
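A sketch of this, where doSomething is a stand-in that always fails:

```zig
const std = @import("std");

// Stand-in for illustration: always returns an error.
fn doSomething() !u32 {
    return error.NotImplemented;
}

pub fn main() void {
    // The value after catch is used whenever the call returns an error.
    const x = doSomething() catch 23;
    std.debug.print("x = {d}\n", .{x}); // prints "x = 23"
}
```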
Use catch |err| to do something with the error §
Use try x to pass the error up §
try x is a shortcut for x catch |err| return err, and is commonly used where handling an error isn’t appropriate. Zig’s try and catch are unrelated to try-catch in other languages.
In other words, try x will return err from the current function if x evaluates to an error.
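A sketch of try propagating an error to the caller. Both parseDigit and sumDigits are hypothetical helpers:

```zig
const std = @import("std");

fn parseDigit(c: u8) !u8 {
    if (c < '0' or c > '9') return error.NotADigit;
    return c - '0';
}

fn sumDigits(text: []const u8) !u32 {
    var total: u32 = 0;
    for (text) |c| {
        // If parseDigit fails, try returns that error from sumDigits
        // straight away; otherwise it yields the plain u8.
        total += try parseDigit(c);
    }
    return total;
}

pub fn main() !void {
    std.debug.print("{d}\n", .{try sumDigits("123")}); // prints "6"
}
```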
Use if / else / switch for more flexible error handling §
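One way this can look is an if that captures the success value, with an else that captures the error and switches on it. The DigitError set and both functions below are invented for this sketch:

```zig
const std = @import("std");

const DigitError = error{ Empty, NotADigit };

// Hypothetical helper: value of the first character if it is a decimal digit.
fn firstDigit(text: []const u8) DigitError!u8 {
    if (text.len == 0) return error.Empty;
    const c = text[0];
    if (c < '0' or c > '9') return error.NotADigit;
    return c - '0';
}

fn describe(text: []const u8) []const u8 {
    if (firstDigit(text)) |digit| {
        return if (digit == 0) "zero" else "non-zero digit";
    } else |err| {
        // The switch must handle every error in DigitError.
        return switch (err) {
            error.Empty => "empty input",
            error.NotADigit => "not a digit",
        };
    }
}

pub fn main() void {
    std.debug.print("{s}\n", .{describe("7 apples")});
    std.debug.print("{s}\n", .{describe("")});
}
```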
Prepend ! to the return type if it can be an error §
note: function cannot return an error
If a function can return an error, then it must have ! before the return type, e.g. !void, !u32, !MyStruct.
Otherwise you will get an error like this:
Use try for calling functions that can return an error §
error: cannot format error union without a specifier
If a function can return an error, then the error must be handled at the call site, most simply with try, e.g. try myFunction();.
Otherwise you will get an error like this:
The root cause can be tricky to track down because the error does not reference the line that caused it3.
In my case, this was the fix:
Recursive errors §
Often you can rely upon the inferred error set, e.g. just prepend ! to the return type and Zig will figure out the error set.
Sometimes Zig may not be able to figure this out. You will get this error:
unable to resolve inferred error set
This can happen due to recursive functions.
See this Zig issue which aims to add support for inferring error sets automatically for recursive functions.
As a quick workaround, prepend anyerror! instead of just !. This allows the function to return any error.
Combining error sets §
Error sets can be combined:
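A sketch using the || operator on two hypothetical error sets (the error names are invented for this example):

```zig
const std = @import("std");

const ParseError = error{ UnexpectedToken, UnterminatedString };
const RuntimeError = error{ StackOverflow, UndefinedVariable };

// The combined set contains every error from both sets.
const InterpretError = ParseError || RuntimeError;

// Hypothetical function that can fail with errors from either set.
fn interpret(source: []const u8) InterpretError!void {
    if (source.len == 0) return error.UnexpectedToken; // from ParseError
    if (source.len > 1024) return error.StackOverflow; // from RuntimeError
}

pub fn main() void {
    interpret("") catch |err| {
        std.debug.print("failed: {s}\n", .{@errorName(err)}); // prints "failed: UnexpectedToken"
    };
}
```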
An example error §
Given this Zig code:
The following error arises:
The stack trace tells us that:
- On line 71, the function should return a vm.VM type, but we found that try vm.defineNative returns a @typeInfo(@typeInfo(@TypeOf(vm.VM.defineNative)).Fn.return_type.?).ErrorUnion.error_set type.
  - This long return type of chained @typeInfo and @TypeOf calls is fairly typical. The key part is the ErrorUnion.error_set at the end, which indicates this function returns an error set.
- On line 54, not much to note.
- On line 62, we see that init should return a VM.
  - Crucially, it cannot return an error because we did not specify !VM.
  - This aligns with line 71, which tells us try vm.defineNative is trying to return an error.
So despite the apparently complex error message, the solution is simply to change the return type for init to !VM.
Misc §
Following are various bits and bobs that didn’t fit in earlier categories.
Shadowing §
You can use @"type" syntax to permit using reserved keywords as variable names or arguments.
For example, ordinarily you cannot use type as an argument name:
The above will give the following error:
However, as the error message indicates, you can use @"type" to disambiguate:
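A sketch of the quoted form in use. The isNumber function and the "number" string are made up for illustration:

```zig
const std = @import("std");

// A bare `type` parameter name would be rejected, so it is written in the
// quoted identifier form here and wherever it is used.
fn isNumber(@"type": []const u8) bool {
    return std.mem.eql(u8, @"type", "number");
}

pub fn main() void {
    std.debug.print("{}\n", .{isNumber("number")}); // prints "true"
}
```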
Function name §
You cannot use error as a function name, e.g.
Reading from stdin §
Use readUntilDelimiterOrEof to read from stdin.
Many older examples specify readLineSlice but that has been removed.
Ternary operator §
There is no ternary operator (e.g. cond ? x : y), but you can do if (cond) x else y, which is essentially the same.
Bit-shifting §
If values are not known at comptime, bit-shifting with << or >> can produce confusing results, e.g.:
For more details see this issue.
The current best approach appears to be to use std.math.shl and std.math.shr instead of << and >>:
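A sketch with a runtime shift amount, using a hypothetical bitMask helper. With << on a u8, the shift amount would have to be a u3, whereas std.math.shl accepts an ordinary integer:

```zig
const std = @import("std");

// Returns a u8 with only the given bit set; bit is expected to be 0..7.
fn bitMask(bit: usize) u8 {
    return std.math.shl(u8, 1, bit);
}

fn highNibble(value: u8) u8 {
    const amount: usize = 4;
    return std.math.shr(u8, value, amount);
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ bitMask(3), highNibble(0xAB) }); // prints "8 10"
}
```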
Inline functions §
Functions defined inline inside another function are not directly supported, but a workaround makes them possible.
Ideally we define an inline function like this, but it does not work:
The workaround is to wrap the function within a struct:
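A sketch of the struct-wrapping trick, with made-up names (quadruple, double, call). The inner function is declared inside an anonymous struct and then pulled out by name:

```zig
const std = @import("std");

fn quadruple(x: u32) u32 {
    // Declare the helper inside an anonymous struct, then bind it to a
    // local constant so it can be called like a normal function.
    const double = struct {
        fn call(v: u32) u32 {
            return v * 2;
        }
    }.call;
    return double(double(x));
}

pub fn main() void {
    std.debug.print("{d}\n", .{quadruple(3)}); // prints "12"
}
```

Note that the inner function is not a closure: it cannot refer to x or any other local of the enclosing function.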
Thanks to Ralph Brorsen for sharing that trick.
Max values §
Use std.math.maxInt() to get the max value of an integer.
For example:
Is equivalent to:
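As a sketch of the two forms side by side, using u8 as an arbitrary example type:

```zig
const std = @import("std");

// Computed from the type, so it stays correct if the type changes.
const max_u8 = std.math.maxInt(u8);

// The same value written out by hand.
const max_u8_by_hand: u8 = 255;

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ max_u8, max_u8_by_hand }); // prints "255 255"
}
```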
Building §
Zig has a built-in build system. When you run zig init for a new project, Zig creates a boilerplate build.zig that builds both a library and an executable. The library uses root.zig while the executable uses main.zig. If your project is just for an executable, then you should modify build.zig to delete the section referencing root.zig (and vice versa if it’s just for a library).
Running zig build will perform the build. By default the build outputs get written to zig-out/, which is a directory adjacent to the build.zig. For example, the executable will get written to zig-out/bin/MyApp.
C interop §
A significant selling point for Zig is its ability to interop with C. Since I didn’t need to interop with C while working through Crafting Interpreters, I haven’t used any of these features yet, but thought it was significant enough to call out. See the Zig docs for more details.
Conclusion §
I’ve quickly become a fan of Zig. It feels like C without the sharp edges. It guides you down the right path, but does not prevent you from making mistakes. Because it is still in beta, it’s currently a bit of a moving target, so it doesn’t yet feel like a stable base for a long-term project (unless you are committed to updating your codebase as the language evolves). But I like Zig’s philosophy, and am keen to use it more in future.
If I had to summarise Zig, I’d call it “C without surprises”.
I’ll finish with the zen of Zig:
1. See this article by Andrew Kelley, Zig’s author, and this reddit comment for more details.
2. Via this issue I found this commit, where this information appears to have been removed from Zig.