A Zig string is a sequence of bytes.
This is a string literal:

    const s = "hello";

It has five visible characters:

    h e l l o

It also has a zero sentinel after the last byte, so the literal can be used where sentinel-terminated data is required.
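The sentinel can be observed directly. The following is a minimal sketch, not part of the original text: cLength is a hypothetical helper that takes a zero-terminated pointer, and std.mem.len is the standard library routine that scans for the sentinel.

```zig
const std = @import("std");

// Hypothetical helper: accepts C-style, zero-terminated data.
fn cLength(p: [*:0]const u8) usize {
    // std.mem.len counts bytes up to, but not including, the sentinel.
    return std.mem.len(p);
}

pub fn main() void {
    const s = "hello";
    // Index s.len is one past the last visible byte; it holds the sentinel.
    std.debug.print("{d}\n", .{s[s.len]});
    // The literal coerces to a sentinel-terminated pointer without a copy.
    std.debug.print("{d}\n", .{cLength(s)});
}
```

This should print 0 and then 5.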
The bytes can be printed one by one:

    const std = @import("std");

    pub fn main() void {
        const s = "hello";
        for (s) |b| {
            std.debug.print("{d}\n", .{b});
        }
    }

The output is:

    104
    101
    108
    108
    111

These are byte values. The letter h is byte 104. The letter e is byte 101.
To print them as characters, use {c}:

    const std = @import("std");

    pub fn main() void {
        const s = "hello";
        for (s) |b| {
            std.debug.print("{c}\n", .{b});
        }
    }

The output is:

    h
    e
    l
    l
    o

A string literal is not a special string object. Zig has no hidden string class. A string literal is a pointer to a constant sentinel-terminated array of bytes.
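The compiler can report this type itself. The short sketch below, added here for illustration, uses the builtins @TypeOf and @typeName on a literal.

```zig
const std = @import("std");

pub fn main() void {
    const s = "hello";
    // The type records the length (5) and the sentinel value (0).
    std.debug.print("{s}\n", .{@typeName(@TypeOf(s))});
}
```

This should print *const [5:0]u8: a pointer to a constant array of five bytes followed by a zero sentinel.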
In ordinary code, it is often used as a slice:

    const s: []const u8 = "hello";

The type []const u8 means a read-only slice of bytes.
This is the most common string type in Zig.

    const std = @import("std");

    fn printString(s: []const u8) void {
        std.debug.print("{s}\n", .{s});
    }

    pub fn main() void {
        printString("zig");
        printString("language");
    }

The output is:

    zig
    language

The {s} format prints a byte slice as a string.
Since strings are bytes, s.len gives the number of bytes, not the number of human characters.

    const std = @import("std");

    pub fn main() void {
        const s = "hello";
        std.debug.print("{d}\n", .{s.len});
    }

The output is:

    5

For plain ASCII text, the number of bytes and the number of characters are the same.
For UTF-8 text, they may differ.

    const std = @import("std");

    pub fn main() void {
        const s = "é";
        std.debug.print("{d}\n", .{s.len});
    }

The output is:

    2

The character é is encoded as two bytes in UTF-8.
This is important. Indexing a string gives a byte, not a character.

    const std = @import("std");

    pub fn main() void {
        const s = "é";
        std.debug.print("{d}\n", .{s[0]});
        std.debug.print("{d}\n", .{s[1]});
    }

The output is:

    195
    169

These are the two UTF-8 bytes for é.
For byte-oriented work, this is exactly what you want. Files, network protocols, and memory buffers are byte sequences.
For text-oriented work, you must decode UTF-8 deliberately.
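One way to decode deliberately is to walk the string code point by code point. The sketch below assumes std.unicode.Utf8View from the standard library. The helper name countCodepoints is hypothetical. Note that a code point is still not always what a reader perceives as one character.

```zig
const std = @import("std");

// Hypothetical helper: counts Unicode code points in s.
// Returns an error if s is not valid UTF-8.
fn countCodepoints(s: []const u8) !usize {
    const view = try std.unicode.Utf8View.init(s);
    var it = view.iterator();
    var n: usize = 0;
    // nextCodepoint returns null once the input is exhausted.
    while (it.nextCodepoint()) |_| {
        n += 1;
    }
    return n;
}

pub fn main() !void {
    const s = "héllo";
    std.debug.print("{d} bytes, {d} code points\n", .{ s.len, try countCodepoints(s) });
}
```

For "héllo" this should report 6 bytes and 5 code points, because é occupies two bytes.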
String literals may contain escapes:

    const newline = "first\nsecond";
    const tab = "a\tb";
    const quote = "he said \"zig\"";
    const slash = "c:\\tmp\\file.txt";

A string may also be written across several lines with backslash-backslash syntax:

    const text =
        \\first line
        \\second line
        \\third line
    ;

This produces the bytes for:

    first line
    second line
    third line

Multi-line strings are useful for help text, generated source, and test data.
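The byte-level effect of both notations can be checked with a small sketch. It is added for illustration and reuses the names tab and text from the fragments above, with a shorter two-line string. std.mem.eql compares two byte slices.

```zig
const std = @import("std");

const tab = "a\tb";
const text =
    \\first line
    \\second line
;

pub fn main() void {
    // An escape such as \t is stored as a single byte.
    std.debug.print("{d}\n", .{tab.len});
    // Lines are joined with '\n'; no newline follows the last line.
    std.debug.print("{}\n", .{std.mem.eql(u8, text, "first line\nsecond line")});
}
```

This should print 3 and then true.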
A mutable string needs mutable storage. A string literal is constant and must not be changed.

    var buf = [_]u8{ 'h', 'e', 'l', 'l', 'o' };
    buf[0] = 'H';

Now buf contains:

    Hello

To pass it to a function that expects a string slice, use slicing:

    const std = @import("std");

    pub fn main() void {
        var buf = [_]u8{ 'h', 'e', 'l', 'l', 'o' };
        buf[0] = 'H';
        std.debug.print("{s}\n", .{buf[0..]});
    }

The output is:

    Hello

Use []const u8 for read-only strings. Use []u8 for mutable byte buffers.
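The difference shows up in function signatures. The sketch below is an illustration only: upperInPlace is a hypothetical helper that takes []u8 because it writes to its argument, and it assumes std.ascii.toUpper, which leaves non-letter bytes unchanged.

```zig
const std = @import("std");

// Hypothetical helper: upper-cases ASCII letters in place.
// The parameter is []u8, not []const u8, because the bytes are modified.
fn upperInPlace(buf: []u8) void {
    for (buf) |*b| {
        b.* = std.ascii.toUpper(b.*);
    }
}

pub fn main() void {
    var buf = [_]u8{ 'h', 'e', 'l', 'l', 'o' };
    // &buf is a pointer to the array; it coerces to a mutable slice.
    upperInPlace(&buf);
    std.debug.print("{s}\n", .{&buf});
}
```

This should print HELLO. Passing a string literal to upperInPlace is expected to fail at compile time, since a literal is constant.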
Exercises.
Exercise 6-17. Write a program that prints the byte values of "zig".
Exercise 6-18. Write a function that takes []const u8 and prints each byte as a character.
Exercise 6-19. Print the .len of "hello" and "é".
Exercise 6-20. Create a mutable byte array containing hello, change it to Hello, and print it.