Floating-Point Types

Zig has floating-point types for numbers with fractional parts.

const x: f32 = 3.5;
const y: f64 = 3.5;

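f32 is a 32-bit IEEE 754 floating-point number, accurate to about 7 significant decimal digits.
f64 is a 64-bit IEEE 754 floating-point number, accurate to about 15 to 16 significant decimal digits.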

Use f64 for ordinary calculations unless there is a reason to use less space.

const pi: f64 = 3.141592653589793;
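To make the difference between the two types concrete, here is a small sketch that narrows the same constant to f32 and widens it back. The helper name roundTripThroughF32 is made up for this example; @floatCast is the builtin that narrows a float.

```zig
const std = @import("std");

// Round an f64 to the nearest f32, then widen it again.
// roundTripThroughF32 is an illustrative name, not a std function.
fn roundTripThroughF32(x: f64) f64 {
    const narrow: f32 = @floatCast(x);
    return narrow; // f32 widens to f64 implicitly and without loss
}

pub fn main() void {
    const pi: f64 = 3.141592653589793;
    const back = roundTripThroughF32(pi);

    // The f32 copy keeps only the leading digits of pi.
    std.debug.print("f64: {d}\n", .{pi});
    std.debug.print("f32: {d}\n", .{back});
    std.debug.print("same value: {}\n", .{back == pi});
}
```

Values that need only a few bits, such as 0.5, survive the round trip unchanged.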

A floating-point literal contains a decimal point or an exponent.

const a = 1.0;
const b = 1.5;
const c = 1e6;
const d = 2.5e-3;

The exponent form means “times a power of ten.”

1e6    = 1000000
2.5e-3 = 0.0025

Arithmetic works as expected.

const std = @import("std");

pub fn main() void {
    const width: f64 = 12.5;
    const height: f64 = 4.0;

    const area = width * height;

    std.debug.print("{d}\n", .{area});
}

The output is:

50

Floating-point numbers are approximations. They cannot represent every decimal number exactly.

const a: f64 = 0.1;
const b: f64 = 0.2;
const c = a + b;

The value of c is close to 0.3 but not exactly 0.3. In f64 the sum comes out as 0.30000000000000004.

Do not compare floating-point results for exact equality unless the values are known to be exact.

if (c == 0.3) {
    // usually the wrong test
}
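A short program shows why the exact test misleads. This is a sketch; the helper name sum is chosen for the example.

```zig
const std = @import("std");

// Adds two f64 values at run time. The name sum is illustrative.
fn sum(a: f64, b: f64) f64 {
    return a + b;
}

pub fn main() void {
    const c = sum(0.1, 0.2);
    const target: f64 = 0.3;

    std.debug.print("c          = {d}\n", .{c});
    std.debug.print("c == 0.3   : {}\n", .{c == target}); // false
    std.debug.print("difference = {e}\n", .{c - target});
}
```

The difference is tiny, but it is enough to make == fail.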

Use a tolerance.

const std = @import("std");

fn nearlyEqual(a: f64, b: f64, eps: f64) bool {
    return @abs(a - b) <= eps;
}

pub fn main() void {
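    // Typed operands make the addition happen in f64.
    // Untyped literals would be added at higher precision as comptime_float first.
    const a: f64 = 0.1;
    const b: f64 = 0.2;
    const x = a + b;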

    if (nearlyEqual(x, 0.3, 0.000001)) {
        std.debug.print("close enough\n", .{});
    }
}
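A fixed absolute tolerance suits values near 1 in magnitude but is too strict for large values and too loose for tiny ones. The standard library has std.math.approxEqAbs and std.math.approxEqRel for the two cases. The sketch below combines them; the helper closeEnough and both tolerance values are assumptions picked for illustration, not recommendations.

```zig
const std = @import("std");

// Hypothetical helper: accept either a small absolute difference
// or a small difference relative to the size of the operands.
fn closeEnough(a: f64, b: f64) bool {
    return std.math.approxEqAbs(f64, a, b, 1e-12) or
        std.math.approxEqRel(f64, a, b, 1e-9);
}

pub fn main() void {
    const big: f64 = 1.0e12;

    // Differs by about 0.001, which is negligible next to 1e12.
    std.debug.print("{}\n", .{closeEnough(big + 0.001, big)});
    // Differs by about 0.1, which is large next to 1.0.
    std.debug.print("{}\n", .{closeEnough(1.0, 1.1)});
}
```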

Integer and floating-point values do not mix silently.

const a: i32 = 10;
const b: f64 = 2.5;

const c = a + b; // compile error: i32 and f64 are incompatible types

Convert explicitly.

const a: i32 = 10;
const b: f64 = 2.5;

const c = @as(f64, @floatFromInt(a)) + b;
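The same conversion appears whenever integer data feeds a floating-point result. As a sketch, here is a mean over a slice of integers. The function name mean is hypothetical, and it assumes the caller passes a non-empty slice.

```zig
const std = @import("std");

// Hypothetical helper: the mean of a slice of integers, as an f64.
// Assumes values is not empty.
fn mean(values: []const i32) f64 {
    var total: i64 = 0; // wider accumulator, so adding i32 values is unlikely to overflow
    for (values) |v| total += v;

    // Giving each constant a type lets @floatFromInt infer its result type.
    const sum_f: f64 = @floatFromInt(total);
    const count_f: f64 = @floatFromInt(values.len);
    return sum_f / count_f;
}

pub fn main() void {
    const scores = [_]i32{ 7, 8, 10 };
    std.debug.print("{d}\n", .{mean(&scores)});
}
```

Declaring the destination as f64 is an alternative to wrapping the builtin in @as.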

To convert a float to an integer, use an explicit conversion.

const x: f64 = 3.75;
const n: i32 = @intFromFloat(x);

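This drops the fractional part, rounding toward zero. Here n is 3; for -3.75 the result would be -3.

This conversion is only valid when the truncated value fits in the destination integer type. Converting NaN, infinity, or an out-of-range value is safety-checked undefined behavior: Debug and ReleaseSafe builds panic, and the other build modes give no guarantee.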
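When the input is not known to be in range, check before converting. The following is a minimal sketch: toI32Checked is a hypothetical helper, and its range test is slightly conservative, since it rejects a few values just past the i32 limits whose truncation would still fit.

```zig
const std = @import("std");

// Hypothetical helper: returns null instead of performing an invalid
// conversion when x is NaN, infinite, or outside the range of i32.
fn toI32Checked(x: f64) ?i32 {
    if (std.math.isNan(x)) return null;

    const lo: f64 = @floatFromInt(@as(i32, std.math.minInt(i32)));
    const hi: f64 = @floatFromInt(@as(i32, std.math.maxInt(i32)));
    if (x < lo or x > hi) return null; // also rejects +inf and -inf

    return @as(i32, @intFromFloat(x)); // truncates toward zero
}

pub fn main() void {
    const inputs = [_]f64{ 3.75, -3.75, 1.0e12 };
    for (inputs) |x| {
        if (toI32Checked(x)) |n| {
            std.debug.print("{d} -> {d}\n", .{ x, n });
        } else {
            std.debug.print("{d} -> does not fit in i32\n", .{x});
        }
    }
}
```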

Floating-point division produces a floating-point result.

const x: f64 = 7.0 / 2.0;

The value is 3.5.

Integer division is different.

const y = @divTrunc(7, 2);

The value is 3.

Use the operation that matches the data.
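The choice matters most with negative operands, where the integer builtins disagree with each other. The sketch below compares @divTrunc, @divFloor, and floating-point division on the same inputs. The helper name ratio is made up for the example.

```zig
const std = @import("std");

// Hypothetical helper: divide two integers but keep the fractional part.
fn ratio(numerator: i32, denominator: i32) f64 {
    const n: f64 = @floatFromInt(numerator);
    const d: f64 = @floatFromInt(denominator);
    return n / d;
}

pub fn main() void {
    const total: i32 = -7;
    const count: i32 = 2;

    std.debug.print("@divTrunc: {d}\n", .{@divTrunc(total, count)}); // -3, rounds toward zero
    std.debug.print("@divFloor: {d}\n", .{@divFloor(total, count)}); // -4, rounds toward negative infinity
    std.debug.print("as floats: {d}\n", .{ratio(total, count)}); // -3.5
}
```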

Floating-point values have special cases.

const inf = std.math.inf(f64);
const nan = std.math.nan(f64);

Positive infinity compares greater than every finite f64. Negating it gives negative infinity, which compares less than every finite f64.

NaN means “not a number.” It is used for invalid floating-point results.

NaN is not equal to itself.

const x = std.math.nan(f64);

if (x == x) {
    // never runs: NaN == NaN is false
}

Use library functions when you need to test for these values.

if (std.math.isNan(x)) {
    std.debug.print("nan\n", .{});
}
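As a sketch of how these tests combine, the program below sorts a few values into categories with std.math.isNan and std.math.isInf. The helper classify is hypothetical. Subtracting infinity from itself is used here as one way to produce a NaN.

```zig
const std = @import("std");

// Hypothetical helper: label a value as "nan", "inf", or "finite".
fn classify(x: f64) []const u8 {
    if (std.math.isNan(x)) return "nan";
    if (std.math.isInf(x)) return "inf"; // true for both +inf and -inf
    return "finite";
}

pub fn main() void {
    const inf = std.math.inf(f64);
    const values = [_]f64{ 1.5, inf, -inf, inf - inf };

    for (values) |v| {
        std.debug.print("{s}\n", .{classify(v)});
    }
}
```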

Zig also provides f16, f80, and f128, which may be emulated in software on targets without hardware support, and comptime_float, the type of untyped floating-point literals evaluated at compile time. Most programs begin with f32 and f64.
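For reference, this sketch lists the sized float types f16, f32, f64, f80, and f128 with their bit widths, and shows that an untyped literal has type comptime_float. It uses only the builtins @bitSizeOf, @typeName, and @TypeOf.

```zig
const std = @import("std");

pub fn main() void {
    // inline for unrolls the loop at compile time, so T can be a type.
    inline for (.{ f16, f32, f64, f80, f128 }) |T| {
        std.debug.print("{s}: {d} bits\n", .{ @typeName(T), @bitSizeOf(T) });
    }

    // A literal with no type annotation is a comptime_float.
    const literal = 1.5;
    std.debug.print("{s}\n", .{@typeName(@TypeOf(literal))});
}
```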

Use f32 when storage size or external format requires it.

const sample: f32 = 0.5;

Use f64 when accuracy matters more than size.

const distance: f64 = 12345.6789;
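Accuracy matters most when rounding errors pile up. As a rough illustration, the sketch below adds 0.1 one million times in each type. The exact answer is 100000, and the f32 total drifts much further from it than the f64 total. The generic helper accumulate is hypothetical.

```zig
const std = @import("std");

// Hypothetical helper: add step to a running total n times,
// doing all arithmetic in the float type T.
fn accumulate(comptime T: type, step: T, n: usize) T {
    var total: T = 0;
    var i: usize = 0;
    while (i < n) : (i += 1) total += step;
    return total;
}

pub fn main() void {
    const n = 1_000_000;
    const s32 = accumulate(f32, 0.1, n);
    const s64 = accumulate(f64, 0.1, n);

    std.debug.print("f32: {d}\n", .{s32});
    std.debug.print("f64: {d}\n", .{s64});
}
```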

Exercises:

  1. Declare two f64 values and print their sum.

  2. Write a function average that takes two f64 values and returns their average.

  3. Convert an i32 to f64 and multiply it by 2.5.

  4. Convert 3.75 to an integer and print the result.

  5. Test whether 0.1 + 0.2 is close to 0.3 using a tolerance.