Code of the Day

Primitive and composite types

The atoms of a type system — how primitives map to machine words, and how composites are built from them and laid out in memory.

Data Types & Type Systems · 5 min read
By the end of this lesson you will be able to:
  • Distinguish primitive types from composite types and explain how they are laid out in memory
  • Explain why field ordering in a struct can affect performance

A type is a contract: it tells the compiler (or the runtime) what operations are legal on a value and how many bytes to reserve for it. Get the contract wrong and you get either a compile-time error or — worse — silent memory corruption at runtime. The type system is the first line of defence in any language, and understanding how it works shapes every design decision you make.

Primitives

Primitive types are the atoms of a type system. They map directly to what the CPU can manipulate: integers, floating-point numbers, booleans, and in many languages characters. Their size is fixed and machine-level:

Type           Typical size    Range / notes
------------   ------------    --------------------------------
int8           1 byte          −128 to 127
uint32         4 bytes         0 to 4 294 967 295
float64        8 bytes         ~15 decimal digits of precision
bool           1 byte          true / false
char (UTF-8)   1–4 bytes       one Unicode code point; width depends on the value encoded

These types correspond directly to registers and instructions on the CPU. An int32 addition compiles to a single ADD instruction; a float64 multiplication uses the floating-point unit. Primitives have no hidden overhead.

Composite types

Composite types are built from primitives (and from other composites). The most fundamental are:

  • Arrays / slices — a contiguous block of same-typed elements. Random access is O(1) because the offset of element i is always base_address + i × element_size.
  • Structs / records — a fixed-layout grouping of named fields, potentially of different types. In C, fields are laid out in declaration order, with alignment padding inserted between them where needed; some languages leave the compiler free to reorder fields.
  • Pointers — a value that stores a memory address. They let you build linked structures (trees, linked lists) and share data without copying it.
  • Sum types / tagged unions — a value that is one of several possible types, with a tag (often a single byte) that records which one. Rust's enum and Haskell's algebraic data types work this way.

Why layout matters

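Most current CPUs move memory between RAM and cache in lines of 64 bytes. A struct that keeps its frequently accessed (hot) fields together and loses little space to padding spans fewer cache lines, and code that walks many such structs can run measurably faster than it would with a layout that interleaves large and small fields.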

// Fields interleaved by size: wastes space on alignment padding
// (sizes below assume a 4-byte int aligned to 4 bytes)
struct Bad { char a; int b; char c; int d; };
// Actual layout: a(1) + padding(3) + b(4) + c(1) + padding(3) + d(4) = 16 bytes

// Fields grouped from largest to smallest: packs tightly
struct Good { int b; int d; char a; char c; };
// Actual layout: b(4) + d(4) + a(1) + c(1) + padding(2) = 12 bytes

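The second struct is 25% smaller (12 bytes instead of 16), so about a third more records fit in the same cache space: a 64-byte line holds four 16-byte records but just over five 12-byte ones. At scale this is a measurable difference in throughput.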

Knowledge check

  1. Why is random access to an array element O(1)?
  2. True or false: the order of fields in a struct can affect how much memory it occupies due to alignment padding.