Language Tour
This guide walks through every wire format feature of wirespec, in the order you would typically reach for them. Each section explains what a feature does, shows a real example, and describes the corresponding generated C.
If you are new to wirespec, read Getting Started first to set up the compiler. State machines and modules are covered in their own guides.
1. Primitive Types
wirespec has the standard set of unsigned and signed integer types. Each type occupies its natural width on the wire: u8 is one byte, u16 is two bytes, and so on. A multi-byte type without an endianness suffix uses the module's @endian setting, or big-endian if the module does not specify one.
packet Example {
a: u8, # 1 byte, unsigned
b: u16, # 2 bytes, unsigned
c: u32, # 4 bytes, unsigned
d: u64, # 8 bytes, unsigned
e: i8, # 1 byte, signed
f: i16, # 2 bytes, signed
g: i32, # 4 bytes, signed
h: i64, # 8 bytes, signed
}
For precise endianness control, use the suffixed forms:
packet MixedEndian {
big_val: u32be, # 4 bytes, big-endian
little_val: u32le, # 4 bytes, little-endian
also_big: u16be,
also_le: u16le,
}
The u24 / u24be / u24le types encode a 24-bit (3-byte) unsigned integer. This is common in TLS, where message lengths are 3 bytes:
# From examples/tls/tls13.wspec
packet CertificateEntry {
cert_data_length: u24,
cert_data: bytes[cert_data_length],
extensions_length: u16,
extensions: bytes[extensions_length],
}
In C: u8 → uint8_t, u16 → uint16_t, u32 → uint32_t, u64 → uint64_t. Signed types map to int8_t, int16_t, etc. u24 maps to uint32_t (the extra byte is masked off). The parse function reads the correct number of bytes and applies any necessary byte-swap.
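To make the u24 mapping concrete, here is a minimal sketch of how three wire bytes could be widened into a uint32_t. The helper names read_u24be and read_u24le are hypothetical and chosen for illustration; they are not the names wirespec generates.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the wirespec runtime. */

/* Assemble a 24-bit big-endian value; the top byte of the result stays zero. */
uint32_t read_u24be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

/* The same three bytes interpreted as little-endian. */
uint32_t read_u24le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}
```

Assembling the value byte by byte avoids unaligned loads and works the same way on any host byte order.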
2. Byte Types
wirespec has four forms for byte arrays, covering fixed-length, length-prefixed, and scope-consuming patterns.
Fixed-length bytes
bytes[N] reads or writes exactly N bytes. The bytes are not copied: wirespec gives you a zero-copy view into the input buffer.
# From examples/net/ethernet.wspec
packet EthernetFrame {
dst_mac: bytes[6],
src_mac: bytes[6],
ether_type: u16,
payload: bytes[remaining],
}
# From examples/tls/tls13.wspec: 32-byte TLS random nonce
packet ClientHello {
legacy_version: u16,
random: bytes[32],
...
}
Length-prefixed bytes
bytes[length: EXPR] reads a byte count from EXPR (a prior field or expression) and then reads that many bytes.
# From examples/mqtt/mqtt.wspec
packet MqttString {
length: u16,
data: bytes[length],
}
The expression can be arithmetic over prior fields:
# From examples/net/udp.wspec
packet UdpDatagram {
src_port: u16,
dst_port: u16,
length: u16,
checksum: u16,
require length >= 8,
data: bytes[length: length - 8], # payload = total - 8-byte header
}
Remaining bytes
bytes[remaining] consumes every byte left in the current scope. It must be the last wire field in its scope.
# From examples/ble/att.wspec
frame AttPdu = match opcode: u8 {
0x0b => ReadRsp { value: bytes[remaining] },
0x12 => WriteReq { handle: AttHandle, value: bytes[remaining] },
...
}
Optional-length bytes
bytes[length_or_remaining: EXPR] handles the case where a length field is optional. If EXPR is present (Some), that many bytes are read; if absent (null), all remaining bytes in the scope are consumed. EXPR must have type Option[T] where T is an integer type.
# From examples/quic/frames.wspec — Stream frame
0x08..=0x0f => Stream {
stream_id: VarInt,
offset_raw: if frame_type & 0x04 { VarInt },
length_raw: if frame_type & 0x02 { VarInt },
data: bytes[length_or_remaining: length_raw],
...
},
In C: All byte variants generate wirespec_bytes_t, a zero-copy struct containing a const uint8_t *ptr and size_t len. No memory is allocated; the pointer points into the original input buffer.
typedef struct { const uint8_t *ptr; size_t len; } wirespec_bytes_t;
3. Endianness
Every multi-byte integer field has a well-defined endianness. wirespec lets you set a module-level default, then override it per field.
Module-level default
Use @endian big or @endian little at the top of your .wspec file:
# From examples/ble/att.wspec — BLE uses little-endian
@endian little
module ble.att
type AttHandle = u16le
type Uuid16 = u16le
# QUIC and most network protocols use big-endian
@endian big
module quic.frames
Per-field override
A field using an explicit-endian type (u16le, u32be, etc.) always uses that endianness, regardless of the module default. This lets you mix endianness within a single struct when the wire format demands it.
packet MixedEndian {
net_field: u32, # big-endian (from @endian big)
le_field: u32le, # always little-endian
another_net: u16, # big-endian again
}
The priority rule is: explicit per-field type always beats module default.
4. Constants, Enums, and Flags
Constants
const defines a named compile-time value. The type must be a primitive integer type.
# From examples/quic/frames.wspec
const MAX_CID_LENGTH: u8 = 20
packet LengthPrefixedCid {
length: u8,
value: bytes[length],
require length <= MAX_CID_LENGTH,
}
Enums
enum associates names with integer values. The underlying type (after :) is the wire type used for the tag field.
# From examples/tls/tls13.wspec
enum ContentType: u8 {
ChangeCipherSpec = 20,
Alert = 21,
Handshake = 22,
ApplicationData = 23,
}
enum HandshakeType: u8 {
ClientHello = 1,
ServerHello = 2,
Certificate = 11,
CertificateVerify = 15,
Finished = 20,
}
Enum types can be used directly as match tags in capsules:
# From examples/tls/tls13.wspec
capsule TlsRecord {
content_type: ContentType,
legacy_version: u16,
length: u16,
payload: match content_type within length {
22 => Handshake { data: bytes[remaining] },
21 => Alert { level: u8, description: u8 },
...
},
}
Flags (bit masks)
flags is like an enum, but the values are intended as bitmasks that can be OR'd together. The underlying type must be an unsigned integer.
flags PacketFlags: u8 {
KeyPhase = 0x04,
SpinBit = 0x20,
FixedBit = 0x40,
}
In C: Both enum and flags generate a C enum typedef. Constants become #define macros.
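As an illustration of that mapping, the sketch below shows what the PacketFlags and MAX_CID_LENGTH declarations might look like on the C side. The identifier naming scheme (PACKET_FLAGS_*, packet_flags_t) and the packet_flags_has helper are assumptions made for this example; the real generated names may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed naming scheme; the actual generated identifiers may differ. */
#define MAX_CID_LENGTH 20

typedef enum {
    PACKET_FLAGS_KEY_PHASE = 0x04,
    PACKET_FLAGS_SPIN_BIT  = 0x20,
    PACKET_FLAGS_FIXED_BIT = 0x40
} packet_flags_t;

/* Hypothetical helper: true when every bit in `mask` is set in the raw wire byte. */
bool packet_flags_has(uint8_t raw, packet_flags_t mask)
{
    return (raw & (uint8_t)mask) == (uint8_t)mask;
}
```

Because flag values are plain bitmasks, several of them can be combined with `|` and tested in one call.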
5. Packets (Structs)
A packet defines a fixed structure: a sequence of named fields parsed in order. Packets are the basic building block for protocol headers.
# From examples/net/udp.wspec
module net.udp
@endian big
packet UdpDatagram {
src_port: u16,
dst_port: u16,
length: u16,
checksum: u16,
require length >= 8,
data: bytes[length: length - 8],
}
# From examples/net/tcp.wspec
packet TcpSegment {
src_port: u16,
dst_port: u16,
seq_num: u32,
ack_num: u32,
data_offset: bits[4],
reserved: bits[4],
cwr: bit, ece: bit, urg: bit, ack: bit,
psh: bit, rst: bit, syn: bit, fin: bit,
window: u16,
checksum: u16,
urgent_pointer: u16,
require data_offset >= 5,
options: bytes[length: data_offset * 4 - 20],
}
In C: A packet becomes a typedef struct. The compiler generates three functions:
wirespec_result_t udp_datagram_parse(
const uint8_t *buf, size_t len,
udp_datagram_t *out, size_t *consumed);
wirespec_result_t udp_datagram_serialize(
const udp_datagram_t *in,
uint8_t *buf, size_t cap, size_t *written);
size_t udp_datagram_serialized_len(const udp_datagram_t *in);
Structs contain only stack-allocated values; wirespec never calls malloc.
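To give a feel for what sits behind the parse signature, here is a hand-written sketch of a UDP parser with the same shape. It is not the compiler's actual output: the struct layout, the WIRESPEC_OK name, the numeric values of the result codes, and the rd_u16be helper are all assumptions defined locally so the example stands on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the runtime types; WIRESPEC_OK and the numeric values are assumed. */
typedef enum {
    WIRESPEC_OK = 0,
    WIRESPEC_ERR_SHORT_BUFFER,
    WIRESPEC_ERR_CONSTRAINT
} wirespec_result_t;

typedef struct { const uint8_t *ptr; size_t len; } wirespec_bytes_t;

typedef struct {
    uint16_t src_port, dst_port, length, checksum;
    wirespec_bytes_t data;
} udp_datagram_t;

/* Hypothetical helper: read a big-endian u16. */
static uint16_t rd_u16be(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }

wirespec_result_t udp_datagram_parse(const uint8_t *buf, size_t len,
                                     udp_datagram_t *out, size_t *consumed)
{
    if (len < 8) return WIRESPEC_ERR_SHORT_BUFFER;        /* fixed 8-byte header */
    out->src_port = rd_u16be(buf);
    out->dst_port = rd_u16be(buf + 2);
    out->length   = rd_u16be(buf + 4);
    out->checksum = rd_u16be(buf + 6);
    if (out->length < 8) return WIRESPEC_ERR_CONSTRAINT;  /* require length >= 8 */
    size_t payload = (size_t)out->length - 8;             /* bytes[length: length - 8] */
    if (len - 8 < payload) return WIRESPEC_ERR_SHORT_BUFFER;
    out->data.ptr = buf + 8;                              /* zero-copy view into buf */
    out->data.len = payload;
    *consumed = 8 + payload;
    return WIRESPEC_OK;
}
```

Note that the require check runs before the subtraction, so length - 8 can never underflow.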
6. Frames (Tagged Unions)
A frame is a tagged union: the first field is the discriminator, and the body depends on its value. Use frame when you have a single opcode/type byte that selects among many possible layouts.
# From examples/quic/frames.wspec (abbreviated)
frame QuicFrame = match frame_type: VarInt {
0x00 => Padding {},
0x01 => Ping {},
0x06 => Crypto {
offset: VarInt,
length: VarInt,
data: bytes[length],
},
0x08..=0x0f => Stream {
stream_id: VarInt,
offset_raw: if frame_type & 0x04 { VarInt },
length_raw: if frame_type & 0x02 { VarInt },
data: bytes[length_or_remaining: length_raw],
let offset: u64 = offset_raw ?? 0,
let fin: bool = (frame_type & 0x01) != 0,
},
_ => Unknown { data: bytes[remaining] },
}
Pattern ranges
Patterns support inclusive ranges with ..=:
- 0x02..=0x03 matches tag values 2 or 3
- 0x08..=0x0f matches values 8 through 15 inclusive
- _ is the catch-all wildcard (required for exhaustiveness)
Specific values always beat ranges; ranges always beat _.
Accessing the tag in branch fields
The tag field (frame_type above) is visible inside every branch. This makes it possible to use its exact value for conditional sub-fields, as in if frame_type == 0x03 { EcnCounts }.
In C: A frame becomes a C union wrapped in a struct with a tag enum field. Each variant is a separate nested struct. The parse function reads the tag, dispatches to the right branch, and fills in the variant fields.
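The struct-plus-union shape and the match priority can be sketched as follows. This is a simplified illustration, not generated code: the type and function names are hypothetical, only four variants are modeled, and the branch bodies store just enough to show how the tag remains visible inside a branch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only a subset of QuicFrame variants is modeled. */
typedef enum { FRAME_PADDING, FRAME_PING, FRAME_STREAM, FRAME_UNKNOWN } frame_tag_t;

typedef struct {
    frame_tag_t tag;          /* says which union member is valid */
    uint64_t frame_type;      /* raw tag value, visible to every branch */
    union {
        struct { bool fin; } stream;
        struct { const uint8_t *ptr; size_t len; } unknown;
    } u;
} frame_t;

/* Exact values are tested first, then ranges, then the wildcard. */
frame_tag_t frame_classify(uint64_t frame_type)
{
    if (frame_type == 0x00) return FRAME_PADDING;
    if (frame_type == 0x01) return FRAME_PING;
    if (frame_type >= 0x08 && frame_type <= 0x0f) return FRAME_STREAM;
    return FRAME_UNKNOWN;     /* the `_` catch-all */
}

/* Dispatch on the tag and fill in the selected variant. */
void frame_fill(frame_t *out, uint64_t frame_type, const uint8_t *rest, size_t rest_len)
{
    out->frame_type = frame_type;
    out->tag = frame_classify(frame_type);
    if (out->tag == FRAME_STREAM) {
        out->u.stream.fin = (frame_type & 0x01) != 0;   /* let fin = ... */
    } else if (out->tag == FRAME_UNKNOWN) {
        out->u.unknown.ptr = rest;                      /* bytes[remaining] */
        out->u.unknown.len = rest_len;
    }
}
```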
7. Capsules (TLV with within)
A capsule is a Type-Length-Value container. It has a fixed header (including a length field), then a payload that is parsed within a sub-scope bounded by that length. This ensures that a malformed or unrecognized payload cannot read past its allocated bytes.
# From examples/mqtt/mqtt.wspec
capsule MqttPacket {
type_and_flags: u8,
remaining_length: MqttLength,
payload: match (type_and_flags >> 4) within remaining_length {
1 => Connect {
protocol_name: MqttString,
protocol_level: u8,
connect_flags: u8,
keep_alive: u16,
client_id: MqttString,
will_topic: if connect_flags & 0x04 { MqttString },
will_message: if connect_flags & 0x04 { MqttBytes },
username: if connect_flags & 0x80 { MqttString },
password: if connect_flags & 0x40 { MqttBytes },
},
3 => Publish {
topic: MqttString,
let qos: u8 = (type_and_flags & 0x06) >> 1,
packet_id: if qos > 0 { u16 },
payload: bytes[remaining],
},
_ => Unknown { data: bytes[remaining] },
},
}
The within remaining_length clause creates a sub-cursor capped at remaining_length bytes. Inside the sub-scope:
- Under-reading (payload did not consume all bytes) → WIRESPEC_ERR_TRAILING_DATA
- Over-reading (payload tried to read past the length) → WIRESPEC_ERR_SHORT_BUFFER
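The two failure modes above fall out naturally from a bounded cursor. The following sketch shows one way such a sub-cursor could work; cursor_t and its three functions are hypothetical names invented for this example, and the numeric values of the result codes are assumed.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    WIRESPEC_OK = 0,
    WIRESPEC_ERR_SHORT_BUFFER,
    WIRESPEC_ERR_TRAILING_DATA
} wirespec_result_t;   /* numeric values assumed */

/* Hypothetical cursor type; not the runtime's actual definition. */
typedef struct { const uint8_t *buf; size_t len; size_t pos; } cursor_t;

/* Carve a child cursor of exactly `n` bytes out of the parent and advance the parent past it. */
wirespec_result_t cursor_sub(cursor_t *parent, size_t n, cursor_t *child)
{
    if (parent->len - parent->pos < n) return WIRESPEC_ERR_SHORT_BUFFER;
    child->buf = parent->buf + parent->pos;
    child->len = n;
    child->pos = 0;
    parent->pos += n;
    return WIRESPEC_OK;
}

/* Read one byte; fails once the (sub-)scope is exhausted (over-reading). */
wirespec_result_t cursor_u8(cursor_t *c, uint8_t *out)
{
    if (c->pos >= c->len) return WIRESPEC_ERR_SHORT_BUFFER;
    *out = c->buf[c->pos++];
    return WIRESPEC_OK;
}

/* Called after the payload parser returns: leftover bytes mean under-reading. */
wirespec_result_t cursor_finish(const cursor_t *c)
{
    return c->pos == c->len ? WIRESPEC_OK : WIRESPEC_ERR_TRAILING_DATA;
}
```

Because the parent is advanced by the full length up front, a bad payload cannot shift where the next record starts.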
Expression tags
The tag in a capsule (or frame) can be any expression over header fields, not just a plain field name:
payload: match (type_and_flags >> 4) within remaining_length { ... }
Here (type_and_flags >> 4) extracts the packet type from the high nibble of the flags byte. All fields referenced in the tag expression must be declared before the within clause.
In C: A capsule generates the same struct/union pattern as a frame, plus a sub-cursor that enforces the length boundary.
TLS Extension example
# From examples/tls/tls13.wspec
capsule Extension {
extension_type: u16,
length: u16,
payload: match extension_type within length {
0x002b => SupportedVersions { data: bytes[remaining] },
0x000d => SignatureAlgorithms { data: bytes[remaining] },
0x0033 => KeyShare { data: bytes[remaining] },
0x0000 => ServerName { data: bytes[remaining] },
_ => Unknown { data: bytes[remaining] },
},
}
8. Expression Language
Expressions appear in field conditions, derived fields, constraints, and array sizes. The full precedence table, from lowest to highest binding:
| Level | Operators | Notes |
|---|---|---|
| coalesce | ?? | Option[T] ?? T — unwrap with default |
| logical or | or | |
| logical and | and | |
| comparison | == != < <= > >= | |
| bitwise or | \| | |
| bitwise xor | ^ | |
| bitwise and | & | Binds tighter than comparison |
| shift | << >> | |
| additive | + - | |
| multiplicative | * / % | |
| unary | ! - | |
| postfix | .field, [idx], [start..end] | |
Important: In wirespec, bitwise operators bind tighter than comparison operators — the opposite of C. This means a & mask == 0 is always (a & mask) == 0, which matches programmer intent. In C the same expression is a & (mask == 0), a notorious bug source.
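The difference is easy to see by writing both readings out in C with explicit parentheses. The two function names below are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* How C reads `a & mask == 0`: the comparison is evaluated first. */
uint32_t c_parse_of_expr(uint32_t a, uint32_t mask)
{
    return a & (uint32_t)(mask == 0);
}

/* How wirespec reads the same text: the mask is applied first. */
bool wirespec_parse_of_expr(uint32_t a, uint32_t mask)
{
    return (a & mask) == 0;
}
```

With a = 0x04 and mask = 0x01, the wirespec reading is true (the bit is clear), while the C reading evaluates to 0 because mask == 0 is false.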
The ?? coalesce operator
?? unwraps an optional value with a default:
0x08..=0x0f => Stream {
offset_raw: if frame_type & 0x04 { VarInt },
...
let offset: u64 = offset_raw ?? 0,
}
If offset_raw is present (the 0x04 bit is set), offset_raw ?? 0 returns its value. If absent, it returns 0.
Arithmetic on field values
Expressions over prior fields can be used wherever a number is expected:
data: bytes[length: length - 8], # subtract header overhead
cipher_suites: [u16; cipher_suites_length / 2], # count in elements, not bytes
options: bytes[length: data_offset * 4 - 20], # multiply and subtract
9. Optional Fields
An optional field uses if COND { T }. The field is present on the wire only when COND is truthy. wirespec tracks its presence as Option[T] in the type system.
# From examples/quic/frames.wspec — ECN counts only in ACK type 0x03
0x02..=0x03 => Ack {
largest_ack: VarInt,
ack_delay: VarInt,
ack_range_count: VarInt,
first_ack_range: VarInt,
ack_ranges: [AckRange; ack_range_count],
ecn_counts: if frame_type == 0x03 { EcnCounts },
},
# From examples/mqtt/mqtt.wspec: QoS-dependent packet ID
3 => Publish {
topic: MqttString,
let qos: u8 = (type_and_flags & 0x06) >> 1,
packet_id: if qos > 0 { u16 },
payload: bytes[remaining],
},
The condition can test any prior field, including bit-mask tests:
# Present when a specific flag bit is set
offset_raw: if frame_type & 0x04 { VarInt },
In C: An optional field becomes a presence flag and the value:
bool has_ecn_counts;
ecn_counts_t ecn_counts;
To use an optional field's value in an expression, you must either guard it with the same condition or use ?? to provide a default.
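On the C side, the ?? operator corresponds to a check of the presence flag before the value is read. The sketch below assumes the has_ prefix convention shown above carries over to offset_raw; stream_opt_t and stream_offset_or_zero are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of an Option[VarInt] field pair inside a generated struct. */
typedef struct {
    bool     has_offset_raw;
    uint64_t offset_raw;     /* only meaningful when has_offset_raw is true */
} stream_opt_t;

/* Equivalent of the wirespec expression `offset_raw ?? 0`. */
uint64_t stream_offset_or_zero(const stream_opt_t *s)
{
    return s->has_offset_raw ? s->offset_raw : 0;
}
```

Reading offset_raw without checking the flag would return whatever happens to be stored in the struct, which is why the language insists on a guard or a default.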
10. Derived Fields (let)
A let field computes a value from other fields. It is not present on the wire — it does not consume any bytes — but it is available in the C struct and can be used in subsequent field expressions and constraints.
# From examples/quic/frames.wspec
0x08..=0x0f => Stream {
stream_id: VarInt,
offset_raw: if frame_type & 0x04 { VarInt },
length_raw: if frame_type & 0x02 { VarInt },
data: bytes[length_or_remaining: length_raw],
let offset: u64 = offset_raw ?? 0,
let fin: bool = (frame_type & 0x01) != 0,
},
# From examples/mqtt/mqtt.wspec
let qos: u8 = (type_and_flags & 0x06) >> 1,
bool is a semantic type: it cannot be a wire field, but it can be a let target. A derived field can reference any wire field declared above it and any prior let field.
In C: A let field becomes a regular struct member that is assigned during parsing. The serialize function recomputes the expression on the fly (it does not use the stored value).
11. Constraints (require and static_assert)
Runtime constraints with require
require EXPR adds a runtime check. If the expression is false when that point is reached during parsing, the parser returns WIRESPEC_ERR_CONSTRAINT.
# From examples/net/udp.wspec
require length >= 8,
# From examples/quic/frames.wspec
require length <= MAX_CID_LENGTH,
# From examples/net/tcp.wspec
require data_offset >= 5,
# From examples/mqtt/mqtt.wspec — check flag bits
require type_and_flags & 0x0F == 0x02,
A require clause can appear anywhere among the fields in a packet, frame branch, or capsule body. It can reference any wire field or let field declared above it.
Compile-time assertions with static_assert
static_assert EXPR is checked at compile time. It is useful for asserting relationships between constants:
const MAX_CID_LENGTH: u8 = 20
static_assert MAX_CID_LENGTH <= 255
If the assertion fails, the compiler reports an error before generating any code.
12. Arrays
Count-bound arrays
[T; count] reads exactly count elements of type T. The count comes from a prior integer field:
# From examples/quic/frames.wspec
ack_ranges: [AckRange; ack_range_count],
# From examples/tls/tls13.wspec — count derived from byte length / element size
cipher_suites: [u16; cipher_suites_length / 2],
Fill arrays
[T; fill] consumes the rest of the current scope, reading as many T as fit:
# A list that fills the rest of the packet
entries: [Entry; fill],
Fill within a length bound
[T; fill] within EXPR reads elements until exactly EXPR bytes have been consumed. This is the typical pattern for TLS-style extension lists:
# From examples/tls/tls13.wspec
extensions: [Extension; fill] within extensions_length,
# From examples/tls/tls13.wspec
certificate_list: [CertificateEntry; fill] within certificate_list_length,
This creates a sub-scope of extensions_length bytes and reads Extension records until the sub-scope is exhausted.
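The loop behind a bounded fill array can be sketched like this for a simple element type. The example reads big-endian u16 elements rather than Extension records; the names u16_list_t, u16_list_parse_fill, and SKETCH_MAX_ELEMENTS are hypothetical, and treating a partial trailing element as WIRESPEC_ERR_SHORT_BUFFER is an assumption about the runtime's behavior.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ELEMENTS 4   /* hypothetical fixed capacity for this sketch */

typedef enum {
    WIRESPEC_OK = 0,
    WIRESPEC_ERR_SHORT_BUFFER,
    WIRESPEC_ERR_CAPACITY
} wirespec_result_t;   /* numeric values assumed */

typedef struct { uint16_t items[SKETCH_MAX_ELEMENTS]; size_t count; } u16_list_t;

/* Parse `[u16; fill] within scope_len`: read elements until the sub-scope is used up. */
wirespec_result_t u16_list_parse_fill(const uint8_t *buf, size_t len,
                                      size_t scope_len, u16_list_t *out)
{
    if (scope_len > len) return WIRESPEC_ERR_SHORT_BUFFER;   /* scope exceeds the input */
    size_t pos = 0;
    out->count = 0;
    while (pos < scope_len) {
        /* Assumption: an element cut off by the scope boundary is a short read. */
        if (scope_len - pos < 2) return WIRESPEC_ERR_SHORT_BUFFER;
        if (out->count == SKETCH_MAX_ELEMENTS) return WIRESPEC_ERR_CAPACITY;
        out->items[out->count++] = (uint16_t)((buf[pos] << 8) | buf[pos + 1]);
        pos += 2;
    }
    return WIRESPEC_OK;
}
```

The loop terminates on the byte count, not on an element count, which is what distinguishes fill from a count-bound array.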
Array capacity
In C: Arrays are allocated at fixed capacity on the stack. The default is WIRESPEC_MAX_ARRAY_ELEMENTS (64 elements), defined in wirespec_runtime.h. You can override it globally with -DWIRESPEC_MAX_ARRAY_ELEMENTS=N at compile time.
To override capacity for a specific field, use the @max_len annotation:
packet Foo {
count: u16,
items: [Item; count], # uses default capacity (64)
@max_len(1024)
large_items: [Item; count], # uses capacity 1024
}
If the actual count on the wire exceeds the capacity, the parser returns WIRESPEC_ERR_CAPACITY.
The generated C struct looks like:
typedef struct {
uint16_t count;
item_t items[64];
item_t large_items[1024];
} foo_t;
13. Type Aliases and Computed Types
Simple aliases
type Name = TypeExpr creates an alias. The alias introduces no new wire layout — it is purely a naming convenience.
# From examples/ble/att.wspec
type AttHandle = u16le
type Uuid16 = u16le
Computed types (dependent records)
When the right-hand side is a { ... } block, the type is a computed type: a record whose field types depend on the values of earlier fields in the same record.
# From examples/quic/varint.wspec — QUIC Variable-Length Integer
@strict
type VarInt = {
prefix: bits[2],
value: match prefix {
0b00 => bits[6],
0b01 => bits[14],
0b10 => bits[30],
0b11 => bits[62],
},
}
The prefix field is read first (2 bits), then value reads a variable number of bits based on the prefix value. Computed types are ordinary types and can be used anywhere a type is expected.
The @strict annotation (shown on VarInt) rejects non-canonical encodings at parse time, for example encoding the value 1 with a 2-byte form when a 1-byte form exists. In that case the parser returns WIRESPEC_ERR_NONCANONICAL.
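To show what the canonical-encoding check involves, here is a hand-written sketch of a QUIC variable-length integer decoder with an optional strict mode. The function name, its strict parameter, and the numeric values of the result codes are assumptions for this example; the generated decoder may be structured differently.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    WIRESPEC_OK = 0,
    WIRESPEC_ERR_SHORT_BUFFER,
    WIRESPEC_ERR_NONCANONICAL
} wirespec_result_t;   /* numeric values assumed */

/* Decode a QUIC variable-length integer; `strict` rejects over-long encodings. */
wirespec_result_t quic_varint_decode(const uint8_t *buf, size_t len, int strict,
                                     uint64_t *value, size_t *consumed)
{
    if (len < 1) return WIRESPEC_ERR_SHORT_BUFFER;
    unsigned prefix = buf[0] >> 6;            /* 2-bit length selector */
    size_t n = (size_t)1 << prefix;           /* 1, 2, 4, or 8 bytes in total */
    if (len < n) return WIRESPEC_ERR_SHORT_BUFFER;

    uint64_t v = buf[0] & 0x3f;               /* low 6 bits of the first byte */
    for (size_t i = 1; i < n; i++)
        v = (v << 8) | buf[i];                /* remaining bytes, big-endian */

    /* Smallest value that genuinely needs a 1-, 2-, 4-, or 8-byte encoding. */
    static const uint64_t min_for_len[4] = { 0, 1u << 6, 1u << 14, 1u << 30 };
    if (strict && v < min_for_len[prefix]) return WIRESPEC_ERR_NONCANONICAL;

    *value = v;
    *consumed = n;
    return WIRESPEC_OK;
}
```

For instance, the bytes 0x40 0x25 decode to 37 in lenient mode but are rejected in strict mode, because 37 fits in the single byte 0x25.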
14. Continuation-Bit VarInt
Some protocols use a different variable-length integer scheme where each byte carries a continuation flag in its most significant bit (or least significant bit), with the remaining bits carrying data. MQTT, Protocol Buffers, and LEB128 all use this pattern.
# From examples/mqtt/mqtt.wspec — MQTT Remaining Length
type MqttLength = varint {
continuation_bit: msb, # MSB=1 means more bytes follow
value_bits: 7, # 7 data bits per byte
max_bytes: 4, # maximum encoded length: 4 bytes (268,435,455)
byte_order: little, # lower-order groups come first
}
The varint keyword introduces a continuation-bit VarInt type. The parameters are:
| Parameter | Values | Meaning |
|---|---|---|
| continuation_bit | msb or lsb | Which bit position holds the continuation flag |
| value_bits | integer | Data bits per byte |
| max_bytes | integer | Maximum encoded bytes before overflow |
| byte_order | little or big | Whether lower-order groups appear first |
If the continuation bit is still set after max_bytes bytes have been read, the parser returns WIRESPEC_ERR_OVERFLOW.
In C: A continuation-bit VarInt is decoded into the smallest uintN_t that fits the maximum value. For MqttLength (max 4 bytes, 28 effective bits), the C field type is uint32_t.
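A decoder for the MqttLength parameters above might look roughly like the following. This is a sketch written by hand for illustration, with a hypothetical function name and assumed result-code values, not the code the compiler emits.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    WIRESPEC_OK = 0,
    WIRESPEC_ERR_SHORT_BUFFER,
    WIRESPEC_ERR_OVERFLOW
} wirespec_result_t;   /* numeric values assumed */

/* Decode with continuation_bit: msb, value_bits: 7, max_bytes: 4, byte_order: little. */
wirespec_result_t mqtt_length_decode(const uint8_t *buf, size_t len,
                                     uint32_t *value, size_t *consumed)
{
    uint32_t v = 0;
    for (size_t i = 0; i < 4; i++) {                   /* max_bytes = 4 */
        if (i >= len) return WIRESPEC_ERR_SHORT_BUFFER;
        v |= (uint32_t)(buf[i] & 0x7f) << (7 * i);     /* low-order group first */
        if ((buf[i] & 0x80) == 0) {                    /* MSB clear: last byte */
            *value = v;
            *consumed = i + 1;
            return WIRESPEC_OK;
        }
    }
    return WIRESPEC_ERR_OVERFLOW;   /* continuation bit still set after max_bytes */
}
```

As a worked case, the bytes 0xC1 0x02 decode to 65 + (2 << 7) = 321.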
15. bits[N] and BitGroup Packing
bits[N] reads exactly N bits as an unsigned integer. bit is shorthand for bits[1]. These are used for protocol headers that pack multiple small fields into a single byte or word.
# From examples/ip/ipv4.wspec
packet IPv4Header {
version: bits[4],
ihl: bits[4],
dscp: bits[6],
ecn: bits[2],
total_length: u16,
identification: u16,
flags: bits[3],
fragment_offset: bits[13],
ttl: u8,
protocol: u8,
...
}
# From examples/net/tcp.wspec
packet TcpSegment {
...
data_offset: bits[4],
reserved: bits[4],
cwr: bit,
ece: bit,
urg: bit,
ack: bit,
psh: bit,
rst: bit,
syn: bit,
fin: bit,
window: u16,
...
}
BitGroup auto-grouping
Consecutive bits[N] and bit fields are automatically grouped into a single read. The compiler:
- Sums the bit widths to determine how many bytes to read (must be whole bytes).
- Reads those bytes in a single operation.
- Extracts each field with shift + mask operations.
For big-endian packing, the first declared field occupies the most significant bits. For little-endian packing (@endian little), the first declared field occupies the least significant bits.
In C: Each bits field becomes a uint8_t or uint16_t or uint32_t (smallest type that fits). The parse function uses shift and mask:
uint8_t _byte0 = buf[pos];
out->version = (_byte0 >> 4) & 0x0f;
out->ihl = (_byte0 >> 0) & 0x0f;
The constraint that bit groups must sum to whole bytes is checked at compile time.
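The same shift-and-mask approach extends to groups wider than one byte. As a sketch, the flags: bits[3] and fragment_offset: bits[13] pair from IPv4Header forms one 16-bit big-endian group; the struct and function names below are hypothetical, and the pack function is included only to show the inverse operation.

```c
#include <stdint.h>

/* Hypothetical holder for the two fields of the 16-bit group. */
typedef struct { uint8_t flags; uint16_t fragment_offset; } ipv4_frag_bits_t;

/* Unpack `flags: bits[3], fragment_offset: bits[13]` from one big-endian 16-bit group. */
ipv4_frag_bits_t ipv4_frag_bits_unpack(const uint8_t *p)
{
    uint16_t word = (uint16_t)((p[0] << 8) | p[1]);        /* single 2-byte read */
    ipv4_frag_bits_t out;
    out.flags           = (uint8_t)((word >> 13) & 0x07);  /* first field: top 3 bits */
    out.fragment_offset = (uint16_t)(word & 0x1fff);       /* second field: low 13 bits */
    return out;
}

/* Inverse operation, as a serializer would need. */
void ipv4_frag_bits_pack(const ipv4_frag_bits_t *in, uint8_t *p)
{
    uint16_t word = (uint16_t)(((in->flags & 0x07) << 13) | (in->fragment_offset & 0x1fff));
    p[0] = (uint8_t)(word >> 8);
    p[1] = (uint8_t)(word & 0xff);
}
```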
16. @checksum Annotation
The @checksum annotation on a field instructs the compiler to add automatic checksum verification (on parse) and automatic checksum computation (on serialize).
# From examples/ip/ipv4.wspec
packet IPv4Header {
version: bits[4],
ihl: bits[4],
...
@checksum(internet)
header_checksum: u16,
src_addr: u32,
dst_addr: u32,
}
On parse: The runtime computes the checksum over the entire struct's bytes and verifies it. If the check fails, WIRESPEC_ERR_CHECKSUM is returned.
On serialize: The checksum field is zeroed, the checksum is computed over the serialized bytes, and the result is written back into the field.
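For the internet algorithm, the parse and serialize steps can be sketched as below. The three function names are hypothetical, and verifying by checking that the sum over the full header comes out as zero is one common way to implement the check, assumed here rather than taken from the wirespec runtime. The 32-bit accumulator is adequate for header-sized inputs.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 one's complement checksum over `len` bytes (odd lengths are zero-padded). */
uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);   /* 16-bit big-endian words */
    if (i < len)
        sum += (uint32_t)(data[i] << 8);                   /* trailing odd byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);                /* fold carries back in */
    return (uint16_t)~sum;
}

/* Serialize side: zero the field, compute, write the result back (big-endian). */
void checksum_fill(uint8_t *hdr, size_t len, size_t field_off)
{
    hdr[field_off] = 0;
    hdr[field_off + 1] = 0;
    uint16_t c = internet_checksum(hdr, len);
    hdr[field_off] = (uint8_t)(c >> 8);
    hdr[field_off + 1] = (uint8_t)(c & 0xff);
}

/* Parse side: a header carrying a correct checksum sums to zero. */
int checksum_ok(const uint8_t *hdr, size_t len)
{
    return internet_checksum(hdr, len) == 0;
}
```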
Supported algorithms
| Algorithm | Field type | Standard |
|---|---|---|
| internet | u16 | RFC 1071 one's complement sum (IPv4, UDP, TCP) |
| crc32 | u32 | IEEE 802.3 CRC-32 |
| crc32c | u32 | CRC-32C (Castagnoli), used in SCTP, iSCSI |
| fletcher16 | u16 | RFC 1146 Fletcher-16 |
# From examples/checksum/crc32_test.wspec
packet Crc32Packet {
id: u16,
length: u16,
require length >= 8,
data: bytes[length: length - 8],
@checksum(crc32)
checksum: u32,
}
There can be at most one @checksum annotation per packet or frame branch. The field type must match what the algorithm requires.
Putting It All Together
Here is a complete, non-trivial example that uses most of the features above — the full QUIC frames definition from examples/quic/frames.wspec:
module quic.frames
@endian big
import quic.varint.VarInt
const MAX_CID_LENGTH: u8 = 20
packet AckRange { gap: VarInt, ack_range: VarInt }
packet EcnCounts { ect0: VarInt, ect1: VarInt, ecn_ce: VarInt }
packet LengthPrefixedCid {
length: u8,
value: bytes[length],
require length <= MAX_CID_LENGTH,
}
frame QuicFrame = match frame_type: VarInt {
0x00 => Padding {},
0x01 => Ping {},
0x02..=0x03 => Ack {
largest_ack: VarInt,
ack_delay: VarInt,
ack_range_count: VarInt,
first_ack_range: VarInt,
ack_ranges: [AckRange; ack_range_count],
ecn_counts: if frame_type == 0x03 { EcnCounts },
},
0x06 => Crypto {
offset: VarInt,
length: VarInt,
data: bytes[length],
},
0x08..=0x0f => Stream {
stream_id: VarInt,
offset_raw: if frame_type & 0x04 { VarInt },
length_raw: if frame_type & 0x02 { VarInt },
data: bytes[length_or_remaining: length_raw],
let offset: u64 = offset_raw ?? 0,
let fin: bool = (frame_type & 0x01) != 0,
},
0x18 => NewConnectionId {
sequence: VarInt,
retire_prior: VarInt,
cid_length: u8,
cid: bytes[cid_length],
reset_token: bytes[16],
},
0x1c..=0x1d => ConnectionClose {
error_code: VarInt,
offending_frame_type: if frame_type == 0x1c { VarInt },
reason_length: VarInt,
reason_phrase: bytes[reason_length],
},
0x30..=0x31 => Datagram {
length: if frame_type & 0x01 { VarInt },
data: bytes[length_or_remaining: length],
},
_ => Unknown { data: bytes[remaining] },
}
This single definition exercises: imports, constants, packets, frames, pattern ranges, optional fields, derived fields, arrays, bytes[length], bytes[remaining], bytes[length_or_remaining], bit-mask conditions, the ?? operator, and the require constraint.
Next Steps
- Modules and imports, splitting definitions across multiple .wspec files: Modules
- State machines, protocol session logic with typed transitions: State Machines
- Reference, complete grammar and generated C API: Reference