Bit-Packing

Generated types are Julia primitive type values: fixed-size bags of bits with no heap allocation. Every field occupies a contiguous bit range within this representation.

Layout

Fields are packed MSB-first in pattern order, and the total bit count is rounded up to the next multiple of 8.

Pattern: ("PFX-", :major(digits(max=99)), ".", :minor(digits(max=99)))

  major: ⌈log₂(100)⌉ = 7 bits
  minor: ⌈log₂(100)⌉ = 7 bits
  total: 14 → 16 bits (2 bytes)

  [major: 7][minor: 7][padding: 2]   (MSB → LSB)
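
The width arithmetic above can be reproduced in a few lines of Julia. This is a minimal sketch rather than the package's own code: the helper names bits_for and layout are hypothetical, and every field is assumed to be a required digit field with min=0.

```julia
# Hypothetical helper: ⌈log₂(n)⌉, the number of bits needed to distinguish n values.
bits_for(n::Integer) = n <= 1 ? 0 : 8 * sizeof(n) - leading_zeros(n - 1)

# Hypothetical helper: per-field widths, raw total, and byte-aligned total for a
# list of required digit fields, each described only by its maximum value.
function layout(maxima)
    widths = [bits_for(m + 1) for m in maxima]   # values 0..m need m + 1 encodings
    total = sum(widths)
    padded = 8 * cld(total, 8)                   # round up to a whole number of bytes
    return (widths = widths, total = total, padded = padded, padding = padded - total)
end

l = layout([99, 99])
println(l)
```

For the PFX- pattern this yields two 7-bit fields, a 14-bit total, and 2 bits of padding in a 16-bit type.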

Padding bits are always zero. Equality and hashing operate on the full bit representation, so two values with identical fields but different padding bits would compare unequal and hash differently.

The MSB-first layout also means that ult_int (unsigned comparison on the raw bits) gives lexicographic ordering by fields in pattern order — no field extraction or multi-key comparison is needed.
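
The ordering property can be checked with ordinary integers. The sketch below stands in a plain UInt16 for the packed type and uses a hypothetical pack_version helper with the shifts from the 16-bit example layout; it is an illustration, not the generated code.

```julia
# Pack (major, minor) into the layout [major: 7][minor: 7][padding: 2].
# Hypothetical helper; a plain UInt16 stands in for the generated packed type.
pack_version(major, minor) = (UInt16(major) << 9) | (UInt16(minor) << 2)

versions = [(1, 10), (0, 99), (1, 9), (12, 0), (0, 0)]
packed = [pack_version(v...) for v in versions]

# Sorting the raw integers gives the same order as sorting the field tuples,
# because the first field sits in the most significant bits.
same_order = sortperm(packed) == sortperm(versions)

# The two padding bits stay zero for every packed value.
padding_clear = all(p & 0x0003 == 0 for p in packed)

println((same_order, padding_clear))
```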

Packing and extraction

During parsing, a parsed accumulator starts at zero. Each field is widened, shifted to its position, and OR-ed in:

parsed |= widen(value) << (type_bits - field_end_bit)

Extraction reverses this with a right shift and mask:

value = (val >> (type_bits - field_end_bit)) & ((1 << field_width) - 1)
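
As a concrete illustration of both formulas, here is a minimal sketch on a plain UInt16. The function names pack_field and extract_field are hypothetical, and the shift amounts are computed at run time here, whereas the text above describes them as compile-time constants in the generated code.

```julia
# OR one field into the accumulator so that it ends `field_end_bit` bits from the MSB.
function pack_field(parsed::UInt16, value::Integer, field_end_bit::Integer; type_bits = 16)
    return parsed | (UInt16(value) << (type_bits - field_end_bit))
end

# Shift the field down to the LSB, then mask off everything above it.
function extract_field(val::UInt16, field_end_bit::Integer, field_width::Integer; type_bits = 16)
    mask = (UInt16(1) << field_width) - UInt16(1)
    return (val >> (type_bits - field_end_bit)) & mask
end

parsed = UInt16(0)
parsed = pack_field(parsed, 12, 7)    # major: bits 1-7 counted from the MSB
parsed = pack_field(parsed, 34, 14)   # minor: bits 8-14 counted from the MSB

major = extract_field(parsed, 7, 7)
minor = extract_field(parsed, 14, 7)
println((parsed, major, minor))
```

Because each field is shifted to a disjoint bit range, the OR never disturbs fields packed earlier, and the two low padding bits are left at zero.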

All shifts and masks are compile-time constants. The actual operations use Core.Intrinsics (zext_int, trunc_int, bitcast) since the packed type is not a standard integer (see casting sentinels in the code generation overview).
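
The following sketch shows what these intrinsics do on a custom primitive type. The type name PackedDemo is hypothetical, and how the generated code actually combines the intrinsics is an assumption; the example only demonstrates that bitcast, zext_int, trunc_int, and ult_int accept a non-integer primitive type.

```julia
# A 16-bit primitive type standing in for a generated packed type (hypothetical name).
primitive type PackedDemo 16 end

# bitcast reinterprets the bits without changing them.
a = Core.Intrinsics.bitcast(PackedDemo, 0x1888)   # major = 12, minor = 34
b = Core.Intrinsics.bitcast(PackedDemo, 0x188c)   # major = 12, minor = 35

# zext_int widens with zero fill; the source type need not be an integer type.
wide = Core.Intrinsics.zext_int(UInt32, a)

# Extract `minor`: view the bits as UInt16, shift and mask, then narrow with trunc_int.
raw = Core.Intrinsics.bitcast(UInt16, a)
minor = Core.Intrinsics.trunc_int(UInt8, (raw >> 2) & 0x007f)

# ult_int compares the raw bits as unsigned integers.
a_before_b = Core.Intrinsics.ult_int(a, b)

println((wide, minor, a_before_b))
```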

Field encoding

The encoding strategy depends on the field type. Digit fields store an integer value; character fields store each character as an index. Both need to handle optional fields and value ranges compactly.

Digit fields

A required field with min=0 simply stores the raw value. When a field has a nonzero minimum or is optional, one of three strategies is chosen:

- Direct value: when ⌈log₂(max - min + 1 + opt)⌉ = ⌈log₂(max + 1)⌉ (where opt is 1 if optional, 0 otherwise), the raw value already fits without any arithmetic. This is the common case for required fields with min=0.
- Offset encoding: the stored value is shifted so that the minimum maps to the lowest available encoding. Optional fields with min=0 store v + 1 (with 0 = absent); required fields with min > 0 store v - min; optional fields with min > 0 store v - (min - 1) (with 0 = absent). This costs one addition or subtraction at both pack and extract time.
- Presence bit: when neither of the above saves bits, a separate 1-bit flag is packed alongside the value.
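
The offset rules and the width comparison can be sketched as follows. All names here (bits_for, same_width, encode_offset, decode_offset) are hypothetical, and the sketch deliberately does not implement the full strategy selection; it only shows the width equality from the direct-value case and the arithmetic of the three offset cases.

```julia
# ⌈log₂(n)⌉: bits needed to distinguish n values (hypothetical helper).
bits_for(n::Integer) = n <= 1 ? 0 : 8 * sizeof(n) - leading_zeros(n - 1)

# The width equality quoted for the direct-value case.
same_width(lo, hi, optional::Bool) =
    bits_for(hi - lo + 1 + optional) == bits_for(hi + 1)

# Offset encoding: required fields store v - lo; optional fields store
# v - (lo - 1), which reserves 0 for "absent".
encode_offset(v::Integer, lo::Integer, optional::Bool) = v - lo + optional
function encode_offset(::Nothing, lo::Integer, optional::Bool)
    optional || throw(ArgumentError("a required field cannot be absent"))
    return 0
end

# Inverse of encode_offset; returns `nothing` for an absent optional field.
function decode_offset(stored::Integer, lo::Integer, optional::Bool)
    optional || return stored + lo
    return stored == 0 ? nothing : stored + lo - 1
end

println(same_width(0, 99, false))        # required 0..99: both sides need 7 bits
println(same_width(1000, 1999, false))   # required 1000..1999: 10 bits vs 11 bits
println(encode_offset(1500, 1000, false))
```

In the 1000..1999 case the offset form needs 10 bits where the raw value would need 11, which is the kind of saving that justifies the extra subtraction.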

Character fields

letters, alphnum, hex, and charset fields pack each character into ⌈log₂(alphabet size)⌉ bits. For fixed-length optional character fields, if adding one extra value (for "absent") does not increase the bits per character, indices start at 1 so that an all-zero packed value means absent, avoiding the need for a separate presence bit. Variable-length character fields store both the packed characters and a length, with the length encoding following the same direct-vs-offset logic described above for digit fields.
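
A minimal sketch of the fixed-length optional case, assuming a 26-letter uppercase alphabet: 26 and 27 values both need 5 bits, so indices can start at 1 and zero can mean absent. The names pack_chars and unpack_chars are hypothetical, and a UInt32 stands in for the packed type.

```julia
# ⌈log₂(n)⌉: bits needed to distinguish n values (hypothetical helper).
bits_for(n::Integer) = n <= 1 ? 0 : 8 * sizeof(n) - leading_zeros(n - 1)

# Pack characters MSB-first using 1-based alphabet indices.
function pack_chars(s::AbstractString, alphabet, bits::Integer)
    packed = UInt32(0)
    for c in s
        idx = findfirst(==(c), alphabet)
        idx === nothing && throw(ArgumentError("character $c is not in the alphabet"))
        packed = (packed << bits) | UInt32(idx)
    end
    return packed
end

# Unpack n characters; an all-zero value means the optional field is absent.
function unpack_chars(packed::UInt32, n::Integer, alphabet, bits::Integer)
    packed == 0 && return nothing
    mask = (UInt32(1) << bits) - UInt32(1)
    chars = [alphabet[Int((packed >> (bits * (n - i))) & mask)] for i in 1:n]
    return String(chars)
end

alphabet = 'A':'Z'
bits = bits_for(length(alphabet) + 1)   # 26 letters plus "absent" still fit in 5 bits

packed = pack_chars("ABC", alphabet, bits)
println((bits, packed, unpack_chars(packed, 3, alphabet, bits)))
```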