Reproducible builds

Reproducible builds

boxes and glue honours the SOURCE_DATE_EPOCH convention at the library level. The hook lives in frontend.New, so every caller — bagme, glu, ad-hoc Go programs — picks it up automatically. No flag, no extra code in the calling program.

How it works

When the environment variable SOURCE_DATE_EPOCH holds a non-negative integer (Unix seconds), frontend.New sets the document’s CreationDate to that timestamp and turns on SuppressInfo. The side-effect chain is:

Source of variance Behaviour
InfoDict CreationDate (would be time.Now()) Set from the env value.
InfoDict ModDate (PDF/X mode) Set from the env value.
XMP xmp:CreateDate / xmp:ModifyDate / xmp:MetadataDate Set from the env value.
XMP xmpMM:DocumentID / xmpMM:InstanceID UUIDs Replaced with stable hardcoded values via SuppressInfo.
Trailer /ID An MD5 of the cross-reference byte content, so it stabilises once the three above stabilise.
SOURCE_DATE_EPOCH=1700000000 go run main.go

The resulting PDF is byte-identical across runs with the same input.

Programmatic override

Callers that need explicit control (build pipelines without env vars, tests pinning to a known date) can assign after frontend.New:

fe, _ := frontend.New("out.pdf")
fe.Doc.CreationDate = time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
fe.Doc.SuppressInfo = true

The explicit assignment wins over the env-variable value — useful when SOURCE_DATE_EPOCH is set globally in the shell but a specific document needs a different timestamp.

What’s not deterministic without this hook

Without SOURCE_DATE_EPOCH, every render produces different PDFs even for identical input — the InfoDict and XMP date fields plus the two XMP UUIDs change on every call. The trailer /ID then changes too, because it’s derived from the same bytes. So byte-identical regression tests require either the env variable or the programmatic override.

The reference test suite (rake check_*_examples in the monorepo) relies on this contract: every example is rendered with SOURCE_DATE_EPOCH=0 and md5-compared against a checked-in result.pdf. A failure usually means a code change introduced a new source of non-determinism, not a layout regression.