OCI Bundles
An OCI bundle packages the datamitsu tool store as a standard OCI image: one layer per store subtree (a binary, a runtime, a runtime-managed app), annotated so datamitsu can pull exactly the pieces it needs — without docker or podman. The bundle is a cache accelerator and an airgap seed, not a replacement for resolution: whatever is in the bundle is taken from it, whatever is not gets downloaded the usual way.
Declaring a bundle
The config gains a top-level oci key:
function getConfig(input) {
  return {
    ...input,
    oci: {
      ref: "ghcr.io/owner/tool-store",
      // Mandatory: the digest pins the content, a tag never does.
      digest: "sha256:6c3c624b58dbbcd3c0dd82b4c53f04194d1247c6eebdaab7c610cf7d66709b3b",
    },
  };
}
The digest is mandatory — a tag never pins content. The declaration chains through config layers as a scalar (last writer wins, {...input} inherits), so a wrapper config can ship it and a project config keeps it automatically.
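The chaining rule can be illustrated with a few layers. This is a minimal sketch, assuming each config layer is a getConfig-style function applied to the previous layer's result; the names wrapperConfig, projectConfig, overrideConfig and chain, the projectName key, and the mirror registry are hypothetical.

```javascript
// Hypothetical wrapper layer: ships the bundle declaration.
function wrapperConfig(input) {
  return {
    ...input,
    oci: {
      ref: "ghcr.io/owner/tool-store",
      digest: "sha256:6c3c624b58dbbcd3c0dd82b4c53f04194d1247c6eebdaab7c610cf7d66709b3b",
    },
  };
}

// Hypothetical project layer: spreads input and never mentions oci,
// so the wrapper's declaration is inherited unchanged.
function projectConfig(input) {
  return { ...input, projectName: "demo" };
}

// Hypothetical override layer: oci is replaced as a whole (last writer wins),
// it is not merged field by field.
function overrideConfig(input) {
  return {
    ...input,
    oci: {
      ref: "registry.example.com/mirror/tool-store",
      digest: "sha256:" + "0".repeat(64), // placeholder digest
    },
  };
}

// Assumed chaining model: each layer receives the previous layer's output.
const chain = (layers) => layers.reduce((cfg, layer) => layer(cfg), {});

const inherited = chain([wrapperConfig, projectConfig]);
const overridden = chain([wrapperConfig, overrideConfig]);

console.log(inherited.oci.ref);
console.log(overridden.oci.ref);
```

Because the declaration is treated as a scalar, a layer that wants a different bundle has to restate both ref and digest.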
How seeding works
Two paths use the same machinery:
- Auto-seed (demand-driven). Before check/fix/lint pre-install, install, and init, datamitsu computes the store paths the current operation needs (tools plus their runtime dependencies: the runtime of a runtime app, the shared CPython for uv apps, the pnpm runtime for node apps) and pulls only those layers. A bundle of 50 tools costs a project that needs 3 of them one cached manifest GET plus 3–5 blob downloads. If everything is already in the store, no network request is made at all.
- datamitsu store seed (full pull). Pulls every annotated layer; this is the airgap workflow. A completed full pull writes a marker inside the store, so repeating it is a no-op; store clear removes the marker together with the content.
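The demand-driven selection can be sketched as a filter over the manifest's layers. This is an illustration under stated assumptions, not datamitsu's actual code: the function name layersToPull, the subtree names, and the placeholder digests are hypothetical; only the com.datamitsu.subtree annotation key comes from this page.

```javascript
const SUBTREE = "com.datamitsu.subtree";

// Returns the layers that must be downloaded: annotated with a needed
// subtree and not already present in the store.
function layersToPull(manifest, neededSubtrees, presentInStore) {
  const needed = new Set(neededSubtrees);
  return manifest.layers.filter((layer) => {
    const subtree = layer.annotations ? layer.annotations[SUBTREE] : undefined;
    return (
      subtree !== undefined && needed.has(subtree) && !presentInStore.has(subtree)
    );
  });
}

// Hypothetical bundle with four subtrees (digests are placeholders).
const exampleManifest = {
  layers: [
    { digest: "sha256:" + "a".repeat(64), annotations: { [SUBTREE]: "bin/shellcheck" } },
    { digest: "sha256:" + "b".repeat(64), annotations: { [SUBTREE]: "runtime/cpython" } },
    { digest: "sha256:" + "c".repeat(64), annotations: { [SUBTREE]: "apps/ruff" } },
    { digest: "sha256:" + "d".repeat(64), annotations: { [SUBTREE]: "bin/jq" } },
  ],
};

// The operation needs a uv app and its shared CPython; CPython is already stored.
const pulled = layersToPull(
  exampleManifest,
  ["apps/ruff", "runtime/cpython"],
  new Set(["runtime/cpython"])
);
console.log(pulled.map((l) => l.annotations[SUBTREE]));

// With everything already in the store the list is empty: no blob requests.
const nothing = layersToPull(
  exampleManifest,
  ["bin/jq"],
  new Set(["bin/jq"])
);
console.log(nothing.length);
```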
Multi-platform bundles are a single OCI index. os/arch are matched via the standard platform fields; libc (glibc vs musl) via the com.datamitsu.libc descriptor annotation inside the digest-verified index bytes. When libc detection fails (e.g. distroless hosts), datamitsu refuses to guess — set DATAMITSU_LIBC=glibc or DATAMITSU_LIBC=musl.
A bundle entry missing for your platform is a degradation, not a failure: datamitsu warns and falls back to direct downloads.
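Platform matching against the index can be sketched as follows. The index shape (manifests with platform.os and platform.architecture) follows the OCI image index format; the function name selectManifest, the host object, and the rule that a descriptor without a libc annotation matches any libc are assumptions made for illustration.

```javascript
const LIBC = "com.datamitsu.libc";

// Returns the matching descriptor, or null when the bundle has no entry for
// this platform (the caller then warns and falls back to direct downloads).
function selectManifest(index, host) {
  const candidates = index.manifests.filter(
    (d) =>
      d.platform &&
      d.platform.os === host.os &&
      d.platform.architecture === host.arch
  );
  for (const d of candidates) {
    const libc = d.annotations ? d.annotations[LIBC] : undefined;
    if (libc === undefined) return d; // assumed: no libc constraint declared
    if (host.libc === undefined) {
      // Refuse to guess when detection failed.
      throw new Error("libc unknown: set DATAMITSU_LIBC=glibc or DATAMITSU_LIBC=musl");
    }
    if (libc === host.libc) return d;
  }
  return null;
}

// Hypothetical two-entry index (digests are placeholders).
const exampleIndex = {
  manifests: [
    {
      digest: "sha256:" + "1".repeat(64),
      platform: { os: "linux", architecture: "amd64" },
      annotations: { [LIBC]: "glibc" },
    },
    {
      digest: "sha256:" + "2".repeat(64),
      platform: { os: "linux", architecture: "amd64" },
      annotations: { [LIBC]: "musl" },
    },
  ],
};

const muslEntry = selectManifest(exampleIndex, { os: "linux", arch: "amd64", libc: "musl" });
const missingEntry = selectManifest(exampleIndex, { os: "darwin", arch: "arm64" });
console.log(muslEntry.digest, missingEntry);
```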
Slow links and retries
Blob downloads have no overall timeout: a 400 MiB layer on a 1 Mbps VPN is a healthy download that simply takes a while (roughly an hour). Instead, each attempt is watched for progress: if no data arrives for 2 minutes, the attempt is aborted with a clear "stalled: no data received" error and retried (up to 4 attempts with exponential backoff). Registry metadata requests (manifests, auth token handshake) carry small bodies and keep a flat 120-second deadline.
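The difference between a progress watchdog and an overall deadline can be sketched with an explicit clock. This is a simplified model, not the actual downloader: the StallWatch class, the backoffDelays helper, and the 1-second base delay are hypothetical; only the 2-minute stall limit and the 4 attempts come from the text above.

```javascript
// Tracks the time of the last received chunk. Only silence counts against
// the limit, so total download time is unbounded as long as bytes keep coming.
class StallWatch {
  constructor(limitMs, nowMs) {
    this.limitMs = limitMs;
    this.lastDataMs = nowMs;
  }
  onData(nowMs) {
    this.lastDataMs = nowMs;
  }
  isStalled(nowMs) {
    return nowMs - this.lastDataMs >= this.limitMs;
  }
}

// Waits between attempts, doubling each time. With 4 attempts there are
// 3 waits. The base delay is an assumed value.
function backoffDelays(maxAttempts, baseMs) {
  const delays = [];
  for (let i = 0; i < maxAttempts - 1; i++) {
    delays.push(baseMs * 2 ** i);
  }
  return delays;
}

const STALL_LIMIT_MS = 2 * 60 * 1000;
const watch = new StallWatch(STALL_LIMIT_MS, 0);

// A slow but steady download: a chunk every 100 s for an hour never stalls.
let stalledDuringSlowDownload = false;
for (let t = 100_000; t <= 3_600_000; t += 100_000) {
  if (watch.isStalled(t)) stalledDuringSlowDownload = true;
  watch.onData(t);
}

// Then the link goes silent: 2 minutes later the attempt counts as stalled.
const stalledAfterSilence = watch.isStalled(3_600_000 + STALL_LIMIT_MS);

const delays = backoffDelays(4, 1000);
console.log(stalledDuringSlowDownload, stalledAfterSilence, delays);
```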
The store commands also work without a usable git context (no git binary, dubious ownership errors in containers): they operate on the global store, so a broken project repo only skips the project-level config with a warning instead of failing the command.
Trust model
The bundle is not a trust boundary by itself:
- Every manifest body and blob is verified against its SHA-256 descriptor before extraction, so not a single unverified byte enters the chain. Trusting oci.digest is equivalent to trusting the config source that declares it (same as the per-binary hash fields today).
- Single-file binaries and JVM jars are re-hashed after extraction against the published SHA-256 from the config; a bundle whose content was swapped relative to the config fails hard.
- Runtime app directories (uv/node/go) have no published content hash (they are built, not downloaded); their integrity rests on the digest chain plus the mandatory lockfiles.
- Each layer may only write into the single store subtree it declares (com.datamitsu.subtree); content outside it, including hardlinks pointing elsewhere, fails the pull loudly.
- oci.signer will pin the publisher identity via sigstore verification at pull time (planned; a set signer currently fails the seed rather than silently skipping the check).
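Two of the checks above, digest verification and the per-subtree write allowlist, can be sketched with Node's standard library. The function names verifyDigest and isInsideSubtree and the subtree name bin/tool are hypothetical, and the real implementation may differ; the same containment check would apply to a hardlink's target as well as to the entry path itself.

```javascript
const crypto = require("node:crypto");
const path = require("node:path");

// Throws unless the bytes hash to the digest named in the descriptor.
function verifyDigest(bytes, descriptorDigest) {
  const [algo, expected] = descriptorDigest.split(":");
  if (algo !== "sha256") {
    throw new Error(`unsupported digest algorithm: ${algo}`);
  }
  const actual = crypto.createHash("sha256").update(bytes).digest("hex");
  if (actual !== expected) {
    throw new Error(`digest mismatch: expected ${expected}, got ${actual}`);
  }
}

// True only if the (relative, POSIX-style) archive path stays inside the
// declared subtree after normalising "." and ".." segments.
function isInsideSubtree(subtree, entryPath) {
  if (path.posix.isAbsolute(entryPath)) return false;
  const normalized = path.posix.normalize(entryPath);
  return normalized === subtree || normalized.startsWith(subtree + "/");
}

// SHA-256 of the empty input is a well-known constant.
const EMPTY_SHA256 =
  "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
verifyDigest(Buffer.alloc(0), EMPTY_SHA256);

const allowed = isInsideSubtree("bin/tool", "bin/tool/./tool");
const escaped = isInsideSubtree("bin/tool", "bin/tool/../../etc/passwd");
const sibling = isInsideSubtree("bin/tool", "bin/tool-evil/tool");
console.log(allowed, escaped, sibling);
```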
Offline mode
DATAMITSU_OFFLINE is a full "don't touch the network" switch, orthogonal to bundles: it never auto-pulls; the store must be seeded beforehand — store seed while online, store import from an OCI layout directory, or a volume mount. With a seeded store, tools resolve with zero network; a miss fails with a clear message instead of a hanging download.
# online machine
datamitsu store seed
# air-gapped machine (same store, e.g. copied or mounted)
DATAMITSU_OFFLINE=1 datamitsu check
Both the offline switch and the seeding state are introspectable:
datamitsu config runtime | jq '.offline, .noOci, .libc'
datamitsu store status
Kill switches
- --no-oci (any command) or DATAMITSU_NO_OCI=1: disable bundle seeding entirely; tools download directly as before.
- Bundles change where bytes come from, never which versions run: tool resolution and cache keys are identical with and without a bundle.
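How the offline switch and the OCI kill switch combine can be summarised as a small decision function. This is a reading of the behaviour described on this page, not the actual resolver: the function name chooseSource, its flag names, and the exact order of the checks are assumptions.

```javascript
// Decides where the bytes for one store subtree come from.
// The version being resolved is the same in every branch; only the source differs.
function chooseSource({ inStore, inBundle, offline, noOci }) {
  if (inStore) return "store"; // already seeded: zero network
  if (offline) {
    // Offline never pulls; a miss is an immediate, clear failure.
    throw new Error("offline: subtree is not in the store; seed the store first");
  }
  if (!noOci && inBundle) return "bundle"; // pull the matching layer
  return "download"; // usual direct download
}

const fromBundle = chooseSource({ inStore: false, inBundle: true, offline: false, noOci: false });
const killSwitch = chooseSource({ inStore: false, inBundle: true, offline: false, noOci: true });
const notBundled = chooseSource({ inStore: false, inBundle: false, offline: false, noOci: false });
const seededOffline = chooseSource({ inStore: true, inBundle: false, offline: true, noOci: false });
console.log(fromBundle, killSwitch, notBundled, seededOffline);
```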
Producing a bundle
Bundle production deliberately reuses the community toolchain instead of reimplementing a registry client push:
1. datamitsu devtools dockerfile --emit-oci-map map.json generates the multi-stage Dockerfile (whose final stage already emits one COPY --link layer per store subtree) plus the layer→subtree map.
2. docker buildx builds and pushes the image(s) per platform/libc.
3. A CI post-process (regctl/crane) writes the com.datamitsu.subtree layer annotations and the com.datamitsu.store-root manifest annotation from map.json, assembles the bundle index with libc descriptor annotations, tags it (untagged indexes are vulnerable to registry cleanup), and optionally signs it with cosign.
A mapping mistake in step 3 is not a security hole: the consumer's per-subtree write-allowlist validates layer content against the declared subtree and fails loudly at pull time.
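The annotation part of step 3 amounts to a pure transformation of the manifest JSON, sketched below. The shape of map.json is not documented here, so the object used (a storeRoot string plus a layers table keyed by layer digest) is entirely hypothetical, as are the function name annotateManifest, the store root path, and the placeholder digests. In practice the result would be written back with a registry tool such as regctl or crane.

```javascript
// Returns a copy of the manifest with datamitsu annotations applied.
// Layers without a mapping (for example base-image layers) are left untouched.
function annotateManifest(manifest, map) {
  return {
    ...manifest,
    annotations: {
      ...manifest.annotations,
      "com.datamitsu.store-root": map.storeRoot,
    },
    layers: manifest.layers.map((layer) => {
      const subtree = map.layers[layer.digest];
      if (subtree === undefined) return layer;
      return {
        ...layer,
        annotations: { ...layer.annotations, "com.datamitsu.subtree": subtree },
      };
    }),
  };
}

const layerMediaType = "application/vnd.oci.image.layer.v1.tar+gzip";
const baseLayer = "sha256:" + "e".repeat(64); // placeholder digests
const toolLayer = "sha256:" + "f".repeat(64);

// Hypothetical map.json content.
const exampleMap = {
  storeRoot: "/opt/datamitsu/store",
  layers: { [toolLayer]: "bin/shellcheck" },
};

const builtManifest = {
  schemaVersion: 2,
  mediaType: "application/vnd.oci.image.manifest.v1+json",
  layers: [
    { mediaType: layerMediaType, digest: baseLayer, size: 1024 },
    { mediaType: layerMediaType, digest: toolLayer, size: 2048 },
  ],
};

const annotated = annotateManifest(builtManifest, exampleMap);
console.log(JSON.stringify(annotated, null, 2));
```

Changing annotations changes the manifest bytes and therefore the manifest digest, so the digest declared in the config has to be the one of the final, annotated and indexed bundle.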