Soup Cans

A .can is a portable, verifiable recipe bundle: a tar.gz archive containing a manifest, the soup.yaml, and a data_ref. v0.26.0 ships four subcommands (soup can pack, inspect, verify, and fork) for sharing and forking fine-tunes without shipping checkpoints or training data directly.

Pack

```bash
soup can pack \
  --entry-id chat-llama@v1 \
  --out chat-llama-v1.can
```

The pack includes:

  • manifest.json — name, tag, parent, created-at, schema version
  • soup.yaml — exact training config
  • data_ref.json — a hash + URI pointing at the dataset (not the dataset itself)
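To make the layout concrete, here is a minimal sketch of an archive with these three members, built with Python's standard tarfile module. It is not Soup's own packing code: the helper names, the manifest field spellings, and the example values (including the dataset URI) are assumptions made for illustration.

```python
import hashlib
import io
import json
import tarfile


def add_bytes(tar: tarfile.TarFile, name: str, payload: bytes) -> None:
    """Add an in-memory payload to the archive as a regular file."""
    info = tarfile.TarInfo(name=name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))


def pack_can(manifest: dict, soup_yaml: str, data_ref: dict) -> bytes:
    """Return the bytes of a gzip-compressed tar holding the three members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        add_bytes(tar, "manifest.json", json.dumps(manifest, indent=2).encode())
        add_bytes(tar, "soup.yaml", soup_yaml.encode())
        add_bytes(tar, "data_ref.json", json.dumps(data_ref, indent=2).encode())
    return buf.getvalue()


# Hypothetical values, for illustration only.
manifest = {
    "name": "chat-llama",
    "tag": "v1",
    "parent": None,
    "created_at": "2025-01-01T00:00:00Z",
    "schema_version": 1,
}
data_ref = {
    # A hash plus a URI: the dataset itself never enters the archive.
    "sha256": hashlib.sha256(b"example dataset bytes").hexdigest(),
    "uri": "s3://example-bucket/chat.jsonl",
}
can_bytes = pack_can(manifest, "training:\n  lr: 2e-4\n", data_ref)

# Re-open the archive from memory and list what it contains.
with tarfile.open(fileobj=io.BytesIO(can_bytes), mode="r:gz") as tar:
    member_names = tar.getnames()
print(member_names)
```

Because data_ref.json carries only a hash and a URI, the archive stays small no matter how large the dataset is.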

Inspect

```bash
soup can inspect chat-llama-v1.can
```

Prints the manifest and config without extracting the archive.
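Reading members without extracting them can be done with tarfile's extractfile, which returns a file-like object instead of writing to disk. The sketch below illustrates that technique and is not the actual inspect implementation; build_demo_can and inspect_can are hypothetical names, and the demo manifest is deliberately tiny.

```python
import io
import json
import tarfile


def build_demo_can() -> bytes:
    """Build a tiny in-memory archive so the example is self-contained."""
    files = {
        "manifest.json": json.dumps({"name": "chat-llama", "tag": "v1"}).encode(),
        "soup.yaml": b"training:\n  lr: 2e-4\n",
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def inspect_can(can_bytes: bytes) -> dict:
    """Read manifest.json and soup.yaml straight from the archive stream."""
    with tarfile.open(fileobj=io.BytesIO(can_bytes), mode="r:gz") as tar:
        try:
            manifest_file = tar.extractfile("manifest.json")
            config_file = tar.extractfile("soup.yaml")
        except KeyError as exc:
            # extractfile raises KeyError when the member does not exist.
            raise ValueError(f"archive is missing a required member: {exc}") from exc
        if manifest_file is None or config_file is None:
            # extractfile returns None for members that are not regular files.
            raise ValueError("required members must be regular files")
        return {
            "manifest": json.loads(manifest_file.read()),
            "config_text": config_file.read().decode("utf-8"),
        }


summary = inspect_can(build_demo_can())
print(json.dumps(summary["manifest"], indent=2))
print(summary["config_text"])
```

Nothing is written to the filesystem here, so the path-related risks of extraction do not apply to this read-only path.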

Verify

```bash
soup can verify chat-llama-v1.can
```

Checks schema version, manifest integrity, and whether the embedded soup.yaml still parses against the current Pydantic schema.
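The shape of these checks can be sketched with plain Python. This is a stand-in, not the real verifier: the required manifest keys are assumed from the manifest.json description above, the supported version of 1 is taken from the Safety section, and the single training.lr rule substitutes for the full Pydantic schema so that the example runs without third-party packages.

```python
SUPPORTED_SCHEMA_VERSION = 1
# Assumed key spellings; the real manifest may name these differently.
REQUIRED_MANIFEST_KEYS = {"name", "tag", "parent", "created_at", "schema_version"}


def manifest_problems(manifest: dict) -> list:
    """Return human-readable problems found in a manifest (empty if none)."""
    problems = []
    missing = REQUIRED_MANIFEST_KEYS - set(manifest)
    if missing:
        problems.append("missing manifest keys: " + ", ".join(sorted(missing)))
    version = manifest.get("schema_version")
    if version != SUPPORTED_SCHEMA_VERSION:
        problems.append(f"unsupported schema version: {version!r}")
    return problems


def config_problems(config: dict) -> list:
    """Stand-in for schema validation: training.lr must be a positive number."""
    training = config.get("training")
    if not isinstance(training, dict):
        return ["training must be a mapping"]
    lr = training.get("lr")
    if isinstance(lr, bool) or not isinstance(lr, (int, float)) or lr <= 0:
        return [f"training.lr must be a positive number, got {lr!r}"]
    return []


good_manifest = {
    "name": "chat-llama",
    "tag": "v1",
    "parent": None,
    "created_at": "2025-01-01T00:00:00Z",  # hypothetical timestamp
    "schema_version": 1,
}
print(manifest_problems(good_manifest))
print(manifest_problems({"name": "x", "schema_version": 2}))
print(config_problems({"training": {"lr": "fast"}}))
```

Collecting problems into a list rather than raising on the first one lets a verifier report everything wrong with an archive in a single pass.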

Fork and modify

```bash
soup can fork chat-llama-v1.can \
  --out chat-llama-v2.can \
  --modify training.lr=3e-4 \
  --modify training.lora.r=32
```

Fork re-packs the archive with your overrides applied and writes a fresh manifest: it gets a new id, and its parent is set to the original.
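One way dotted overrides such as training.lora.r=32 could be applied to a nested config is sketched below. The function names, the value-parsing rule (int, then float, then string), and the id field on the manifest are assumptions for illustration; Soup's actual parsing and manifest layout may differ.

```python
import copy
import uuid


def parse_value(raw: str):
    """Interpret a raw override value as int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_overrides(config: dict, overrides: list) -> dict:
    """Return a copy of config with each 'a.b.c=value' override applied."""
    result = copy.deepcopy(config)  # never mutate the parent's config
    for expr in overrides:
        dotted_key, sep, raw = expr.partition("=")
        if not sep or not dotted_key:
            raise ValueError(f"expected key=value, got {expr!r}")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise TypeError(f"{part!r} is not a mapping in {dotted_key!r}")
        node[leaf] = parse_value(raw)
    return result


def fork_manifest(parent_manifest: dict) -> dict:
    """Copy the manifest, assign a new id, and point parent at the original."""
    child = dict(parent_manifest)
    child["id"] = uuid.uuid4().hex
    child["parent"] = parent_manifest["id"]
    return child


base = {"training": {"lr": 2e-4, "lora": {"r": 16}}}
forked = apply_overrides(base, ["training.lr=3e-4", "training.lora.r=32"])
child = fork_manifest({"id": "abc", "parent": None})
print(forked)
print(child["parent"])
```

Working on a deep copy keeps the parent's config intact, which matters if the original archive is verified again later.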

Safety

Tar extraction is a classic attack surface. Soup Cans defend against it with these checks:

  • Rejects absolute paths and .. traversal
  • Rejects symlinks that point outside the extraction root
  • Caps archive size at 100 MB
  • Locks the format version to 1 and refuses mismatched archives
  • Rejects dunder keys (__class__, __import__, etc.) and null bytes in fork --modify
  • Checks path containment with a shared os.path.realpath + os.path.commonpath routine, so Windows short-name / junction tricks don't escape
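The checks in this list can be approximated in a few small functions. The sketch below is illustrative only and is not Soup's code: the function names and returned messages are hypothetical, the 100 MB cap is read as 100 * 1024 * 1024 bytes, and the containment test follows the realpath + commonpath approach named above.

```python
import os
import tarfile
import tempfile

MAX_ARCHIVE_BYTES = 100 * 1024 * 1024  # assumed reading of the 100 MB cap


def is_contained(root: str, candidate: str) -> bool:
    """True if candidate resolves to a location at or below root."""
    root_real = os.path.realpath(root)
    cand_real = os.path.realpath(candidate)
    try:
        return os.path.commonpath([root_real, cand_real]) == root_real
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def member_problem(member: tarfile.TarInfo, root: str):
    """Return a reason to reject the member, or None if it looks safe."""
    name = member.name
    if name.startswith(("/", "\\")) or os.path.isabs(name):
        return "absolute path"
    if ".." in name.replace("\\", "/").split("/"):
        return "parent traversal"
    if member.issym():
        # A symlink target is resolved relative to the link's own directory.
        target = os.path.join(root, os.path.dirname(name), member.linkname)
        if not is_contained(root, target):
            return "symlink escapes extraction root"
    return None


def override_key_problem(key: str):
    """Return a reason to reject a --modify key, or None if it is acceptable."""
    if "\x00" in key:
        return "null byte"
    for part in key.split("."):
        if part.startswith("__") and part.endswith("__"):
            return "dunder key"
    return None


def archive_too_large(size_bytes: int) -> bool:
    """True if the archive exceeds the size cap."""
    return size_bytes > MAX_ARCHIVE_BYTES


root = tempfile.mkdtemp()
escaping_link = tarfile.TarInfo("link")
escaping_link.type = tarfile.SYMTYPE
escaping_link.linkname = "../../outside"

print(member_problem(tarfile.TarInfo("soup.yaml"), root))
print(member_problem(tarfile.TarInfo("../evil"), root))
print(member_problem(escaping_link, root))
print(override_key_problem("training.__class__.x"))
```

Resolving both paths with realpath before comparing them is what lets the containment test see through symlinks and similar aliases, rather than trusting the textual form of the path.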

See also

  • [Model registry](/docs/registry) — the source of truth that Cans export from
  • [Configuration](/docs/configuration) — what lives inside a .can