Governance & Provenance (v0.59.0)

Procurement-ready ML compliance from a single CLI. v0.59 ships four governance surfaces that previously required a stack of SaaS tools and a security team: a CycloneDX/SPDX BOM emitter, in-toto / SLSA-3 attestations, a HIPAA / SOC2 audit log, and EU AI Act Annex XI/XII auto-documentation.

`soup bom emit` — ML Bill of Materials

Generates machine-learning Bills of Materials in CycloneDX 1.6 (with the ML-BOM extension) or SPDX 2.3 AI-profile format, or both in a single invocation.

```bash
soup bom emit \
  --name llama3-8b-finetuned --version 1.0.0 \
  --base-model meta-llama/Llama-3.1-8B-Instruct \
  --base-sha abc123...def456 \
  --config-sha 789def...012abc \
  --data-sha 456ghi...789jkl \
  --task sft --license apache-2.0 \
  --format both --output ./manifests/llama3-bom
```
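
For orientation, the sketch below builds a minimal CycloneDX 1.6 document by hand from the same kind of inputs the command takes. It is not soup's emitter: the helper name `build_cyclonedx_mlbom` and the choice of fields are assumptions, and the real output likely carries more metadata. It relies only on the `machine-learning-model` component type and the `hashes` array from the CycloneDX schema.

```python
import hashlib
import json


def build_cyclonedx_mlbom(name, version, base_model, base_sha256, license_id):
    """Return a minimal CycloneDX 1.6 BOM dict for a fine-tuned model (illustrative)."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",
        "version": 1,
        # The fine-tuned model is the subject of the BOM.
        "metadata": {
            "component": {
                "type": "machine-learning-model",
                "name": name,
                "version": version,
                # CycloneDX expects the canonical SPDX identifier here.
                "licenses": [{"license": {"id": license_id}}],
            }
        },
        # The base model is listed as a component, pinned by its digest.
        "components": [
            {
                "type": "machine-learning-model",
                "name": base_model,
                "hashes": [{"alg": "SHA-256", "content": base_sha256}],
            }
        ],
    }


# Stand-in digest for the demo; a real run would hash the base model weights.
demo_sha = hashlib.sha256(b"base-model-weights").hexdigest()
bom = build_cyclonedx_mlbom(
    "llama3-8b-finetuned",
    "1.0.0",
    "meta-llama/Llama-3.1-8B-Instruct",
    demo_sha,
    "Apache-2.0",
)
print(json.dumps(bom, indent=2))
```

Pinning the base model by digest rather than by name is what lets a downstream consumer check that the BOM describes the weights they actually received.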

Files are written atomically (tempfile.mkstemp + os.replace), and symlinked output paths are rejected as a TOCTOU defense. --format both produces <prefix>.cdx.json and <prefix>.spdx.json side by side.
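
The write pattern named above can be sketched in a few lines of Python. This is a minimal illustration of the mkstemp + os.replace technique with a symlink check, not soup's actual code; the function name `atomic_write_json` and the demo payloads are hypothetical.

```python
import json
import os
import tempfile


def atomic_write_json(path, payload):
    """Write payload to path atomically, refusing to write through a symlink."""
    # abspath normalizes the path without resolving symlinks.
    target = os.path.abspath(path)
    if os.path.islink(target):
        raise ValueError(f"refusing to write through symlink: {target}")
    # Create the temp file in the destination directory so the final
    # rename stays on one filesystem and is therefore atomic.
    fd, tmp_path = tempfile.mkstemp(
        dir=os.path.dirname(target), prefix=".tmp-", suffix=".json"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the directory entry itself; it does not follow
        # a symlink that appears at the destination after the check above.
        os.replace(tmp_path, target)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


# Demo: emit both files under one prefix, as with --format both.
with tempfile.TemporaryDirectory() as workdir:
    prefix = os.path.join(workdir, "llama3-bom")
    atomic_write_json(prefix + ".cdx.json", {"bomFormat": "CycloneDX"})
    atomic_write_json(prefix + ".spdx.json", {"spdxVersion": "SPDX-2.3"})
    written = sorted(os.listdir(workdir))
print(written)  # ['llama3-bom.cdx.json', 'llama3-bom.spdx.json']
```

A reader of the output path sees either the old file or the complete new one, never a partial write, because the content only becomes visible at the rename.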

`soup attest emit` — SLSA-3 in-toto attestations

Emits per-stage attestations aligned with SLSA-3 (Supply-chain Levels for Software Artifacts, Level 3) and in-toto.

```bash
soup attest emit --stage train \
  --subject adapter.safetensors --sha abc123...xyz789 \
  --builder soup-cli \
  --invocation "soup train --config soup.yaml" \
  --sign unsigned --output ./attestations/train.json
```

Stages: extract / train / eval / export / publish. Backends: unsigned (the v0.59.0 default; the subject stays tamper-detectable via its SHA-256 digest). The ed25519 and sigstore backends ship in v0.59.1.
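
To make the unsigned backend concrete, here is a rough sketch of an in-toto Statement v1 carrying a SLSA provenance predicate, plus the digest check a verifier could run. The `_type` and `predicateType` URIs come from the in-toto and SLSA specifications; everything else (function names, the `buildType` URI, the predicate contents) is assumed for illustration and may not match the file soup writes.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_statement(subject_path, stage, builder_id, invocation):
    """Build an unsigned in-toto Statement v1 for one pipeline stage."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {
                "name": os.path.basename(subject_path),
                "digest": {"sha256": sha256_file(subject_path)},
            }
        ],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                # Hypothetical buildType URI, used only for this sketch.
                "buildType": "https://example.com/soup/stage/" + stage,
                "externalParameters": {"invocation": invocation},
            },
            "runDetails": {"builder": {"id": builder_id}},
        },
    }


def subject_matches(statement, path):
    """Recompute the artifact digest and compare it with the statement."""
    return sha256_file(path) == statement["subject"][0]["digest"]["sha256"]


with tempfile.TemporaryDirectory() as workdir:
    artifact = os.path.join(workdir, "adapter.safetensors")
    with open(artifact, "wb") as handle:
        handle.write(b"demo weights")
    statement = build_statement(
        artifact, "train", "soup-cli", "soup train --config soup.yaml"
    )
    ok_before = subject_matches(statement, artifact)
    with open(artifact, "ab") as handle:
        handle.write(b"tampered")
    ok_after = subject_matches(statement, artifact)
print(ok_before, ok_after)  # True False
```

The digest check catches a modified artifact, but an unsigned statement cannot prove who produced it; that is the gap the signing backends are meant to close.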

`soup audit-log` — HIPAA/SOC2 audit trail

Every command execution writes a record with the timestamp, command line, exit code, operator identity, and host to ~/.soup/audit.jsonl (or the path in $SOUP_AUDIT_LOG_PATH). PII fields are redacted before the write.
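
As a rough picture of what one record might look like, the sketch below builds a JSONL entry with the five fields listed above and redacts before writing. The field names, the helper names, and the choice of email addresses as the only redacted pattern are all assumptions; the source does not say which fields soup treats as PII.

```python
import getpass
import json
import os
import re
import socket
import tempfile
from datetime import datetime, timezone

# Assumption: email addresses stand in for "PII" in this sketch.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text):
    """Replace anything that looks like an email address."""
    return EMAIL_RE.sub("[REDACTED]", text)


def audit_log_path():
    """Honor $SOUP_AUDIT_LOG_PATH, else fall back to ~/.soup/audit.jsonl."""
    override = os.environ.get("SOUP_AUDIT_LOG_PATH")
    return override or os.path.expanduser("~/.soup/audit.jsonl")


def current_operator():
    """Best-effort user name; some containers have no passwd entry."""
    try:
        return getpass.getuser()
    except (KeyError, OSError):
        return "unknown"


def build_record(argv, exit_code):
    """Assemble one audit record, redacting the command line first."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "command": redact(" ".join(argv)),
        "exit_code": exit_code,
        "operator": current_operator(),
        "host": socket.gethostname(),
    }


def append_record(record, path=None):
    """Append one JSON object per line to the audit log."""
    path = path or audit_log_path()
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


record = build_record(["soup", "train", "--config", "soup.yaml"], exit_code=0)
with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "audit.jsonl")
    append_record(record, log_path)
    with open(log_path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
print(lines[0])
```

Redacting before serialization, rather than when the log is read, means the sensitive value never reaches disk.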

```bash
# Tail the most recent 100 records (rich table)
soup audit-log tail --limit 100

# Raw JSONL for piping
soup audit-log tail --limit 50 --json

# Rotate at a 500 MB cap
soup audit-log rotate --cap-mb 500
```
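
The tail and rotate behavior can be approximated in plain Python when the JSONL file needs to be processed outside the CLI. This is only a sketch under assumptions: it treats 1 MB as 1024 × 1024 bytes and rotates by renaming to `<path>.1`, neither of which the source specifies, and both function names are hypothetical.

```python
import json
import os
import tempfile
from collections import deque


def tail_records(path, limit):
    """Return the last `limit` JSON records from a JSONL file."""
    with open(path, "r", encoding="utf-8") as handle:
        # A bounded deque keeps only the newest `limit` non-empty lines.
        lines = deque((line for line in handle if line.strip()), maxlen=limit)
    return [json.loads(line) for line in lines]


def rotate_if_over_cap(path, cap_mb):
    """Rename the log to <path>.1 once it exceeds cap_mb; True if rotated."""
    cap_bytes = cap_mb * 1024 * 1024
    if not os.path.exists(path) or os.path.getsize(path) <= cap_bytes:
        return False
    os.replace(path, path + ".1")
    return True


with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "audit.jsonl")
    with open(log_path, "w", encoding="utf-8") as handle:
        for code in range(5):
            handle.write(json.dumps({"exit_code": code}) + "\n")
    last_two = tail_records(log_path, limit=2)
    kept = rotate_if_over_cap(log_path, cap_mb=500)   # tiny file, stays put
    rotated = rotate_if_over_cap(log_path, cap_mb=0)  # any content exceeds 0
    rotated_exists = os.path.exists(log_path + ".1")
print(last_two, kept, rotated, rotated_exists)
```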

EU AI Act Annex XI/XII

soup train ships two new flags that emit the documentation required by the EU AI Act:

```bash
soup train --config soup.yaml \
  --annex-xi ./docs/annex-xi.md \
  --repro-receipt ./receipts/repro.json
```

The reproducibility receipt captures every seed, kernel version, library version, and dataset hash needed to reproduce the run under the Federal Reserve's SR 11-7 model-risk-management guidance.
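
A receipt of this kind can be approximated with the standard library alone. The sketch below is an assumption-heavy illustration: the JSON keys and function names are invented, and it reads "kernel version" as the operating-system kernel release, which may not be what soup records.

```python
import hashlib
import json
import platform
from importlib import metadata


def library_versions(names):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_receipt(seeds, dataset_bytes, libraries):
    """Collect the facts needed to re-run a training job (illustrative)."""
    return {
        "seeds": seeds,
        "python": platform.python_version(),
        # Assumption: "kernel version" means the OS kernel release.
        "kernel": platform.release(),
        "platform": platform.platform(),
        "libraries": library_versions(libraries),
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
    }


receipt = build_receipt(
    seeds={"global": 42},
    dataset_bytes=b"example dataset contents",
    libraries=["pip", "definitely-not-installed-xyz"],
)
print(json.dumps(receipt, indent=2, sort_keys=True))
```

Recording a missing library as `None` instead of skipping it keeps the receipt honest about what was absent at run time.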

CO₂ energy tracking schema

soup.yaml accepts an optional co2 block tying training energy to electricityMap intensity data so the BOM and Annex XI doc carry a real-time gCO₂eq number. The estimator backend lands in v0.59.1.
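
The underlying arithmetic is simple: emissions in gCO₂eq equal the energy used in kWh multiplied by the grid's carbon intensity in gCO₂eq/kWh. The sketch below works one example with made-up numbers; the function names are hypothetical and it does not reflect the co2 block's schema or the estimator backend.

```python
def energy_kwh(avg_power_watts, duration_hours):
    """Convert average power draw and run time into kilowatt-hours."""
    return avg_power_watts * duration_hours / 1000.0


def emissions_gco2eq(kwh, intensity_g_per_kwh):
    """Multiply energy by grid carbon intensity (gCO2eq per kWh)."""
    return kwh * intensity_g_per_kwh


# Illustrative inputs only: a 400 W average draw over a 10-hour run
# on a grid at 250 gCO2eq/kWh.
kwh = energy_kwh(avg_power_watts=400.0, duration_hours=10.0)
grams = emissions_gco2eq(kwh, intensity_g_per_kwh=250.0)
print(kwh, grams)  # 4.0 1000.0
```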

Numbers

+93 new tests in v0.59.0 (9193 → 9286).

See also

  • [Supply-chain security](/docs/supply-chain-security) — v0.60 LoRA backdoor scanner, Merkle signing, air-gap bundles.
  • [Registry](/docs/registry) — every BOM and attestation can be attached as an artifact.
  • [Pre-flight & tooling (v0.64)](/docs/preflight-tooling) — soup license-advisor --target b2c|defense|embedded returns ok/warn/block per (license, deploy-target, MAU) and composes with the v0.59 license-matrix on soup adapters merge.
  • [Adapter lifecycle (v0.67)](/docs/adapter-lifecycle) — soup lock SHA256(base || dataset || env) closure makes governance artifacts reproducible across teams.