Pre-flight & Tooling (v0.64.0)
Six surfaces that catch mistakes before you spend a GPU hour: pick the right base, lock the run plan, freeze the environment, install completions, advise on licenses, and predict peak VRAM.
`soup tunability` — Pareto frontier of base-model efficiency

    soup tunability --dataset ./chats.jsonl --candidates llama-3.1-8b qwen2.5-7b gemma-3-9b \
      --probe-steps 100 --holdout-size 64 --output ./tunability.json

Probes a held-out dataset slice against each candidate base with a lightweight LoRA, measures training-loss deltas, and reports which bases form the Pareto frontier (best efficiency for cost). `--plan-only` dry-runs without probing; `--list` shows all bundled candidates.

- Candidate allowlist with licensing metadata (Apache-2.0 / MIT / LLaMA-3 / etc.)
- Bounds: `probe_steps` ∈ [10, 10000], `holdout_size` ∈ [10, 100000] rows
- Safety: path containment, null-byte rejection, symlink-escape rejection
- Output: per-candidate delta, wall-clock seconds, estimated USD cost, Pareto membership
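
To make "Pareto membership" concrete, here is a minimal sketch of how a frontier could be derived from per-candidate results. It is not the tool's implementation: the `ProbeResult` type, the `pareto_frontier` helper, the candidate names, and the numbers are all hypothetical, and it assumes a larger loss delta is better while a lower cost is better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Hypothetical record of one probe run (not the tool's real schema)."""
    name: str          # candidate base model
    loss_delta: float  # assumed: training-loss improvement, higher is better
    cost_usd: float    # estimated cost, lower is better


def pareto_frontier(results: list[ProbeResult]) -> list[ProbeResult]:
    """Return the candidates that no other candidate dominates."""

    def dominates(a: ProbeResult, b: ProbeResult) -> bool:
        # a dominates b when it is at least as good on both axes
        # and strictly better on at least one of them.
        return (
            a.loss_delta >= b.loss_delta
            and a.cost_usd <= b.cost_usd
            and (a.loss_delta > b.loss_delta or a.cost_usd < b.cost_usd)
        )

    return [r for r in results if not any(dominates(o, r) for o in results)]


# Made-up numbers, for illustration only.
results = [
    ProbeResult("base-a", loss_delta=0.42, cost_usd=1.10),
    ProbeResult("base-b", loss_delta=0.35, cost_usd=0.80),
    ProbeResult("base-c", loss_delta=0.30, cost_usd=1.50),
]
frontier_names = [r.name for r in pareto_frontier(results)]
print(frontier_names)
```

In this toy data `base-c` drops out because `base-a` improves more for less money, while `base-a` and `base-b` both stay: neither beats the other on both axes at once.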
`soup plan` / `soup apply` — Terraform-shaped drift detection

    soup plan --config soup.yaml --state ./soup.tfstate
    soup apply --config soup.yaml --state ./soup.tfstate

`plan` computes cost / ETA / peak-VRAM / SHA-256 hashes from the config and writes an immutable `soup.tfstate`. `apply` re-reads the config, detects any drift (batch size, dataset SHA, base SHA), and refuses to proceed (exit 3) until you re-plan.

- Pure JSON state: `plan{cost, eta, sha}`, `applied: bool`, `applied_at`, `run_id`
- TOCTOU defense: `os.lstat` before open, symlinks rejected
- Peak VRAM with 10% safety margin; spot pricing per GPU tier
- Composes with v0.67 `soup lock` for full reproducibility
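
The drift check can be pictured as a hash comparison between the planned config and the config seen at apply time. The sketch below is an illustration under assumptions, not the tool's code: the helper names, the config keys, and the canonical-JSON hashing scheme are hypothetical, and only the `plan.sha` field and the `os.lstat` symlink check come from the description above.

```python
import hashlib
import json
import os
import stat
import tempfile


def config_sha256(config: dict) -> str:
    """Hash a config deterministically: sorted keys, compact separators."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def read_state(path: str) -> dict:
    """Load the state file, refusing symlinks (lstat does not follow them)."""
    if stat.S_ISLNK(os.lstat(path).st_mode):
        raise ValueError(f"refusing to read symlink: {path}")
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def detect_drift(config: dict, state: dict) -> bool:
    """True when the current config no longer matches the planned hash."""
    return config_sha256(config) != state["plan"]["sha"]


# Hypothetical config keys and values, for illustration only.
planned = {"base": "llama-3.1-8b", "batch_size": 8, "dataset_sha": "abc123"}
state = {"plan": {"sha": config_sha256(planned)}, "applied": False}

with tempfile.TemporaryDirectory() as tmp:
    state_path = os.path.join(tmp, "soup.tfstate")
    with open(state_path, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    loaded = read_state(state_path)

unchanged = detect_drift(planned, loaded)                       # same config
edited = detect_drift({**planned, "batch_size": 16}, loaded)    # batch size changed
print(unchanged, edited)
```

Hashing a canonical serialization means key order and whitespace in the config file do not register as drift, while any change to a value does.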
`soup env lock` / `status` / `check` — hermetic environment

    soup env lock --output ./soup-env.lock
    soup env check --lock ./soup-env.lock   # exit 3 on ABI drift

Snapshots Python version, CUDA major version, platform, and every installed package into a JSON lockfile. `check` detects ABI-sensitive drift (e.g. CUDA 12 → 13) that would silently break training.

- Fields: `soup_version`, `python_version`, `platform`, `cuda_version`, `packages{name, version, source}`
- Atomic write, file-size capped
- Feeds the `env_hash` half of v0.67 `soup.lock` closure
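
As a rough picture of what a snapshot and an ABI check involve, the sketch below collects a subset of the fields with the standard library and compares two snapshots. The function names are hypothetical, the CUDA version is passed in by hand rather than detected, and the drift rule (CUDA major version or Python major.minor changed) is an assumption about what "ABI-sensitive" covers.

```python
import platform
from importlib import metadata
from typing import Optional


def snapshot_env(cuda_version: Optional[str]) -> dict:
    """Collect a subset of the lockfile fields (package `source` is omitted)."""
    packages = sorted(
        (
            {"name": dist.metadata.get("Name"), "version": dist.metadata.get("Version", "")}
            for dist in metadata.distributions()
            if dist.metadata.get("Name")
        ),
        key=lambda pkg: pkg["name"].lower(),
    )
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "cuda_version": cuda_version,
        "packages": packages,
    }


def _major(version: Optional[str]) -> Optional[str]:
    return version.split(".")[0] if version else None


def abi_drift(lock: dict, current: dict) -> list[str]:
    """List drift that this sketch treats as ABI-sensitive (an assumption)."""
    problems = []
    if _major(lock["cuda_version"]) != _major(current["cuda_version"]):
        problems.append(f"cuda {lock['cuda_version']} -> {current['cuda_version']}")
    lock_py = lock["python_version"].split(".")[:2]
    current_py = current["python_version"].split(".")[:2]
    if lock_py != current_py:
        problems.append(f"python {lock['python_version']} -> {current['python_version']}")
    return problems


lock = snapshot_env("12.4")
same_major = {**lock, "cuda_version": "12.6"}   # minor bump only
new_major = {**lock, "cuda_version": "13.0"}    # major bump
print(abi_drift(lock, same_major), abi_drift(lock, new_major))
```

A caller would map a non-empty result to a non-zero exit status, in the way `check` is described as exiting with 3.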
`soup completions <shell>` — bash / zsh / fish

    soup completions bash | sudo tee /etc/bash_completion.d/soup
    soup completions zsh > ~/.zsh/completions/_soup
    soup completions fish > ~/.config/fish/completions/soup.fish

Eval-safe shell completion scripts emitted to stdout (no Rich panels). Closed shell allowlist; all error messages go to stderr.

`soup license-advisor` — deploy-target risk gate

    soup license-advisor --target b2c --license llama-3 --monthly-active-users 750000

Returns `ok` / `warn` / `block` (exit 3 on `block`) for a (license, deploy-target, MAU) tuple. Targets: `b2c`, `defense`, `embedded`, each with distinct rules. Composes with v0.60 license-matrix on `soup adapters merge`.

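
The three-way verdict and its exit code can be sketched as a small rule lookup. Everything in the rule table below is invented for illustration: the license id `example-community`, the MAU thresholds, and the choice to warn on unknown combinations are hypothetical and say nothing about any real license's terms. Only the `ok` / `warn` / `block` verdicts and "exit 3 on block" come from the description above; exit 0 for the other two verdicts is an assumption.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    WARN = "warn"
    BLOCK = "block"


# Hypothetical rule table: (license, target) -> MAU thresholds.
RULES = {
    ("example-community", "b2c"): {"warn_at": 500_000, "block_at": 1_000_000},
    ("example-community", "defense"): {"warn_at": 0, "block_at": 0},
}


def advise(license_id: str, target: str, mau: int) -> Verdict:
    """Return a verdict for a (license, deploy-target, MAU) tuple."""
    rule = RULES.get((license_id, target))
    if rule is None:
        # Assumed policy: an unknown combination is flagged, not waved through.
        return Verdict.WARN
    if mau >= rule["block_at"]:
        return Verdict.BLOCK
    if mau >= rule["warn_at"]:
        return Verdict.WARN
    return Verdict.OK


def exit_code(verdict: Verdict) -> int:
    """Exit 3 on block; 0 otherwise (the 0 is an assumption)."""
    return 3 if verdict is Verdict.BLOCK else 0


b2c = advise("example-community", "b2c", 750_000)
defense = advise("example-community", "defense", 10)
print(b2c.value, exit_code(b2c), defense.value, exit_code(defense))
```

Keeping the verdict separate from the exit code lets a CI job fail hard on `block` while still surfacing `warn` in its logs.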
Hardware-fit calculator — analytical peak VRAM

    from soup_cli.utils.hardware_fit import estimate_peak_vram_gb, decide_hardware_fit

    # `input` is the run description, built elsewhere (not shown here).
    report = decide_hardware_fit(input, available_vram_gb=24.0)
    # report.predicted_peak_gb -> 18.2
    # report.breakdown -> {weights: 4.1, optimizer: 8.2, gradients: 4.1, activations: 1.3, overhead: 0.5}

Static predictor with a 5-bucket breakdown (weights / optimizer / gradients / activations / overhead) and a 10% safety margin. Refuses to run if the predicted peak exceeds available VRAM.

- 9 quant tiers (`none`, `4bit`, `8bit`, `fp8`, `gptq`, `awq`, `aqlm`, `eetq`, `mxfp4`)
- 4 PEFT modes (`full`, `lora`, `dora`, `qlora`)
- Bounds: `seq_len` ∈ [64, 1M], `batch` ∈ [1, 1024], `params` ∈ (0, 1000B]
- Activation memory halved under gradient checkpointing
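
For intuition about the five buckets, here is a back-of-the-envelope sketch of an analytical estimate. It is not `soup_cli`'s formula: `FitReport`, `sketch_fit`, the bytes-per-parameter table (three of the nine tiers only), the 2-byte gradients, the 8-byte Adam-style optimizer state, and the caller-supplied activation figure are all assumptions. Only the five bucket names, the halving under gradient checkpointing, and the 10% margin come from the description above.

```python
from dataclasses import dataclass

# Assumed bytes per parameter for stored weights (illustrative subset of tiers).
BYTES_PER_PARAM = {"none": 2.0, "8bit": 1.0, "4bit": 0.5}


@dataclass
class FitReport:
    """Hypothetical report type for this sketch."""
    breakdown: dict
    predicted_peak_gb: float
    fits: bool


def sketch_fit(
    params_b: float,            # model size in billions of parameters
    quant: str,                 # key into BYTES_PER_PARAM
    trainable_fraction: float,  # 1.0 for full fine-tuning, small for LoRA-style
    activations_gb: float,      # activation memory without checkpointing
    grad_checkpointing: bool,
    available_vram_gb: float,
    overhead_gb: float = 0.5,
    margin: float = 0.10,
) -> FitReport:
    trainable_b = params_b * trainable_fraction
    # Billions of parameters times bytes per parameter gives (decimal) GB.
    breakdown = {
        "weights": params_b * BYTES_PER_PARAM[quant],
        "optimizer": trainable_b * 8.0,   # assumed: two fp32 moments per trainable param
        "gradients": trainable_b * 2.0,   # assumed: 16-bit gradients
        "activations": activations_gb * (0.5 if grad_checkpointing else 1.0),
        "overhead": overhead_gb,
    }
    peak = sum(breakdown.values()) * (1.0 + margin)
    return FitReport(breakdown, peak, peak <= available_vram_gb)


# A 7B model in 4-bit with 1% trainable parameters vs. full fine-tuning in 16-bit.
adapter_run = sketch_fit(7.0, "4bit", 0.01, 4.0, True, available_vram_gb=24.0)
full_run = sketch_fit(7.0, "none", 1.0, 4.0, False, available_vram_gb=24.0)
print(adapter_run.fits, full_run.fits)
```

The contrast between the two runs shows why the optimizer and gradient buckets dominate full fine-tuning and nearly vanish when only a small fraction of parameters is trainable.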
Numbers
Six surfaces, +N tests on top of v0.63's 10,035. Composes downstream with v0.65 eval depth, v0.66 post-train x-rays, and v0.67 adapter lifecycle.
See also
- [Soup lock](/docs/adapter-lifecycle) — v0.67 closes the env_hash → soup.lock reproducibility chain.
- [Governance](/docs/governance) — v0.59 BOM + SLSA-3 + audit log layer on top of plan/apply.