Adapter Lifecycle Finish (v0.67.0)

Six surfaces that complete what v0.57 soup adapters started. Adapters are now first-class artifacts: versioned, collaborative, multi-tenant, evolvable, lockfile-tracked, and bisect-able. Hosted vendors offer none of these: Sakana-style evolutionary merge exists only as a research demo, VeRA storage hurts hosted unit economics (vendors price by GPU-hour, not adapter count), MoLE routing needs both the training and serving stacks, and adapter PRs need weights, eval results, and history together.

CMA-ES merge — evolutionary search over LoRA weights

```bash
soup adapters merge --strategy cmaes \
  --adapter ./lora_a --adapter ./lora_b --adapter ./lora_c \
  --eval ./suite.yaml --budget 1h --output ./merged
```

Sakana-style evolutionary merge. Pure-Python rank-mu CMA-ES (no cma dependency). Softmaxes N-1 logits onto the simplex, samples a population, keeps the elite half, plateau-detects after 3 generations without improvement (converged=True).

  • 2..16 adapters; population [2, 256]; generations [1, 10K]
  • Budget [60s, 24h] — reuses v0.57 blame.parse_budget
  • Eval-fn failures swallowed with sentinel -1e9 score (one broken eval ≠ crashed run)
  • Live eval-suite auto-wiring lands in v0.67.1; v0.67.0 prints the validated plan
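The loop described above can be sketched in a few lines. This is a deliberately simplified evolution strategy (fixed step size, mean recomputed from the elite half), not the rank-mu CMA-ES update that Soup ships. The function names, the pinned-zero logit used to turn N-1 free logits into N weights, and the default parameters are all assumptions made for illustration.

```python
import math
import random

SENTINEL = -1e9  # score assigned when the eval function raises


def logits_to_weights(free_logits):
    """Map N-1 free logits to N simplex weights (assumed: last logit pinned to 0)."""
    logits = list(free_logits) + [0.0]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def safe_eval(eval_fn, weights):
    """Swallow eval failures so one broken eval does not crash the run."""
    try:
        return float(eval_fn(weights))
    except Exception:
        return SENTINEL


def evolve(eval_fn, num_adapters, population=8, generations=50,
           patience=3, sigma=0.5, seed=0):
    """Return (best_weights, best_score, converged)."""
    rng = random.Random(seed)
    mean = [0.0] * (num_adapters - 1)
    best_score = -math.inf
    best_weights = logits_to_weights(mean)
    stale = 0
    for _ in range(generations):
        # Sample a population of logit vectors around the current mean.
        samples = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                   for _ in range(population)]
        scored = sorted(
            ((safe_eval(eval_fn, logits_to_weights(s)), s) for s in samples),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Keep the elite half and move the mean to its centroid.
        elite = [s for _, s in scored[: max(1, population // 2)]]
        mean = [sum(col) / len(elite) for col in zip(*elite)]
        top_score, top_logits = scored[0]
        if top_score > best_score:
            best_score, best_weights, stale = top_score, logits_to_weights(top_logits), 0
        else:
            stale += 1
            if stale >= patience:  # plateau: no improvement for `patience` generations
                return best_weights, best_score, True
    return best_weights, best_score, False
```

With an eval function that always raises, every candidate scores the sentinel, no generation improves on the first, and the sketch stops on the plateau rule instead of crashing.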

VeRA / VB-LoRA vector bank — multi-tenant adapter economics

```python
from soup_cli.utils.vector_bank import VectorBank, write_bank, estimate_bank_size

bank = VectorBank(
    name="customer-personalisation",
    base_model="meta-llama/Llama-3-8B",
    entries={"user_1": (0.31, -0.07, ...), "user_2": (...)},
)
write_bank(bank, "./bank.json")
# 128-D scaling vector at fp32 ≈ 512 bytes / user
# vs. ~30 MB per rank-16 LoRA on Llama-3-8B
```

Shared random projection P (d_model × d_model) + per-user scaling vector v_u. Thousands of per-user adapters at roughly 512 bytes each (128-D at fp32) instead of ~30 MB per rank-16 LoRA. Atomic JSON I/O + cwd containment + symlink rejection + 16 MiB cap.
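The I/O guarantees listed above (atomic write, containment, symlink rejection, size cap) follow a common pattern that can be sketched with the standard library. This is not the code behind write_bank; the function name `write_json_atomic`, its `root` and `max_bytes` parameters, and the error messages are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path

MAX_BANK_BYTES = 16 * 1024 * 1024  # 16 MiB cap, as stated in the text


def write_json_atomic(payload, dest, root=None, max_bytes=MAX_BANK_BYTES):
    """Write JSON under `root` via a temp file plus os.replace (hypothetical helper)."""
    root = Path(root if root is not None else Path.cwd()).resolve()
    candidate = Path(dest)
    if not candidate.is_absolute():
        candidate = root / candidate
    # Symlink rejection: never write through a link at the destination.
    if candidate.is_symlink():
        raise ValueError("refusing to write through a symlink")
    target = candidate.resolve()
    # Containment: the resolved path must stay inside the root directory.
    try:
        target.relative_to(root)
    except ValueError:
        raise ValueError("destination escapes the working directory") from None
    data = json.dumps(payload).encode("utf-8")
    if len(data) > max_bytes:
        raise ValueError("serialised bank exceeds the size cap")
    # Atomicity: write to a sibling temp file, then rename over the target.
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

Writing to a temporary file in the same directory and renaming it means a reader sees either the old bank or the new one, never a half-written file.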

Use estimate_bank_size(num_users, vector_dim) for sizing. Live multi-tenant serving via the v0.22 multi-adapter surface lands in v0.67.1.
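The arithmetic behind the per-user figure is easy to check by hand. The helpers below are hypothetical stand-ins, not soup_cli's estimate_bank_size (whose return value and units are not documented here). They count raw fp32 payload only, so a JSON-encoded bank will be larger.

```python
def approx_vector_bytes(vector_dim, bytes_per_value=4):
    """Raw size of one scaling vector; fp32 is 4 bytes per value."""
    return vector_dim * bytes_per_value


def approx_bank_bytes(num_users, vector_dim, bytes_per_value=4):
    """Raw payload for a whole bank, ignoring JSON and metadata overhead."""
    return num_users * approx_vector_bytes(vector_dim, bytes_per_value)


print(approx_vector_bytes(128))        # 128 * 4 = 512 bytes per user
print(approx_bank_bytes(10_000, 128))  # 10,000 * 512 = 5,120,000 bytes
```

At that rate ten thousand users fit in about 5 MB of raw vectors, which is smaller than the ~30 MB quoted above for a single rank-16 LoRA.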

MoLE — per-token gating over task LoRAs

```yaml
# soup.yaml
task: moe_lora_routing
training:
  mole:
    num_task_adapters: 8     # [2, 64]
    hidden_dim: 4096
    temperature: 1.0          # softmax sharpness
    top_k: 2
```

Mixture of LoRA Experts. A gating network routes each token's activations to the top-K task adapters via a softmax over the hidden state. The backend cross-validator rejects mlx. The live gating kernel and per-token softmax routing land in v0.67.1.
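For a single token, top-K softmax gating reduces to the sketch below. It assumes the gating network has already produced one score per task adapter (for example via a linear projection of the hidden state). The function name and signature are illustrative and do not describe the gating kernel planned for v0.67.1.

```python
import math


def top_k_gate(scores, top_k=2, temperature=1.0):
    """Return {adapter_index: weight} for the top_k adapters of one token."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 1 <= top_k <= len(scores):
        raise ValueError("top_k must be between 1 and the number of adapters")
    # Temperature-scaled softmax; subtracting the max keeps exp() stable.
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the top_k adapters and renormalise so their weights sum to 1.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    kept = sum(probs[i] for i in top)
    return {i: probs[i] / kept for i in top}


gate = top_k_gate([2.0, 0.5, 1.0, -1.0], top_k=2, temperature=1.0)
print(sorted(gate))  # indices of the two highest-scoring adapters: [0, 2]
```

Lowering the temperature sharpens the distribution, pushing more of the weight onto the highest-scoring adapter.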

`soup adapters pr` — GitHub-shaped adapter pull requests

```bash
soup adapters pr "Better politeness on EU support tickets" \
  --base-sha 9f2e... --adapter ./candidate \
  --eval ./eval_delta.json --samples ./sample_diffs.json \
  --output ./PR.md
```

PR = {base SHA, dataset diff, adapter weights, eval-delta report} rendered as review-friendly Markdown with eval-delta table + per-sample baseline/candidate diffs:

| Metric | Baseline | Candidate | Δ |
| --- | --- | --- | --- |
| judge_score | 7.4 | 8.2 | +0.8 |
| retry_rate | 12.1% | 4.6% | -7.5% |

`_md_table_escape` neutralises `\`, `|`, `\n`, `\r`, and `\t` in operator-controlled cells. JSON output is also available for the v0.68 GitHub Action. Bounds: ≤64 deltas, ≤256 samples, ≤32 KiB per sample.
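One plausible way to neutralise those characters before rendering a table row is sketched below. It is not the real _md_table_escape: the names `escape_cell` and `render_delta_table` are hypothetical, and the choice to backslash-escape `\` and `|` while collapsing line breaks and tabs to spaces is an assumption.

```python
def escape_cell(value):
    """Make one cell safe to embed in a Markdown table row (assumed policy)."""
    text = str(value)
    # Escape backslashes first so the escapes added for pipes are not doubled.
    text = text.replace("\\", "\\\\").replace("|", "\\|")
    # Line breaks and tabs would break the row layout; collapse them to spaces.
    for ch in ("\r\n", "\n", "\r", "\t"):
        text = text.replace(ch, " ")
    return text


def render_delta_table(rows):
    """Render (metric, baseline, candidate, delta) rows as a Markdown table."""
    lines = [
        "| Metric | Baseline | Candidate | Δ |",
        "| --- | --- | --- | --- |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(escape_cell(cell) for cell in row) + " |")
    return "\n".join(lines)


print(render_delta_table([
    ("judge_score", "7.4", "8.2", "+0.8"),
    ("retry_rate", "12.1%", "4.6%", "-7.5%"),
]))
```

Escaping at render time means a metric name or sample text containing a pipe or a newline cannot add columns or rows to the review table.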

`soup lock` — shared run lockfile

```bash
soup lock write --base-model meta-llama/Llama-3-8B \
  --base-sha <64hex> --dataset-sha <64hex> --env-hash <64hex> \
  --output soup.lock

soup lock show soup.lock
soup lock check --base-model ... --base-sha ... --dataset-sha ... --env-hash ...
# exit 3 on drift
```

Closure of (base_sha, dataset_sha, env_hash):

closure_sha = SHA256(base_sha || dataset_sha || env_hash)

Commit soup.lock to git so the whole team coordinates on the same reproducible run. soup_version + created_at are advisory only — legitimate operator upgrades don't trigger drift. Composes with v0.64 soup env lock (provides env_hash) and v0.64 soup plan (provides base/dataset hashes from config).
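The closure hash and the drift check can be sketched with hashlib. The formula fixes the order of the three inputs but not their encoding, so this sketch assumes the lowercase 64-character hex strings are concatenated as ASCII text. The names `closure_sha` and `has_drifted` are hypothetical, and the real lockfile format may differ.

```python
import hashlib
import re

HEX64 = re.compile(r"[0-9a-f]{64}")


def closure_sha(base_sha, dataset_sha, env_hash):
    """SHA256 over base_sha || dataset_sha || env_hash (assumed ASCII hex concat)."""
    for name, value in (("base_sha", base_sha),
                        ("dataset_sha", dataset_sha),
                        ("env_hash", env_hash)):
        if not HEX64.fullmatch(value):
            raise ValueError(f"{name} must be 64 lowercase hex characters")
    joined = (base_sha + dataset_sha + env_hash).encode("ascii")
    return hashlib.sha256(joined).hexdigest()


def has_drifted(locked_closure, base_sha, dataset_sha, env_hash):
    """True when the current inputs no longer match the locked closure."""
    return closure_sha(base_sha, dataset_sha, env_hash) != locked_closure
```

Because soup_version and created_at are not inputs to the hash, changing them cannot flip the result of the drift check.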

`soup adapters bisect` — binary search over training history

```bash
soup adapters bisect ./ckpt-0500 ./ckpt-1000 ./ckpt-1500 ./ckpt-2000 \
  --eval-command "soup eval custom --checkpoint {ckpt} --suite ./regression.yaml"
```

Binary search over ordered checkpoint history. Operator supplies a shell template with {ckpt} placeholder — Soup uses shlex.split after shlex.quote(ckpt) (argv-list mode, no `shell=True`). Probes both endpoints first (short-circuits all-OK / all-broken), then ~log₂(n) midpoint probes. Exit 3 on BROKEN_AT.
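Both halves of that description, the quoting step and the search itself, fit in a short sketch. It assumes a checkpoint history that is OK up to some point and broken from there on. The names `build_argv` and `bisect_checkpoints` and the `is_ok` callback are hypothetical, not Soup's internals.

```python
import shlex


def build_argv(template, ckpt):
    """Fill {ckpt} with a quoted path, then split into an argv list (no shell)."""
    return shlex.split(template.replace("{ckpt}", shlex.quote(ckpt)))


def bisect_checkpoints(checkpoints, is_ok):
    """Return the first broken checkpoint, or None if every probe passes."""
    # Probe both endpoints first to short-circuit all-broken and all-OK.
    first_ok = is_ok(checkpoints[0])
    last_ok = is_ok(checkpoints[-1])
    if not first_ok:
        return checkpoints[0]
    if last_ok:
        return None
    lo, hi = 0, len(checkpoints) - 1  # lo is known OK, hi is known broken
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_ok(checkpoints[mid]):
            lo = mid
        else:
            hi = mid
    return checkpoints[hi]


argv = build_argv(
    "soup eval custom --checkpoint {ckpt} --suite ./regression.yaml",
    "./ckpt 1; rm -rf x",
)
print(argv[4])  # the whole path stays one argument: ./ckpt 1; rm -rf x
```

In practice `is_ok` would run the argv list through subprocess.run without shell=True and inspect the exit code. Because the path is quoted before splitting, spaces or shell metacharacters in a checkpoint name cannot turn into extra arguments.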

Composes with v0.66 influence-blame: bisect finds the broken checkpoint, blame attributes it to specific training rows.

Numbers

+165 tests in v0.67.0 (10,836 → 11,021), 7 new test files. v0.67.1 lights up CMA-ES live eval-wiring, VeRA multi-tenant serve, and MoLE gating-kernel.

See also

  • [Adapters (v0.57)](/docs/adapters) — diff / merge / blame / branch / checkout, the foundation v0.67 builds on.
  • [Post-train x-rays](/docs/post-train-xrays) — v0.66 blame is what bisect hands off to.
  • [Pre-flight & tooling](/docs/preflight-tooling) — v0.64 soup env lock + soup plan are the inputs to soup lock.