# `soup advise` — decide before you train
Before you touch a GPU, ask: do you actually need to fine-tune? Or is your problem better solved with prompt engineering, RAG, DPO instead of SFT, or GRPO over rewards?
`soup advise` (v0.54.0) is a pre-flight decision engine. It classifies your task, profiles your dataset, and emits a ranked verdict across:
- **PROMPT_ENG**: your data is too small, or your goal is solvable with a better prompt
- **RAG**: high-variance factual recall is better handled with retrieval
- **SFT**: supervised fine-tuning is the right baseline
- **DPO**: you already have preference pairs (chosen/rejected)
- **GRPO**: you have reasoning traces and ≥ 500 rows; RL over verifiable rewards wins
## Usage
```shell
soup advise data.jsonl --goal "polite customer support chat"
```

Output:
```text
SFT (recommended)
why:
  - 4,213 rows (above _MIN_ROWS_FOR_TRAINING=50)
  - no preference pairs detected
  - no reasoning traces — GRPO ruled out
  - tone-shift goal — SFT is the right baseline
next: soup autopilot --data data.jsonl --task sft --goal "polite customer support chat"
```

## Heuristics
- **Task classification**: keyword + structural signals. A `tool_calls` field → tool_use; `<think>...</think>` blocks → reasoning; chat-shaped messages → input-extraction. The `--goal` string carries ~10× the row weight when classifying.
- **Dataset profile**: row count, avg input/output chars, type-token diversity, label variance, `has_chosen_rejected`, `has_reasoning_traces`. Capped at a 2,000-row sample for speed.
- **Verdict rubric**:
  - preference pairs detected → DPO
  - reasoning traces + ≥ 500 rows → GRPO
  - < 50 rows → PROMPT_ENG (training will overfit)
  - high-variance factual recall → RAG
  - default → SFT
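The rubric above reads as a short cascade of checks. A minimal sketch, assuming a hypothetical `Profile` record whose field names mirror the dataset-profile bullet (this is illustrative, not the actual `soup` source):

```python
# Hypothetical sketch of the verdict rubric. Field names follow the
# dataset-profile bullet above; the real soup internals may differ.
from dataclasses import dataclass

@dataclass
class Profile:
    rows: int
    has_chosen_rejected: bool
    has_reasoning_traces: bool
    high_variance_factual: bool

def verdict(p: Profile) -> str:
    if p.has_chosen_rejected:                     # preference pairs -> DPO
        return "DPO"
    if p.has_reasoning_traces and p.rows >= 500:  # RL over verifiable rewards
        return "GRPO"
    if p.rows < 50:                               # too small; training overfits
        return "PROMPT_ENG"
    if p.high_variance_factual:                   # retrieval beats memorisation
        return "RAG"
    return "SFT"                                  # default baseline
```

Note the order matters: preference pairs win over everything else, and the row-count floor is only consulted once DPO and GRPO are ruled out.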
## `--probe` — put real numbers on the ROI estimate
```shell
soup advise data.jsonl --goal "..." --probe
```

Runs a 100-step LoRA probe and a held-out zero-shot / few-shot / RAG baseline, with each method's delta bounded to [-1, 1]. The 600-second timeout is a hard wall.
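The per-method delta bound amounts to a clamp on the probe-vs-baseline improvement. A minimal sketch (hypothetical helper; the real metric and its scale live inside `soup`):

```python
def bounded_delta(probe_score: float, baseline_score: float) -> float:
    """Clamp the probe-vs-baseline improvement into [-1, 1].

    Hypothetical helper illustrating the per-method delta bound
    described in the docs; not the actual soup implementation.
    """
    return max(-1.0, min(1.0, probe_score - baseline_score))
```

Clamping keeps a single runaway method from dominating the ranked verdict.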
## `--record` — cross-project history
```shell
soup advise data.jsonl --goal "..." --record
```

Appends a frozen `HistoryEntry` to `~/.soup/advise_history.jsonl` (atomic, file-locked via `fcntl.flock` on POSIX / `msvcrt.locking` on Windows, 64 KB per-line cap, 16 MiB file cap, 10k-row cap). Future verdicts read this back via `summarise_history`, so the engine learns across projects.
Override the path with `SOUP_ADVISE_HISTORY_PATH` — containment-checked against `$HOME`, `$CWD`, and `tempfile.gettempdir()`. Default file perms: `0o600`.
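The locked-append pattern on the POSIX side can be sketched as follows. This is an assumption-laden illustration: the entry format and function name are invented, only the `fcntl.flock` locking, `0o600` perms, append-mode write, and 64 KB line cap come from the description above:

```python
# Sketch of a POSIX locked append in the style described above.
# The path, entry shape, and function name are illustrative, not soup's.
import fcntl
import json
import os

def append_history(path: str, entry: dict, max_line: int = 64 * 1024) -> None:
    line = json.dumps(entry) + "\n"
    if len(line.encode()) > max_line:          # enforce 64 KB per-line cap
        raise ValueError("history entry too large")
    # O_APPEND + one write() keeps each JSONL record contiguous;
    # 0o600 restricts the file to the owning user.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)         # exclusive lock while writing
        os.write(fd, line.encode())
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

On Windows the same shape holds with `msvcrt.locking` standing in for `fcntl.flock`.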
## Subcommands
```shell
soup advise run data.jsonl --goal "..."   # explicit (also the default)
soup advise explain                       # print the full rubric
soup advise compare a.jsonl b.jsonl       # which dataset is better for fine-tuning?
```

The top-level argv preprocessor `_rewrite_advise_argv` injects `run`, so `soup advise data.jsonl` works without typing the subcommand.
## See also
- [Autopilot](/docs/autopilot) — the literal next step after `soup advise`
- [Eval design](/docs/eval-design) — turn your data into evals
- [Trace-to-preference](/docs/trace-to-preference) — distill production traffic into DPO pairs