Coaching Data
Coaching data is rosetta's mechanism for teaching LLMs about languages they weren't trained on. By providing grammar rules, dictionaries, and style notes alongside each translation request, you transform a general-purpose LLM into a context-aware translator for any language — including languages with zero existing MT support.
How It Works
When you set a pair's method to llm-coached, rosetta loads a coaching file from .rosetta/coaching/<locale>.json and injects its contents into every LLM prompt as part of the system message. The LLM sees your linguistic rules alongside the translation request, producing output that follows your grammar and terminology instead of guessing.
┌──────────────────────────────────────────────────────┐
│ System Message (cached across batches) │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Base translation rules │ │
│ │ + Register instructions │ │
│ │ + Grammar rules (from coaching data) │ │
│ │ + Dictionary entries (from coaching data) │ │
│ │ + Style notes (from coaching data) │ │
│ └──────────────────────────────────────────────────┘ │
├──────────────────────────────────────────────────────┤
│ User Message (per batch) │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Keys to translate (JSON) │ │
│ └──────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
Because the coaching data is part of the system message, it benefits from prompt caching — providers like Anthropic and Google cache repeated system prefixes, so you only pay for coaching context once per session, not once per batch.
Coaching File Format
Create one JSON file per locale in .rosetta/coaching/:
{
"grammar_rules": [
"Plains Cree is polysynthetic — a single word can express what English needs a full sentence for",
"Animate/inanimate noun distinction affects verb conjugation",
"Use SRO (Standard Roman Orthography) unless script converter handles conversion",
"Verb stems are modified by prefixes and suffixes to indicate person, number, tense, and evidentiality"
],
"dictionary": {
"home": "kīwēwin",
"settings": "isi-nākatohkēwin",
"search": "nānātawāpahtam",
"welcome": "tānisi",
"submit": "ispīhci",
"cancel": "pōni"
},
"style_notes": "Use formal register. Preserve English technical terms in parentheses when no Cree equivalent exists. Avoid loanwords when a descriptive Cree expression exists."
}
Fields
| Field | Type | Required | Description |
|---|---|---|---|
grammar_rules | string[] | No | Array of grammar rules injected into the system prompt. Each rule should be a concise, actionable instruction the LLM can follow. |
dictionary | object | No | Key-value map of English term → target language term. Used for domain-specific vocabulary the LLM wouldn't know. |
style_notes | string | No | Free-form style instructions (register, tone, formality conventions). |
All fields are optional — you can start with just a dictionary and add grammar rules as you refine.
Fallback Behavior
If a pair is configured for llm-coached but no coaching file exists for that locale, rosetta falls back to the standard llm method with a console warning:
[INFO] No coaching data for "crk" at .rosetta/coaching/crk.json
Falling back to standard LLM method. Create coaching data for better results.
This means you can safely set "defaultMethod": "llm-coached" globally — languages with coaching data will use it, and the rest will get standard LLM translation without errors.
When to Use Coaching
| Scenario | Recommended Method |
|---|---|
| Tier 1 languages (French, Spanish, German) | llm or google-translate — LLMs already know these well |
| Tier 2 languages (Korean, Turkish, Thai) | llm with a register — LLMs handle these adequately with style guidance |
| Tier 3 languages (Plains Cree, Yoruba, Quechua) | llm-coached — LLMs need grammar rules and dictionaries |
| Conlangs (Klingon, Sindarin, Kryptonian) | llm-coached — LLMs have some training data but need corrections |
Building Good Coaching Data
Grammar Rules
Write rules as instructions, not descriptions. The LLM follows instructions better than it interprets linguistic theory.
// ❌ Descriptive (the LLM learns nothing actionable)
"Plains Cree has animate and inanimate noun classes"
// ✅ Instructive (the LLM knows what to do)
"When translating nouns, check whether the Cree equivalent is animate (NA) or inanimate (NI) — this affects which verb conjugation to use"
Dictionaries
Focus on domain-specific terms that the LLM would get wrong or invent. Don't bother with common words the LLM already handles — focus on the terms specific to your application's UI.
Style Notes
Be specific about register, formality, and conventions:
"style_notes": "Use formal register (vous-form in French). Preserve brand names untranslated. UI labels should be imperative mood ('Save', not 'Saves'). Maximum 40 characters for button text."
Testing Coached Translations
Use the MT Eval Harness to benchmark your coached translations against a reference corpus:
# Install the harness
pip install mt-eval-harness
# Run coached translations against your test corpus
mt-eval run --corpus data/crk-corpus.json --model google/gemini-2.5-pro
# Score the results
mt-eval test eval/logs/run_*.json
This gives you chrF++, BLEU, and exact match scores. Create multiple coaching file versions and compare — objective metrics beat subjective review.
See Also
- Translation Methods — the llm-coached method
- Support a Low-Resource Language — coaching in practice
- Plugin Specification — packaging coaching data in a plugin
- Quality Gate — how coached translations are validated
- Configuration — per-pair coaching config