Quality Gate

Every translation passes through a deterministic validation gate before it's written to disk. The quality gate catches common machine translation failure modes — no silent fallbacks, no garbage written to your locale files.

Validation Checks

Check	What It Catches	Gate Label
Empty/blank	Model returned empty string or whitespace	`[GATE] empty`
Source echo	Model returned the original English input	`[GATE] source-echo`
Hallucination loop	Repeated trigram patterns (e.g., `"Qo' Qo' Qo'"`)	`[GATE] hallucination`
Length inflation	Output is significantly longer than source	`[GATE] length`
Script compliance	Wrong script for the target locale	`[GATE] script`

Empty/Blank

Rejects translations that are empty strings, whitespace-only, or null. This catches models that return nothing for difficult keys.

Source Echo

Detects when the model returns the English source text instead of translating it. Common with short strings and under-specified prompts.

Hallucination Loop

Analyzes trigram (3-character) patterns in the output. If any trigram repeats more than a threshold number of times relative to the output length, the translation is rejected. This catches degenerate outputs like "Qo' Qo' Qo' Qo' Qo'".

Length Inflation

Rejects translations where the output length exceeds maxLengthRatio × source length (default: 4×). This catches model hallucinations that produce walls of text for a short input.

Configurable via maxLengthRatio in your config.

Script Compliance

For locales with a configured script field (e.g., "script": "cans" for Plains Cree Syllabics), validates that the output contains non-ASCII characters appropriate for the target script. Latin-only output for an Arabic, CJK, or Syllabics locale is rejected.

What Happens on Failure

The failing translation is logged to stderr with a [GATE] prefix, the key name, the reason, and a preview of the value
The key is not written to the locale file
The retry cascade kicks in (see below)

[GATE] hero.title: source-echo — "Welcome to our platform"
[GATE] nav.about: hallucination — "À À À À À À À À"

Retry Cascade

When a batch fails (JSON parse error or quality gate rejections), rosetta retries with progressively smaller batches:

Full batch (30 keys) → parse error
  └→ Half batch (15 keys) → 2 failures
      └→ Individual keys (1 each) → isolates the 2 problem keys

The retry budget is capped by maxRetries (default: 3, configurable per-language). This prevents runaway token spend on keys that consistently fail.

After exhausting retries, the problem keys are logged and skipped. They'll be retried on the next sync run.

Prompt Caching

The system message (register, grammar rules, style notes) is split from the user message (the keys to translate). This split is intentional:

The system message is identical across batches for a given locale
Providers like Anthropic and Google cache repeated system messages
Result: the first batch pays full token cost, subsequent batches pay only for the user message

This can significantly reduce token costs for projects with many batches.

Validation Checks​

Empty/Blank​

Source Echo​

Hallucination Loop​

Length Inflation​

Script Compliance​

What Happens on Failure​

Retry Cascade​

Prompt Caching​