name: refusal-fallback-handling
description: Use when building a product feature on top of an LLM that must stay usable when the model refuses, returns invalid JSON, or times out. Covers detection, retries, fallbacks, and user-facing messaging.
Refusal and Fallback Handling
When to trigger
Any user-facing feature where an LLM call can fail: the model declines the request, returns text where you expected structured output, hits a rate limit, or exceeds a latency budget.
Failure modes to handle
- Refusal: the model returns a polite decline instead of the task output.
- Malformed output: valid text, wrong shape (not the JSON your parser expects).
- Transport failure: rate limit, timeout, 5xx.
- Empty or truncated output: the response is empty, or stops mid-answer because it hit the max token limit.
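The four modes above can be told apart with a small classifier. This is a minimal sketch, not a provider API: RawResult, its field names, and the refusal patterns are all hypothetical, and the text matching is a heuristic. If your provider exposes an explicit refusal signal, prefer that over pattern matching.

```typescript
// Sketch only: every name here is hypothetical, and the refusal check is a heuristic.
type FailureMode = "refusal" | "malformed" | "transport" | "truncated";

interface RawResult {
  text: string;          // raw model output, "" if none arrived
  finishReason?: string; // assumed field; "length" stands for "hit the token limit"
  httpStatus?: number;   // set when the HTTP call itself failed
  timedOut?: boolean;    // set by the caller's own timeout logic
}

// Assumed phrasings of a polite decline; tune against your own logs.
const REFUSAL_PATTERNS: RegExp[] = [
  /^i('m| am) sorry/i,
  /^i can('|no)t (help|assist)/i,
  /^i('m| am) (unable|not able) to/i,
];

// Returns the failure mode, or null when the output looks like usable JSON.
function classify(result: RawResult): FailureMode | null {
  const status = result.httpStatus ?? 0;
  if (result.timedOut || status === 429 || status >= 500) return "transport";

  const text = result.text.trim();
  if (text === "" || result.finishReason === "length") return "truncated";
  if (REFUSAL_PATTERNS.some((p) => p.test(text))) return "refusal";

  try {
    JSON.parse(text);
  } catch {
    return "malformed"; // valid text, but not the JSON the parser expects
  }
  return null;
}
```

Checking transport first matters: a timed-out call usually has empty text, and it should be retried rather than treated as truncation.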
Layered strategy
- Constrain the output. When you need structure, use the provider's structured-output or tool-calling mode rather than parsing free text. It removes most malformed-output cases.
- Validate before you trust. Parse into a schema (for example with zod) and treat a parse failure as a retryable error, not a crash.
- Retry with backoff on transport errors only. A parse failure can be retried a bounded number of times without backoff. Do not retry a refusal with the same prompt; it will most likely refuse again. Reframe the request or fall back.
- Fall back deliberately. If the model cannot produce the result, return a smaller deterministic answer (a cached value, a template, or a "we could not generate this, here is the raw input" message) rather than a stack trace.
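The layers above can be combined into one call path. The sketch below is illustrative: ModelReply, Summary, summarize, and the injected callModel are hypothetical names, and the hand-written parseSummary guard stands in for a schema library such as zod so that the block has no dependencies. It assumes the caller has already mapped the provider response into a tagged reply.

```typescript
// Sketch only: all names are hypothetical; callModel wraps whatever provider SDK you use.
type ModelReply =
  | { kind: "ok"; text: string }
  | { kind: "refusal" }
  | { kind: "transport"; status?: number };

interface Summary { title: string; bullets: string[] }

type Outcome =
  | { source: "model"; value: Summary }
  | { source: "fallback"; value: Summary; reason: string };

// Validate before trusting: returns null instead of throwing on a bad shape.
function parseSummary(text: string): Summary | null {
  try {
    const v: unknown = JSON.parse(text);
    if (typeof v !== "object" || v === null) return null;
    const title = (v as Record<string, unknown>).title;
    const bullets = (v as Record<string, unknown>).bullets;
    if (typeof title !== "string") return null;
    if (!Array.isArray(bullets) || !bullets.every((b) => typeof b === "string")) return null;
    return { title, bullets: bullets as string[] };
  } catch {
    return null;
  }
}

async function summarize(
  input: string,
  callModel: (prompt: string) => Promise<ModelReply>,
  opts = { maxAttempts: 3, baseDelayMs: 200 },
  sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<Outcome> {
  // Deterministic fallback: a template plus a slice of the raw input.
  const fallback = (reason: string): Outcome => ({
    source: "fallback",
    reason,
    value: { title: "Summary unavailable", bullets: [input.slice(0, 200)] },
  });

  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const reply = await callModel(input);
    if (reply.kind === "refusal") return fallback("refusal"); // never retry the same prompt
    if (reply.kind === "transport") {
      // Exponential backoff, skipped after the final attempt.
      if (attempt < opts.maxAttempts) await sleep(opts.baseDelayMs * 2 ** (attempt - 1));
      continue;
    }
    const parsed = parseSummary(reply.text);
    if (parsed) return { source: "model", value: parsed };
    // Malformed output: fall through and retry immediately.
  }
  return fallback("retries exhausted");
}
```

Returning a tagged Outcome rather than throwing lets the caller render the fallback and log the reason without a try/catch at every call site.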
User-facing rule
The user should never see a raw model refusal or a JSON parse error. Map every failure mode to a calm, specific message and, where possible, a next action.
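One way to enforce this rule is a single lookup table keyed by failure mode. The sketch below assumes the four modes listed earlier; the message wording, the UserNotice shape, and the action ids are placeholders, not a prescribed copy deck.

```typescript
// Sketch only: names, wording, and action ids are hypothetical.
type FailureMode = "refusal" | "malformed" | "transport" | "truncated";

interface UserNotice {
  message: string;                        // calm, specific, no internals
  action?: { label: string; id: string }; // optional next step for the UI
}

// Record<FailureMode, ...> makes the compiler reject a missing mode.
const NOTICES: Record<FailureMode, UserNotice> = {
  refusal: {
    message: "We can't help with this request as written. Try rephrasing it.",
    action: { label: "Edit request", id: "edit" },
  },
  malformed: {
    message: "We couldn't generate a result this time.",
    action: { label: "Try again", id: "retry" },
  },
  transport: {
    message: "The service is busy right now. Please try again in a moment.",
    action: { label: "Try again", id: "retry" },
  },
  truncated: {
    message: "The result was too long to finish. Try a shorter input.",
    action: { label: "Shorten input", id: "edit" },
  },
};

function noticeFor(mode: FailureMode): UserNotice {
  return NOTICES[mode];
}
```

Because the table is typed as a Record over the union, adding a fifth failure mode without a message becomes a compile error rather than a raw error string in production.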
Pitfalls
- Retrying a content refusal wastes tokens and latency. Detect refusals and route them to the fallback path immediately.
- A silent catch that returns empty output hides real breakage. Log the failure mode with enough context to see which prompts fail.
- Timeouts longer than your UI's patience produce a spinner of death. Set the client timeout below the perceived-wait threshold and show the fallback when it expires.
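The last two pitfalls can be handled by one wrapper that enforces a deadline and logs every failure instead of swallowing it. This is a sketch under assumptions: withTimeout and its parameters are hypothetical names, the default logger is only a placeholder for your real logging, and the right timeout value depends on your UI.

```typescript
// Sketch only: withTimeout is a hypothetical helper, not a library function.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>, // e.g. pass the signal through to fetch
  timeoutMs: number,
  fallback: T,
  log: (event: Record<string, unknown>) => void = (e) => console.warn(JSON.stringify(e)),
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  let timedOut = false;

  // Resolves with the fallback once the deadline passes and cancels the request.
  const deadline = new Promise<T>((resolve) => {
    timer = setTimeout(() => {
      timedOut = true;
      controller.abort();
      resolve(fallback);
    }, timeoutMs);
  });

  try {
    const result = await Promise.race([run(controller.signal), deadline]);
    if (timedOut) log({ failure: "timeout", timeoutMs });
    return result;
  } catch (err) {
    // Not a silent catch: record what failed before showing the fallback.
    log({ failure: "error", message: err instanceof Error ? err.message : String(err) });
    return fallback;
  } finally {
    clearTimeout(timer);
  }
}
```

In practice the log event would also carry a prompt identifier or template name, so that failures can be grouped by prompt.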