LLM Engineering
Route every model through the Vercel AI Gateway with the AI SDK: one key, plain string model ids, a client-safe whitelist, and the gotchas that only bite in production.
Use this when adding chat or completion features that should reach many providers (OpenAI, Anthropic, Moonshot, Google) through one billing surface and one key, instead of wiring a client per provider.
With AI_GATEWAY_API_KEY set in the environment, the AI SDK resolves a plain string model id through the gateway automatically. You do not construct a provider client or set a baseURL.
import { streamText } from "ai";

// Inside the route handler; `messages` comes from the parsed request body.
const result = streamText({
  model: "moonshotai/kimi-k2.5", // "provider/model", resolved by the gateway
  messages,
});
return result.toUIMessageStreamResponse();
The id is the gateway's provider/model form. Verify it against the gateway /v1/models list before shipping; a typo is not caught at build time and only surfaces as an error when the request is made.
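One way to catch a mistyped id before shipping is a small check that compares the offered ids with the gateway's model list. The sketch below is illustrative only: it assumes the list is served at https://ai-gateway.vercel.sh/v1/models and has an OpenAI-style shape ({ data: [{ id }] }), and the names findUnknownModelIds, checkOfferedModels, and fetchJson are hypothetical. Confirm the URL and response shape against the gateway documentation.

```typescript
// Assumed response shape: an OpenAI-style list, { data: [{ id: "provider/model" }, ...] }.
interface ModelList {
  data: { id: string }[];
}

// Assumed endpoint URL; verify it against the gateway documentation.
const MODELS_URL = "https://ai-gateway.vercel.sh/v1/models";

// Returns the offered ids that do not appear in the gateway's list.
function findUnknownModelIds(
  offered: readonly string[],
  list: ModelList,
): string[] {
  const available = new Set(list.data.map((m) => m.id));
  return offered.filter((id) => !available.has(id));
}

// Hypothetical pre-ship check. The JSON fetcher is injected so the check
// can run in a test or CI script without a hard dependency on global fetch.
async function checkOfferedModels(
  offered: readonly string[],
  fetchJson: (url: string) => Promise<unknown>,
): Promise<void> {
  const list = (await fetchJson(MODELS_URL)) as ModelList;
  const unknown = findUnknownModelIds(offered, list);
  if (unknown.length > 0) {
    throw new Error(`Model ids not listed by the gateway: ${unknown.join(", ")}`);
  }
}
```

In a CI step this could be called with the ids from the whitelist module and a fetcher such as (url) => fetch(url).then((r) => r.json()), so a typo fails the pipeline rather than the first production request.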
Put the offered models in a module with no secrets so both the picker UI and the route can import it. The route validates the requested id against that whitelist before it ever reaches the gateway, so a caller cannot force an arbitrary model.
export const CHAT_MODELS = [
  { id: "moonshotai/kimi-k2.5", label: "Kimi K2.5", credits: 1 },
  { id: "openai/gpt-5.4-mini", label: "GPT-5.4 Mini", credits: 1 },
] as const;

export function isChatModel(id: string) {
  return CHAT_MODELS.some((m) => m.id === id);
}
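On the route side, the whitelist check might look like the sketch below. It repeats the list so the snippet stands alone, and it assumes the client sends the chosen id in a `model` field of the JSON body; that field name and the helper pickModel are hypothetical, not part of the AI SDK.

```typescript
const CHAT_MODELS = [
  { id: "moonshotai/kimi-k2.5", label: "Kimi K2.5", credits: 1 },
  { id: "openai/gpt-5.4-mini", label: "GPT-5.4 Mini", credits: 1 },
] as const;

type ChatModel = (typeof CHAT_MODELS)[number];

// Hypothetical helper: reads the requested id from an untrusted request body
// and returns the matching whitelist entry, or null if it is not offered.
// The `model` field name is an assumption about the client payload.
function pickModel(body: unknown): ChatModel | null {
  if (typeof body !== "object" || body === null) return null;
  const requested = (body as { model?: unknown }).model;
  return CHAT_MODELS.find((m) => m.id === requested) ?? null;
}
```

Returning the whole entry rather than a boolean keeps the credit cost next to the validated id, so the route never has to look the model up a second time. A null result would typically map to a 400 response before any gateway call is made.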
Order the checks cheap-first so a rejected request never pays for inference: auth, then config present (503 if the gateway key is missing), then rate limit, then credit balance. Deduct credits in onFinish after a successful generation, never before, so a failed or rate-limited call charges nothing.
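The cheap-first ordering can be sketched as a single preflight function that returns the first rejection it finds. Everything here is illustrative: the PreflightDeps shape, the helper names, and the 401, 429, and 402 status codes are assumptions (only the 503 for a missing key comes from the text above), and the checks are kept synchronous for brevity where real ones would likely be async.

```typescript
// Hypothetical dependency bundle; every field name is an assumption.
interface PreflightDeps {
  userId: string | null; // null when the caller is not signed in
  gatewayKey: string | undefined; // e.g. process.env.AI_GATEWAY_API_KEY
  isRateLimited: (userId: string) => boolean;
  getCredits: (userId: string) => number;
  cost: number; // credits the requested model would consume
}

interface Rejection {
  status: number;
  error: string;
}

// Runs the checks cheapest-first and stops at the first failure, so the
// costlier lookups never run for a request that was already rejected.
function preflight(deps: PreflightDeps): Rejection | null {
  if (deps.userId === null) {
    return { status: 401, error: "Not signed in" };
  }
  if (!deps.gatewayKey) {
    return { status: 503, error: "AI gateway is not configured" };
  }
  if (deps.isRateLimited(deps.userId)) {
    return { status: 429, error: "Rate limit exceeded" };
  }
  if (deps.getCredits(deps.userId) < deps.cost) {
    return { status: 402, error: "Insufficient credits" };
  }
  return null; // safe to call the model
}

// The deduction itself would live in streamText's onFinish callback, e.g.
//   onFinish: () => deductCredits(userId, cost) // deductCredits is hypothetical
// so that nothing is charged unless generation completed.
```

The route would call preflight first and return the rejection as a JSON response when it is non-null; only a null result proceeds to streamText.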
A missing AI_GATEWAY_API_KEY in the deploy env builds fine and then 503s every request. Add the key to the host before shipping the feature, not after the first failed call.
Deducting credits before onFinish bills users for failed generations. Deduct on success only.
Added 2026-07-01.