Lesson 3 · 5 min
Retries & Backoff
Idempotency and exponential backoff for API errors.
Retry on 429 and 5xx with exponential backoff and jitter. Make calls idempotent by adding a stable request_id to tool calls so a retry doesn't double-execute.
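As a rough illustration of what "exponential backoff and jitter" means numerically, the sketch below computes the wait before each retry. The function name `backoffDelayMs`, the 200 ms base, and the 10 s cap are assumptions chosen for the example, not values from this lesson. It uses the "full jitter" variant, which is one common choice among several.

```typescript
// Hypothetical helper: the name and defaults are illustrative only.
function backoffDelayMs(
  attempt: number, // 0 for the first retry, 1 for the second, ...
  baseMs = 200,
  capMs = 10000,
  random: () => number = Math.random,
): number {
  // Exponential growth, clamped so late attempts do not wait unboundedly.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Full jitter: pick uniformly in [0, ceiling) so clients do not retry in lockstep.
  return Math.floor(random() * ceiling);
}

// With the defaults, the upper bounds for the first five retries are
// 200, 400, 800, 1600, and 3200 ms; the actual wait is a random value below each.
```

Passing `random` as a parameter is only there to make the function easy to test with a fixed value.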
Production scenario
Real-world example: Quote engine for B2B pricing
A pricing engine returns quotes via an LLM tool that calls an internal pricing service. Network blips happen. Without idempotency, a retry could write a second audit log entry for the same quote and confuse the analyst dashboard.
async function getQuote(args: QuoteArgs) {
  // Generate the ID once per logical request, outside the retried closure,
  // so every attempt sends the same request_id and separate quotes never share one.
  const requestId = uuid();
  return retryWithBackoff(
    () => pricingApi.post("/quote", { ...args, request_id: requestId }),
    { codes: [429, 500, 502, 503, 504], maxRetries: 4, jitter: true },
  );
}
The pricing service uses request_id to detect duplicates and short-circuits the second call, so retries are safe.
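The lesson does not show the service side, so the following is only a minimal sketch of how duplicate detection by `request_id` might look. The types `QuoteRequest` and `QuoteResponse`, the function `createQuoteHandler`, and the in-memory `Map` are all assumptions made for illustration. A real service would presumably need shared storage with expiry so that duplicates are caught across instances.

```typescript
// Hypothetical shapes: the real pricing service's types are not shown in the lesson.
type QuoteRequest = { request_id: string; sku: string; quantity: number };
type QuoteResponse = { request_id: string; total: number };

function createQuoteHandler(price: (req: QuoteRequest) => number) {
  // In-memory store for illustration only.
  const seen = new Map<string, QuoteResponse>();
  const auditLog: string[] = [];

  function handle(req: QuoteRequest): QuoteResponse {
    const cached = seen.get(req.request_id);
    if (cached !== undefined) {
      // Duplicate request_id: return the stored response and skip side effects.
      return cached;
    }
    const response: QuoteResponse = {
      request_id: req.request_id,
      total: price(req),
    };
    auditLog.push(`quote ${req.request_id}`); // side effect runs once per ID
    seen.set(req.request_id, response);
    return response;
  }

  return { handle, auditLog };
}
```

Calling `handle` twice with the same `request_id` returns the same response and leaves a single audit entry, which is the property the retrying client relies on.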
Why this matters: the retry policy and idempotency contract have to be designed together. Bare retry + non-idempotent endpoint = double-billed customers and stressed on-call.
Knowledge points in this lesson
- Retry on 429 and 5xx with backoff
- Use exponential backoff with jitter
- Idempotent calls via stable request IDs
- Don't retry 4xx other than 429
- Cap retries to avoid runaway cost
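The points above can be combined into one small retry loop. This is a sketch under stated assumptions, not the `retryWithBackoff` used in the example earlier: the names `retrySketch`, `isRetryable`, and `statusOf` are made up here, and it assumes the HTTP client rejects with an error object carrying a numeric `status` field.

```typescript
// Assumption: failed calls throw an object with a numeric `status` property.
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const s = (err as { status: unknown }).status;
    return typeof s === "number" ? s : undefined;
  }
  return undefined;
}

// 429 and 5xx are treated as transient; any other 4xx will not succeed on retry.
function isRetryable(status: number | undefined): boolean {
  return status === 429 || (status !== undefined && status >= 500 && status <= 599);
}

async function retrySketch<T>(
  call: () => Promise<T>,
  maxRetries = 4,
  baseMs = 200,
  // Injectable so tests can skip real waiting.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-retryable errors, or once the retry cap is reached.
      if (!isRetryable(statusOf(err)) || attempt >= maxRetries) throw err;
      // Exponential backoff with full jitter.
      await sleep(Math.random() * baseMs * 2 ** attempt);
    }
  }
}
```

With `maxRetries = 4` the call runs at most five times in total (one initial attempt plus four retries), which bounds both latency and cost.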
Quick check
Context & Reliability · Select one
Which HTTP status codes should trigger a retry with exponential backoff?
