Retry-After HTTP Header — Reference

What it does

Retry-After tells the client how long to wait before making another request. It's an instruction from the server: "don't come back until this time has passed."

Most commonly seen on:

429 Too Many Requests: rate limiting; the client hit the request limit
503 Service Unavailable: the server is temporarily down or overloaded
3xx redirects: the minimum time the client is asked to wait before issuing the redirected request

Syntax

Two formats are accepted:

Relative (seconds from now):

Retry-After: 120

Wait 120 seconds.

Absolute (specific date/time):

Retry-After: Thu, 31 Dec 2026 23:59:59 GMT

Don't retry until this timestamp.

Relative seconds are simpler and more common. Absolute dates are useful when the retry time is tied to a specific scheduled event (maintenance window ending, rate limit reset at midnight, etc.).
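A client has to accept both forms. The following is a minimal sketch using only the Python standard library; the function name parse_retry_after and its now parameter are illustrative choices, not part of any spec or library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the delay in seconds for a Retry-After value, or None if unparseable.

    parse_retry_after is a hypothetical helper name used for illustration.
    """
    now = now or datetime.now(timezone.utc)
    value = value.strip()
    if value.isascii() and value.isdigit():
        # Relative form: a non-negative integer number of seconds.
        return float(value)
    try:
        # Absolute form: an HTTP-date such as "Thu, 31 Dec 2026 23:59:59 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    # A date already in the past means no further waiting is needed.
    return max(0.0, (when - now).total_seconds())
```

Returning None for an unparseable value lets the caller fall back to its own default delay instead of failing the request.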

Rate limiting (429)

The most common use case. When a client exceeds a rate limit, the server responds with 429 Too Many Requests and Retry-After indicating when the rate limit resets:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: application/json

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Try again in 60 seconds.",
  "retry_after": 60
}

This tells the client exactly when to retry — no need for exponential backoff guessing or polling.
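On the server side, the header and the JSON body should carry the same number. A framework-agnostic sketch, assuming the server already knows how many seconds remain until the limit resets; too_many_requests is a hypothetical helper, and the returned tuple would need adapting to whatever web framework is in use.

```python
import json
import math


def too_many_requests(seconds_until_reset):
    """Build (status, headers, body) for a 429 with a consistent retry delay.

    too_many_requests is a hypothetical helper, not a framework API.
    """
    # Round up so a client that waits exactly this long never arrives early,
    # and never advertise a delay below one second.
    delay = max(1, math.ceil(seconds_until_reset))
    body = {
        "error": "rate_limit_exceeded",
        "message": f"Too many requests. Try again in {delay} seconds.",
        "retry_after": delay,
    }
    headers = {
        "Retry-After": str(delay),
        "Content-Type": "application/json",
    }
    return 429, headers, json.dumps(body)
```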

Service unavailability (503)

During planned maintenance or unexpected downtime, 503 Service Unavailable with Retry-After gives clients a concrete retry time:

HTTP/1.1 503 Service Unavailable
Retry-After: 3600
Content-Type: text/plain

Service maintenance in progress. Expected back online at 14:00 UTC.

Clients and monitoring systems that respect Retry-After will back off automatically rather than hammering the service.
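For planned maintenance the server usually knows when the window ends, so either form can be derived from that one timestamp. A sketch under that assumption; maintenance_headers and its parameters are illustrative names.

```python
import math
from datetime import datetime, timezone
from email.utils import format_datetime


def maintenance_headers(window_end, now=None, absolute=False):
    """Return a Retry-After header for a maintenance window ending at window_end.

    window_end must be a timezone-aware datetime. Hypothetical helper.
    """
    now = now or datetime.now(timezone.utc)
    if absolute:
        # usegmt=True produces the "GMT" suffix used by HTTP dates;
        # it requires a datetime whose tzinfo is UTC.
        value = format_datetime(window_end.astimezone(timezone.utc), usegmt=True)
    else:
        remaining = (window_end - now).total_seconds()
        value = str(max(0, math.ceil(remaining)))
    return {"Retry-After": value}
```

With a window ending one hour from now, the relative form yields Retry-After: 3600, matching the response shown above.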

Implementing rate limiting

A well-implemented rate limiting response typically combines several headers:

HTTP/1.1 429 Too Many Requests
Retry-After: 47
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1719043847
Content-Type: application/json

Retry-After — how long to wait (seconds)
X-RateLimit-Limit — total requests allowed per window
X-RateLimit-Remaining — requests remaining in current window
X-RateLimit-Reset — Unix timestamp when the window resets

The X-RateLimit-* headers aren't standardised (no RFC), but they are widely adopted conventions, and their exact semantics vary between APIs. Retry-After is the spec-defined one (RFC 9110).
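The four headers can all be derived from the same window state, which keeps Retry-After and X-RateLimit-Reset consistent with each other. A sketch assuming a fixed-window limiter that tracks a request count and a reset time in Unix seconds; rate_limit_headers and its parameters are hypothetical names.

```python
import math
import time


def rate_limit_headers(limit, used, reset_epoch, now=None):
    """Derive rate limit headers from fixed-window state (hypothetical helper)."""
    now = time.time() if now is None else now
    remaining = max(0, limit - used)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(int(reset_epoch)),
    }
    if remaining == 0:
        # Retry-After is the same reset moment expressed as seconds from now.
        headers["Retry-After"] = str(max(1, math.ceil(reset_epoch - now)))
    return headers
```

Called with limit=100, used=100, reset_epoch=1719043847 and now=1719043800, this produces the header values in the example above, including Retry-After: 47.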

Client behaviour

Well-behaved clients should:

Check for Retry-After on 429 and 503 responses
Parse the value (seconds or date)
Wait the specified duration before retrying
Implement jitter — add a small random delay to avoid thundering herd (all clients retrying simultaneously exactly when the rate limit resets)

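import random
import time
from email.utils import parsedate_to_datetime

import requests

def request_with_retry(url, max_attempts=5):
    # Loop instead of recursing so retries are bounded
    for attempt in range(max_attempts):
        response = requests.get(url)
        if response.status_code not in (429, 503) or attempt == max_attempts - 1:
            return response
        value = response.headers.get('Retry-After', '60')
        if value.isdigit():
            delay = int(value)  # relative form: seconds
        else:
            # absolute form: HTTP-date, so wait until that moment
            delay = max(0, parsedate_to_datetime(value).timestamp() - time.time())
        # Add jitter: wait between delay and delay + 5 seconds
        time.sleep(delay + random.uniform(0, 5))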

Common mistakes and gotchas

Not setting Retry-After on 429 responses. Without it, clients have to guess when to retry — they'll either hammer you immediately or wait longer than necessary. Always include it.

Setting Retry-After: 0. Zero means "retry immediately" — which defeats the purpose of a rate limit. Set a meaningful delay. Even Retry-After: 1 is rarely useful.

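Using absolute dates without clock sync. If the client's clock differs from the server's, the client computes the wrong wait from an absolute date: it retries too early when its clock runs ahead and waits too long when its clock runs behind. Relative seconds sidestep clock skew entirely.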
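When a server does send an absolute date, one way for a client to reduce the effect of skew is to measure the wait against the server's own Date response header instead of the local clock. This is a sketch of that idea, not a required behaviour; it assumes both headers are present and date-formatted, ignores network latency, and delay_from_headers is a hypothetical name.

```python
from email.utils import parsedate_to_datetime


def delay_from_headers(headers):
    """Seconds to wait, measured on the server's clock rather than the client's.

    Assumes headers contains a date-form Retry-After and a Date header.
    """
    retry_at = parsedate_to_datetime(headers["Retry-After"])
    server_now = parsedate_to_datetime(headers["Date"])
    # Both timestamps come from the server, so local clock error cancels out.
    return max(0.0, (retry_at - server_now).total_seconds())
```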

Ignoring Retry-After in monitoring/alerting systems. Automated monitoring that doesn't respect Retry-After will hammer a struggling service with retry attempts, making the outage worse. Configure your monitoring tools to back off on 429 and 503 responses.

Real-world examples

Rate limit with seconds:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0

{"error": "rate_limit_exceeded"}

Maintenance window:

HTTP/1.1 503 Service Unavailable
Retry-After: Thu, 25 Jun 2026 02:00:00 GMT
Content-Type: text/plain

Scheduled maintenance until 02:00 UTC.

Temporary redirect with timing:

HTTP/1.1 307 Temporary Redirect
Location: https://maintenance.example.com
Retry-After: 1800

FAQ

Is Retry-After mandatory on 429 responses?

Not mandatory by spec, but strongly recommended in practice. A 429 without Retry-After forces clients to implement their own backoff strategy (which they'll do differently — some too aggressively, some too conservatively). Providing Retry-After makes the system more predictable and kinder to both client developers and your own infrastructure.

Can Retry-After be used on successful responses?

Technically yes — the spec doesn't restrict it to error responses. In practice, it's almost exclusively used on 429, 503, and occasionally 3xx. Using it on 200 would be unusual and confusing.

Does Retry-After interact with caching?

Not directly. Caches don't cache 429 or 503 responses by default. Retry-After on a 503 is processed by the client, not the cache. There's no caching behaviour tied to Retry-After.

Fun fact

Retry-After was one of the original HTTP/1.1 headers (RFC 2068, 1997), but its pairing with 429 Too Many Requests came much later: 429 wasn't defined until RFC 6585 in 2012, fifteen years after Retry-After was specified. For over a decade, Retry-After existed without the most natural status code to use it with. APIs dealing with rate limiting in that era had to improvise with 503 or non-standard status codes, which may be part of why the X-RateLimit-* family of custom headers developed independently before 429 was formalised.
