The Shopify webhook 4-hour limit explained
Shopify's documented retry policy: 8 attempts over 4 hours, then permanent drop. What it means for you, and what production systems do about it.
This retry limit is a hard ceiling, and it's the cause of most "we lost an order" incidents. Here's exactly what it means for production apps.
The current policy
Per the September 10, 2024 update to Shopify's webhook retry mechanism (replacing an earlier longer window):
"Webhooks will now be retried a total of 8 times over 4 hours using an exponential backoff schedule." — Shopify dev changelog
The current troubleshooting docs add detail on what counts as a failure:
"If your app doesn't respond within five seconds, then the delivery fails." A response with any HTTP status code outside the 2xx range is also counted as a failure. — Shopify dev docs
Two things matter:
- 8 attempts within the window
- 4 elapsed hours from the first failure to the last attempt
If your endpoint is down for more than 4 hours, every event whose retry window expires during the outage is gone: no further retries, no admin tool to replay.
Why this matters more than it sounds
Two subtle realities make the 4-hour limit dangerous:
- The email warning is easy to miss. Shopify sends a "your webhook is failing" email to your Partner emergency developer email when a subscription is at risk of removal. But that address often points at a Partner-account admin inbox no engineer monitors, so the warning lands and is ignored. Until you GET the webhook subscriptions endpoint, you have no in-product signal that it happened.
- The retry pattern is biased toward the start. Most attempts cluster in the first 30 minutes — exponential backoff means by hour 2 you've already burned 5–6 of 8 attempts. A long-running deploy or DB migration that lasts 2+ hours has a low chance of catching the remaining retries.
The 2024 change from 48 hours → 4 hours made this worse: any outage longer than half a workday now drops events with no recovery path.
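To see why the retry pattern is front-loaded, here is a rough sketch of an exponential schedule. Shopify documents only "8 times over 4 hours"; the doubling delays and the `assumed_retry_offsets` helper below are assumptions made for illustration, not the real intervals.

```python
from datetime import timedelta

WINDOW = timedelta(hours=4)  # documented retry window
RETRIES = 8                  # documented retry count


def assumed_retry_offsets(retries: int = RETRIES,
                          window: timedelta = WINDOW) -> list[timedelta]:
    """Offset of each retry from the first failure.

    ASSUMPTION: every delay is double the previous one and the last retry
    lands at the end of the window. The real schedule may differ.
    """
    # Delays are base, 2*base, 4*base, ...; they sum to base * (2**retries - 1).
    base = window / (2**retries - 1)
    offsets = []
    elapsed = timedelta(0)
    for k in range(retries):
        elapsed += base * (2**k)
        offsets.append(elapsed)
    return offsets


def retries_used_by(cutoff: timedelta) -> int:
    """Number of retries that have already fired once `cutoff` has elapsed."""
    return sum(1 for offset in assumed_retry_offsets() if offset <= cutoff)


if __name__ == "__main__":
    for cutoff in (timedelta(minutes=30), timedelta(hours=1), timedelta(hours=2)):
        print(cutoff, "->", retries_used_by(cutoff), "of", RETRIES, "retries used")
```

Under this assumed doubling schedule, 5 of the 8 retries fire within the first 30 minutes and 7 have fired by the 2-hour mark. The exact counts depend on the real intervals, but the shape is the point: an outage that starts early and runs long leaves very few attempts to catch.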
What counts as a "failure"
Shopify counts these as failed deliveries:
- Connection refused / timeout (no TCP handshake completes)
- TLS handshake failure (cert expired, etc.)
- HTTP response with any status outside the 2xx range (3xx redirects included, not just 4xx and 5xx)
- HTTP 200 with a body the validator rejects (rare)
- Response time > 5 seconds
The 5-second timeout is the one that catches teams off guard. If your webhook handler does anything synchronously that can take longer than 5 seconds (DB writes during contention, a third-party API call, an image resize), you'll occasionally see timeouts that get counted as failures even though the work eventually completed.
The fix is to respond 200 immediately and process async. Receive, persist the raw body to a queue, return 200, then process the work in a background job. This is the single biggest reliability improvement most apps can make.
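A minimal sketch of that receive, persist, acknowledge pattern follows. The function names (`receive_webhook`, `process_event`, `start_worker`) are hypothetical, the in-memory queue stands in for a durable one, and HTTP wiring and signature verification are left out to keep the shape visible.

```python
import queue
import threading

# In-memory queue for illustration only. A production app would use a durable
# store (database table, Redis, SQS, ...) so queued events survive a restart.
events: queue.Queue = queue.Queue()
processed: list[bytes] = []


def receive_webhook(raw_body: bytes) -> int:
    """Hypothetical handler body: store the raw payload and return at once."""
    events.put(raw_body)  # cheap enqueue, no business logic on the request path
    return 200            # status code the web framework should send back


def process_event(raw_body: bytes) -> None:
    """Placeholder for the slow work (DB writes, third-party calls, ...)."""
    processed.append(raw_body)


def worker() -> None:
    """Background loop: drain the queue outside the request/response cycle."""
    while True:
        raw_body = events.get()
        try:
            if raw_body is None:  # sentinel that tells the worker to stop
                return
            process_event(raw_body)
        finally:
            events.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The design choice that matters is that nothing slow sits between receiving the request and returning 200. If processing fails later, the payload is still in your queue and you can retry it on your own schedule instead of Shopify's.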
What happens after the 4-hour mark
After the limit is hit, two things happen:
- The specific event is dropped permanently. No retries, no further attempts. There is no admin tool to replay it.
- The subscription may be auto-unsubscribed. If failures persist across multiple events, Shopify removes the subscription entirely. New events stop arriving.
To recover from #1, you need to reconcile against the Admin API and re-derive the missed event. To recover from #2, you need to re-register the subscription via the Admin API. Both are doable, but neither happens unless you are actively monitoring for the loss.
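Both recovery paths boil down to a set difference: what Shopify has versus what you have. The sketch below assumes you already have some way to list recent order IDs and registered webhook topics from the Admin API. Those fetchers are passed in as plain callables and are hypothetical, so no specific API call is implied.

```python
from typing import Callable, Iterable


def find_missed_ids(api_ids: Iterable[str], processed_ids: Iterable[str]) -> set[str]:
    """Orders the Admin API knows about that the app never processed."""
    return set(api_ids) - set(processed_ids)


def find_missing_topics(expected: Iterable[str], registered: Iterable[str]) -> set[str]:
    """Webhook topics the app expects but that are no longer registered."""
    return set(expected) - set(registered)


def reconcile(
    fetch_recent_order_ids: Callable[[], Iterable[str]],   # hypothetical Admin API read
    load_processed_ids: Callable[[], Iterable[str]],       # hypothetical local DB read
    fetch_registered_topics: Callable[[], Iterable[str]],  # hypothetical Admin API read
    expected_topics: Iterable[str],
) -> dict[str, set[str]]:
    """One reconciliation pass: report missed orders and dropped subscriptions."""
    return {
        "missed_orders": find_missed_ids(fetch_recent_order_ids(), load_processed_ids()),
        "missing_topics": find_missing_topics(expected_topics, fetch_registered_topics()),
    }
```

Run on a schedule, a non-empty `missed_orders` set is the list of events to re-derive, and a non-empty `missing_topics` set is the list of subscriptions to re-register.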
Real-world impact
For an active Shopify Plus store doing 2,000 orders/day:
- If your endpoint has 99.9% uptime (~43 min/month of downtime), a rough estimate is ~1 order/month lost permanently from outages alone.
- Add periodic deploys, occasional 502s, slow DB queries during peak — actual loss is closer to 5–20/month for most teams.
- That's 60–240 orders/year. At an average cart of $80, that is roughly $4.8K–$19.2K of orders per year that need manual recovery.
Worse, the recovery is retrospective. You won't know about losses until customers complain or you specifically audit.
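The arithmetic behind those figures is easy to rerun with your own numbers. This is just the calculation above written out; the 30-day month and the helper names are assumptions of the sketch.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumes a 30-day month


def downtime_minutes_per_month(uptime: float) -> float:
    """Minutes of downtime per month implied by an uptime fraction."""
    return (1 - uptime) * MINUTES_PER_MONTH


def yearly_loss(lost_per_month: int, avg_cart_usd: float) -> tuple[int, float]:
    """Orders lost per year and their total value in USD."""
    orders = lost_per_month * 12
    return orders, orders * avg_cart_usd


if __name__ == "__main__":
    print(f"{downtime_minutes_per_month(0.999):.1f} min/month down")
    for monthly in (5, 20):
        orders, value = yearly_loss(monthly, 80)
        print(f"{monthly}/month -> {orders} orders/year, ${value:,.0f}")
```

With the article's inputs this gives 43.2 minutes of downtime per month, and 60 to 240 lost orders per year worth $4,800 to $19,200.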
Solutions, in order of effort
- Respond 200 immediately, process async. Eliminates the 5s-timeout-as-failure mode. Most impactful single change.
- Add a periodic Admin API reconciliation. Run hourly for orders. Detect what you missed.
- Monitor subscription health. Daily GET to the webhook subscriptions endpoint. Alert if any are missing.
- Use a webhook reliability layer. Tools like HookRescue sit between Shopify and your handler, extending the retry curve to 7 days, reconciling automatically, and surfacing every miss before it becomes a support ticket.