A technical guide to webhooks: the push vs pull model, HMAC-SHA256 signature verification, idempotency, the 5xx retry problem, delivery ordering guarantees, and how to build a reliable webhook handler that doesn't process events twice.
Webhooks are conceptually simple — a server POSTs to your URL when something happens — but getting them right in production requires handling a set of failure modes that aren't obvious until they bite you. This guide covers how webhooks work, why they fail, and how to build a handler that's reliable under the actual conditions of production systems.
The alternative to webhooks is polling: your application periodically asks "did anything change?" The problems with polling are well understood — you're either checking too frequently (wasting resources, hitting rate limits) or not frequently enough (delayed reactions to events).
Webhooks invert this. Instead of you asking "did anything change?", the external service notifies you when something changes. Push instead of pull.
The basic flow:
1. You register a webhook endpoint: POST https://api.stripe.com/v1/webhook_endpoints
   { url: "https://yourapp.com/webhooks/stripe", enabled_events: ["payment_intent.succeeded"] }
2. User pays on your app → Stripe processes the payment
3. Stripe sends:
   POST https://yourapp.com/webhooks/stripe
   {
     "id": "evt_1PxK2rLkdIwHu7ixoZVBFpXs",
     "type": "payment_intent.succeeded",
     "data": { "object": { ... payment intent ... } }
   }
4. Your handler processes the event, returns HTTP 200
5. Stripe marks the event as delivered
A key characteristic: the sender doesn't care what you do with the webhook. Stripe doesn't know if you sent the user a confirmation email, updated your database, or threw the event away. It only knows whether your server responded with a 2xx status code within its timeout window. Everything else is your problem.
This decoupling is both the power and the source of most webhook bugs.
If you expose a webhook endpoint at https://yourapp.com/webhooks/stripe, anyone can POST to it. Without verification, an attacker could forge payment events and trigger your fulfillment logic for free. This is not a theoretical risk.
Webhook providers solve this with HMAC signatures. Stripe, GitHub, and most others sign each request using HMAC-SHA256 with a secret key that only you and the provider know.
How HMAC-SHA256 signing works:
signature = HMAC-SHA256(message, secret)
Stripe's implementation includes a timestamp to prevent replay attacks:
signed_payload = timestamp + "." + raw_request_body
signature = HMAC-SHA256(signed_payload, endpoint_signing_secret)
The signature goes in the Stripe-Signature header:
Stripe-Signature: t=1714500000,v1=abc123...,v1=def456...
(Multiple v1= values appear when Stripe rotates secrets — both old and new signatures are included during the transition period.)
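When testing a handler, it can help to produce a header of this shape yourself. The sketch below assumes the scheme described above (HMAC-SHA256 over `timestamp.body`, hex-encoded). `buildSignatureHeader` and the secret value are hypothetical names for illustration and are not part of Stripe's SDK.

```javascript
const { createHmac } = require('node:crypto');

// Hypothetical helper: builds a Stripe-style signature header for a raw body.
function buildSignatureHeader(rawBody, secret, timestamp) {
  const signedPayload = `${timestamp}.${rawBody}`;
  const signature = createHmac('sha256', secret)
    .update(signedPayload, 'utf8')
    .digest('hex');
  return `t=${timestamp},v1=${signature}`;
}

// Made-up body, secret, and timestamp for illustration.
const header = buildSignatureHeader('{"id":"evt_test"}', 'whsec_example', 1714500000);
console.log(header); // t=1714500000,v1=<64 hex characters>
```

Because the timestamp is part of the signed payload, changing either the body or the timestamp produces a different signature.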
Verification in Node.js:
import crypto from 'crypto';
function verifyStripeWebhook(rawBody, signatureHeader, secret) {
  // Parse the header into its timestamp and v1 signatures
  const parts = signatureHeader.split(',');
  const timestamp = parts.find(p => p.startsWith('t='))?.slice(2);
  const signatures = parts
    .filter(p => p.startsWith('v1='))
    .map(p => p.slice(3));
  if (!timestamp || signatures.length === 0) {
    throw new Error('Malformed Stripe-Signature header');
  }
  // Reconstruct the signed payload
  const signedPayload = `${timestamp}.${rawBody}`;
  // Compute expected signature as raw bytes
  const expected = crypto
    .createHmac('sha256', secret)
    .update(signedPayload, 'utf8')
    .digest();
  // Constant-time comparison (prevents timing attacks).
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  const isValid = signatures.some(sig => {
    const candidate = Buffer.from(sig, 'hex');
    return candidate.length === expected.length &&
      crypto.timingSafeEqual(candidate, expected);
  });
  if (!isValid) {
    throw new Error('Invalid webhook signature');
  }
  // Check timestamp freshness (prevent replay attacks)
  const tolerance = 300; // 5 minutes
  const ageSeconds = Math.abs(Date.now() / 1000 - Number(timestamp));
  if (!Number.isFinite(ageSeconds) || ageSeconds > tolerance) {
    throw new Error('Webhook timestamp outside tolerance');
  }
  return JSON.parse(rawBody);
}
Critical implementation detail: you must verify against the raw request body bytes, not the parsed JSON. Your HTTP framework may parse the body before your handler sees it, and re-serializing the parsed object can change whitespace or key order. If you compute the HMAC over JSON.stringify(req.body), you may be hashing a different string than the one that was signed. In Express, use express.raw({ type: 'application/json' }) as middleware for your webhook route instead of express.json().
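To see why the raw bytes matter, the sketch below computes an HMAC over a body with irregular spacing and compares it with the HMAC of the same JSON after a parse and stringify round trip. The body, the secret, and the `hmacHex` helper are made-up values for illustration.

```javascript
const { createHmac } = require('node:crypto');

// Hypothetical helper: hex-encoded HMAC-SHA256 of a string.
const hmacHex = (secret, text) =>
  createHmac('sha256', secret).update(text, 'utf8').digest('hex');

// What the provider sent and signed (note the spacing inside the JSON).
const rawBody = '{"id": "evt_test",  "amount": 100}';
const secret = 'whsec_example'; // hypothetical secret

// What you get back after a parse/stringify round trip.
const reserialized = JSON.stringify(JSON.parse(rawBody));

console.log(rawBody === reserialized); // false
console.log(hmacHex(secret, rawBody) === hmacHex(secret, reserialized)); // false
```

The two strings describe the same object, but the signature only matches the exact bytes that were sent.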
Stripe's SDK wraps all of this:
const event = stripe.webhooks.constructEvent(rawBody, signatureHeader, endpointSecret);
Use the SDK if available. The manual implementation above is for understanding, not for production.
Webhook delivery is at-least-once. Stripe's documentation says explicitly: "Stripe attempts to deliver your webhooks for up to three days with an exponential back off." During those retries, the same event can arrive more than once, so your handler has to expect duplicates.
If your handler is not idempotent — if running it twice on the same event causes double-sending an email, double-charging a user, or creating a duplicate record — you have a production reliability problem waiting to surface.
The standard idempotency pattern:
async function handleWebhookEvent(event) {
  const eventId = event.id; // e.g., "evt_1PxK2rLkdIwHu7ixoZVBFpXs"
  // Check if already processed
  const existing = await db.webhookEvents.findUnique({
    where: { eventId }
  });
  if (existing?.processedAt) {
    console.log(`Skipping duplicate event ${eventId}`);
    return; // Already processed: return successfully
  }
  // Record receipt. On its own, this upsert does not stop two concurrent
  // deliveries from both reaching processEvent (see the note on atomicity).
  await db.webhookEvents.upsert({
    where: { eventId },
    create: { eventId, receivedAt: new Date() },
    update: { receivedAt: new Date() }
  });
  // Process the event
  await processEvent(event);
  // Mark as complete
  await db.webhookEvents.update({
    where: { eventId },
    data: { processedAt: new Date() }
  });
}
Store the processed event IDs in a database table, check before processing, and mark as complete after. The check-and-mark should ideally be atomic (a database transaction with a unique constraint on event_id) so that concurrent deliveries of the same event cannot both proceed.
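One way to make the check-and-mark atomic is to let a single operation both test and record the claim. The sketch below uses an in-memory Map as a stand-in for a table with a unique constraint on the event ID. `claimEvent`, `handleOnce`, and the store are hypothetical; in a real database the claim would typically be an insert that is rejected or skipped on conflict.

```javascript
// In-memory stand-in for a table with a unique constraint on eventId.
const claims = new Map(); // eventId -> 'processing' | 'done'

// Returns true only for the first caller. The check and the set run
// synchronously, so another delivery cannot interleave between them.
function claimEvent(eventId) {
  if (claims.has(eventId)) return false;
  claims.set(eventId, 'processing');
  return true;
}

async function handleOnce(event, processEvent) {
  if (!claimEvent(event.id)) return 'skipped';
  try {
    await processEvent(event);
    claims.set(event.id, 'done');
    return 'processed';
  } catch (err) {
    claims.delete(event.id); // release the claim so a retry can try again
    throw err;
  }
}

// Two concurrent deliveries of the same event: only one runs processEvent.
let runs = 0;
const slowProcess = async () => {
  await new Promise(resolve => setTimeout(resolve, 10));
  runs += 1;
};
const demo = Promise.all([
  handleOnce({ id: 'evt_1' }, slowProcess),
  handleOnce({ id: 'evt_1' }, slowProcess),
]);
demo.then(results => console.log(results, runs)); // [ 'processed', 'skipped' ] 1
```

Note the trade-off in this sketch: a delivery that is skipped while the first one is still in flight reports success before the outcome of the first is known.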
Here's the failure mode that trips up most implementations:
Stripe sends event → Your handler starts processing
Your handler calls Mailgun to send email → Mailgun returns 503
Your handler returns 500 to Stripe → Stripe retries
Stripe sends same event again → Your handler sends the email successfully, but a later step fails and it returns 500
Stripe sends same event AGAIN → Your handler sends the email a second time
→ User receives two emails
The problem: you're doing work before returning the 200. Any error in that work causes a non-2xx response, which triggers a retry, which may process the event again.
The correct pattern:
1. Receive webhook
2. Verify signature
3. Store the raw event in your database (this is fast and reliable)
4. Return 200 immediately
5. A background worker picks up the event and processes it asynchronously
// Webhook handler: fast path (assumes app, stripe, and db are initialized elsewhere).
// express.raw keeps req.body as the raw Buffer that signature verification needs.
app.post('/webhooks/stripe', express.raw({ type: 'application/json' }), async (req, res) => {
  let event;
  try {
    event = stripe.webhooks.constructEvent(
      req.body,
      req.headers['stripe-signature'],
      process.env.STRIPE_WEBHOOK_SECRET
    );
  } catch (err) {
    return res.status(400).send(`Webhook verification failed: ${err.message}`);
  }
  try {
    // Store for async processing with an idempotent upsert
    await db.incomingWebhookEvents.upsert({
      where: { eventId: event.id },
      create: {
        eventId: event.id,
        eventType: event.type,
        payload: event,
        receivedAt: new Date(),
      },
      update: {} // Already exists: do nothing
    });
  } catch (err) {
    // Could not persist the event: return 5xx so the provider retries later
    return res.status(500).send('Failed to store event');
  }
  res.json({ received: true }); // Return 200 fast
});
// Background worker: processes events asynchronously.
// processEvent should itself be idempotent, because a crash between processing
// and marking the record as processed causes the event to be picked up again.
async function processWebhookQueue() {
  const unprocessed = await db.incomingWebhookEvents.findMany({
    where: { processedAt: null },
    orderBy: { receivedAt: 'asc' }
  });
  for (const record of unprocessed) {
    try {
      await processEvent(record.payload);
      await db.incomingWebhookEvents.update({
        where: { id: record.id },
        data: { processedAt: new Date() }
      });
    } catch (err) {
      // Record the failure and leave processedAt null so the next run retries
      await db.incomingWebhookEvents.update({
        where: { id: record.id },
        data: { lastError: err.message, errorCount: { increment: 1 } }
      });
    }
  }
}
This pattern makes your webhook handler nearly indestructible. For a validly signed event, the only thing that can cause a non-2xx response is a database failure when storing the raw event, and if your database is down, you have bigger problems.
Webhook providers do not guarantee events arrive in the order they occurred. Network routing, retry timing, and infrastructure quirks mean you might receive payment_intent.payment_failed before payment_intent.created. Or customer.subscription.updated (new plan) before the initial customer.subscription.created.
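To make the ordering problem concrete, here is a minimal sketch of a guard that ignores stale events by comparing each event's `created` timestamp with the newest one already applied for the same object. `applyIfNewer`, the in-memory `lastApplied` map, and the simplified `plan` field are hypothetical; `created` is assumed to be a Unix timestamp in seconds.

```javascript
// objectId -> `created` timestamp of the newest event applied so far
const lastApplied = new Map();

// Applies the event only if it is not older than what was already applied
// for the same object. Returns true if applied, false if ignored as stale.
function applyIfNewer(event, apply) {
  const objectId = event.data.object.id;
  const previous = lastApplied.get(objectId);
  if (previous !== undefined && event.created < previous) {
    return false; // an older event arrived late
  }
  apply(event);
  lastApplied.set(objectId, event.created);
  return true;
}

// Sample events with made-up IDs and a simplified object shape.
const updatedEvent = {
  type: 'customer.subscription.updated',
  created: 1714500060,
  data: { object: { id: 'sub_123', plan: 'pro' } },
};
const createdEvent = {
  type: 'customer.subscription.created',
  created: 1714500000,
  data: { object: { id: 'sub_123', plan: 'basic' } },
};

let plan = null;
const setPlan = e => { plan = e.data.object.plan; };

console.log(applyIfNewer(updatedEvent, setPlan)); // true
console.log(applyIfNewer(createdEvent, setPlan)); // false (arrived late)
console.log(plan); // pro
```

Events created within the same second still fall back to arrival order in this sketch, which is one reason to confirm against the API before acting.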
Your handler must be designed to handle out-of-order events. Strategies:
Use the event's created timestamp to determine ordering when conflicts arise, not arrival order.
Fetch the current state from the API (for example, stripe.subscriptions.retrieve(id)) to confirm before acting. The webhook is a notification; the API is the source of truth.
Understanding how your providers retry helps you design your error handling:
| Provider | Initial timeout | Retry schedule | Max attempts |
|---|---|---|---|
| Stripe | 30 seconds | Exponential backoff over 3 days | ~17 |
| GitHub | 10 seconds | No automatic retries; failed deliveries must be redelivered manually or via the API | 1 |
| Svix | 5 seconds | Exponential: immediately, 5s, 5m, 30m, 2h, 5h, 10h, 10h | 8 |
| SendGrid | 3 seconds | Not documented explicitly | Varies |
If your endpoint consistently returns 5xx errors, Stripe will eventually disable the endpoint and notify you. You can manually resend events from the Stripe Dashboard — useful for recovering from outages.
For production systems, you need visibility into what's been received and what's failed.
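What to log per event: the event ID and type, the raw payload, when it was received and when it was processed, and the last error and error count for failed attempts (the fields stored by the handler above).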
This gives you the ability to manually re-process failed events, audit what happened during an incident, and detect patterns (specific event types failing, a provider timing out consistently).
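As an illustration of pattern detection, the sketch below groups failed records by event type. The record shape (`eventType`, `processedAt`, `errorCount`) mirrors the fields used in the handler above; `summarizeFailures` and the sample data are hypothetical.

```javascript
// Counts unprocessed records with at least one error, grouped by event type.
function summarizeFailures(records) {
  const counts = {};
  for (const record of records) {
    if (record.processedAt === null && record.errorCount > 0) {
      counts[record.eventType] = (counts[record.eventType] || 0) + 1;
    }
  }
  return counts;
}

// Made-up sample records for illustration.
const sample = [
  { eventType: 'payment_intent.succeeded', processedAt: new Date(), errorCount: 0 },
  { eventType: 'invoice.paid', processedAt: null, errorCount: 3 },
  { eventType: 'invoice.paid', processedAt: null, errorCount: 1 },
  { eventType: 'customer.created', processedAt: null, errorCount: 0 },
];

console.log(summarizeFailures(sample)); // { 'invoice.paid': 2 }
```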
For teams using a message queue (BullMQ, SQS) as the async processing layer, the queue's built-in dead letter mechanism handles failed jobs after max retries — they land in a DLQ for manual inspection.
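The dead-letter idea can be sketched without any particular queue library. The code below is a plain in-memory illustration, not the BullMQ or SQS API: `runWithDeadLetter`, `MAX_ATTEMPTS`, and the `deadLetters` array are hypothetical names, and a real queue would also wait between attempts.

```javascript
const MAX_ATTEMPTS = 3; // hypothetical retry budget
const deadLetters = []; // stand-in for a dead letter queue

// Runs a job up to MAX_ATTEMPTS times. After the last failure the job is
// parked in deadLetters with its final error instead of being retried forever.
async function runWithDeadLetter(job, handler) {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      await handler(job);
      return { ok: true, attempts: attempt };
    } catch (err) {
      if (attempt === MAX_ATTEMPTS) {
        deadLetters.push({ job, lastError: err.message, attempts: attempt });
        return { ok: false, attempts: attempt };
      }
    }
  }
}

const alwaysFails = async () => {
  throw new Error('downstream unavailable');
};
const dlqDemo = runWithDeadLetter({ eventId: 'evt_1' }, alwaysFails);
dlqDemo.then(result => console.log(result, deadLetters.length));
// { ok: false, attempts: 3 } 1
```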
The challenge with local development: Stripe can't POST to localhost:3000. Three common solutions:
ngrok: Creates a public tunnel to your local server.
ngrok http 3000
# Outputs: https://abc123.ngrok-free.app → localhost:3000
Register https://abc123.ngrok-free.app/webhooks/stripe as your Stripe webhook URL for development. ngrok's dashboard at localhost:4040 shows every request and response, lets you inspect the raw bodies, and lets you replay requests, which is invaluable for debugging.
Stripe CLI: The official approach for Stripe specifically.
stripe listen --forward-to localhost:3000/webhooks/stripe
The Stripe CLI creates a secure connection to Stripe's servers and forwards webhook events to your local handler. It also prints the webhook signing secret for your local session. Trigger test events:
stripe trigger payment_intent.succeeded
smee.io: A free webhook proxy that works like ngrok for any provider, not just Stripe. Less feature-rich but no installation required.