Webhooks

Receive scraping results automatically as soon as they're ready. No polling required.

Overview

Webhooks let Scrpy push data to your server in real time. When a scrape completes, Scrpy sends an HTTP POST request with the results to the URL you specify.

Tip: Webhooks are ideal for jobs and schedules where you don't want to poll for results.

Webhook Payload

Scrpy sends a POST request with the following JSON payload:

{
  "event": "scrape.completed",
  "timestamp": "2024-01-15T10:30:00Z",
  "data": {
    "jobId": "job_abc123",
    "url": "https://example.com/product",
    "results": {
      "title": "Product Name",
      "price": "$99.99"
    },
    "metadata": {
      "statusCode": 200,
      "duration": 1234
    }
  }
}
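
As a rough illustration of reading this payload, the sketch below parses a raw request body and pulls out the fields shown in the example. The function name `parseScrapeEvent` is hypothetical, and the shape check covers only the fields in this example payload; other event types may carry different data.

```javascript
// Hypothetical helper: parse a raw webhook body and extract the fields
// shown in the example payload. Returns null if the body is not valid
// JSON or lacks the expected top-level fields.
function parseScrapeEvent(rawBody) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return null; // body was not valid JSON
  }

  const hasShape =
    payload !== null &&
    typeof payload === 'object' &&
    typeof payload.event === 'string' &&
    payload.data !== null &&
    typeof payload.data === 'object';
  if (!hasShape) return null;

  const { jobId, url, results, metadata } = payload.data;
  return {
    event: payload.event,
    timestamp: payload.timestamp,
    jobId,
    url,
    results: results ?? {},
    statusCode: metadata?.statusCode,
    duration: metadata?.duration, // unit is not stated in the example
  };
}
```

Returning null instead of throwing keeps the calling code simple: a malformed body can be logged and acknowledged without a try/catch at every call site.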

Event Types

Event              Description
scrape.completed   Single scrape or job URL completed successfully
scrape.failed      Scrape failed after retries
job.completed      All URLs in a job finished processing
job.failed         Job failed to complete
schedule.run       Scheduled scrape completed
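
One way to act on these event types is a small dispatch table keyed by the `event` field. The sketch below is an assumption about how a receiver might be organized, not part of Scrpy; the handler bodies are placeholders that only return a label, and `dispatchEvent` is a hypothetical name.

```javascript
// Placeholder handlers for each documented event type. Replace the
// bodies with your own logic; they return labels only for illustration.
const eventHandlers = new Map([
  ['scrape.completed', (payload) => 'stored scrape results'],
  ['scrape.failed', (payload) => 'recorded scrape failure'],
  ['job.completed', (payload) => 'marked job as finished'],
  ['job.failed', (payload) => 'alerted on failed job'],
  ['schedule.run', (payload) => 'recorded scheduled run'],
]);

function dispatchEvent(payload) {
  const handler = eventHandlers.get(payload.event);
  if (!handler) {
    // Ignore event types this receiver does not know about instead of
    // failing, so an unrecognized event does not lead to repeated retries.
    return 'ignored';
  }
  return handler(payload);
}
```

Ignoring unknown event types is a design choice: it lets the receiver keep acknowledging deliveries if new event types appear later.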

Signature Verification

All webhook requests include a signature header for verification. Always verify signatures in production.

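Headers

X-Scrpy-Signature: sha256=abc123...
X-Scrpy-Timestamp: 1705312200

Node.js Verification

import crypto from 'crypto';

// payload:   request body exactly as received (string or Buffer),
//            assumed here to be what the signature was computed over
// signature: value of the X-Scrpy-Signature header
// secret:    your webhook signing secret
function verifyWebhook(payload, signature, secret) {
  if (typeof signature !== 'string') return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(payload)
    .digest('hex');

  const received = Buffer.from(signature);
  const computed = Buffer.from(`sha256=${expected}`);

  // timingSafeEqual throws if the buffers differ in length, so check first
  return (
    received.length === computed.length &&
    crypto.timingSafeEqual(received, computed)
  );
}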
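
The `X-Scrpy-Timestamp` value in the example looks like a Unix timestamp in seconds. This page does not say how it should be used; one common pattern, sketched here as an assumption, is to reject requests whose timestamp is too far from the current time, which limits replay of captured requests. The function name `isFreshTimestamp` and the five-minute tolerance are illustrative choices, not Scrpy settings.

```javascript
// Assumption: the header carries Unix time in seconds.
// Returns true when the timestamp is within `toleranceSeconds` of now.
function isFreshTimestamp(headerValue, nowMs = Date.now(), toleranceSeconds = 300) {
  const sentSeconds = Number(headerValue);

  // Rejects missing, empty, non-numeric, and non-positive values.
  if (!Number.isInteger(sentSeconds) || sentSeconds <= 0) return false;

  // Use the absolute difference so modest clock skew in either
  // direction is tolerated.
  const ageSeconds = Math.abs(nowMs / 1000 - sentSeconds);
  return ageSeconds <= toleranceSeconds;
}
```

A check like this complements signature verification rather than replacing it, since the timestamp header alone can be forged.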

Best Practices

Return 200 Quickly

Respond with HTTP 200 immediately, then process the data asynchronously. Scrpy will retry on non-2xx responses.
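
A minimal sketch of this acknowledge-first pattern, under stated assumptions: `receiveWebhook`, `drainPendingEvents`, and the `respond` callback are hypothetical names, and `respond` stands in for however your web framework sends a status code. The in-memory array is for illustration only.

```javascript
// In-memory queue for illustration only; a durable queue is safer in
// production because this one is lost if the process restarts.
const pendingEvents = [];

// Called from the HTTP request handler. Does only cheap, synchronous
// work before acknowledging the delivery.
function receiveWebhook(payload, respond) {
  pendingEvents.push(payload);
  respond(200); // acknowledge before any slow processing starts
}

// Run outside the request path (for example on a timer or in a worker).
// Returns how many events were processed successfully.
async function drainPendingEvents(processEvent) {
  let processed = 0;
  while (pendingEvents.length > 0) {
    const payload = pendingEvents.shift();
    try {
      await processEvent(payload);
      processed += 1;
    } catch (err) {
      // A failure here no longer affects the HTTP response, so log it
      // and continue with the next event.
      console.error('webhook processing failed:', err);
    }
  }
  return processed;
}
```

Because the response is sent before processing begins, a slow or failing processing step cannot cause a timeout or a non-2xx status on the delivery itself.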

Handle Duplicates

Network issues and retries can cause the same event to be delivered more than once. Use the event ID or job ID to deduplicate.
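
The example payload on this page does not show a dedicated event ID, so the sketch below builds a deduplication key from the event type, job ID, and URL. That key format is an assumption, as are the names `dedupKey` and `isDuplicate` and the 48-hour retention, which is chosen to outlast the retry schedule described under Retry Policy.

```javascript
// key -> expiry time in milliseconds since the Unix epoch
const seenEvents = new Map();

// Assumed key format: event type, job ID, and URL joined together.
function dedupKey(payload) {
  const { jobId = '', url = '' } = payload.data ?? {};
  return [payload.event, jobId, url].join('|');
}

// Returns true if an equivalent event was already seen within the TTL;
// otherwise records the event and returns false.
function isDuplicate(payload, nowMs = Date.now(), ttlMs = 48 * 60 * 60 * 1000) {
  // Drop expired entries so the map does not grow without bound.
  for (const [key, expiresAt] of seenEvents) {
    if (expiresAt <= nowMs) seenEvents.delete(key);
  }

  const key = dedupKey(payload);
  if (seenEvents.has(key)) return true;

  seenEvents.set(key, nowMs + ttlMs);
  return false;
}
```

An in-process map only works for a single server instance; with several instances, the same idea applies to a shared store with key expiry.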

Verify Signatures

Always verify webhook signatures to ensure requests are from Scrpy.

Use HTTPS

Webhook endpoints must use HTTPS to protect data in transit.
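
Before registering an endpoint, a quick local check can catch a plain `http://` URL early. This is a sketch with a hypothetical function name, `isHttpsUrl`; how Scrpy itself enforces the HTTPS requirement is not described here.

```javascript
// Returns true only for absolute URLs whose scheme is https.
function isHttpsUrl(candidate) {
  try {
    return new URL(candidate).protocol === 'https:';
  } catch {
    return false; // not a parseable absolute URL
  }
}
```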

Retry Policy

If your endpoint returns a non-2xx status or does not respond within 30 seconds, Scrpy retries the delivery up to five times with increasing backoff:

  • Retry 1: After 1 minute
  • Retry 2: After 5 minutes
  • Retry 3: After 30 minutes
  • Retry 4: After 2 hours
  • Retry 5: After 24 hours (final)
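
To get a feel for how long a delivery can stay in flight, the sketch below adds up the delays listed above. It assumes each delay is measured from the previous failed attempt, which this page does not state; if the delays are instead measured from the original delivery, the final retry simply lands 24 hours after it. Time spent waiting on timeouts is ignored, and the function names are illustrative.

```javascript
// Delays from the retry list, in minutes: 1 min, 5 min, 30 min, 2 h, 24 h.
const retryDelaysMinutes = [1, 5, 30, 120, 1440];

// Running total of the delays, assuming each one starts when the
// previous attempt fails.
function cumulativeRetryOffsets(delays) {
  let elapsed = 0;
  return delays.map((delay) => (elapsed += delay));
}

function formatMinutes(total) {
  const hours = Math.floor(total / 60);
  const minutes = total % 60;
  return `${hours}h ${minutes}m`;
}

const retryOffsets = cumulativeRetryOffsets(retryDelaysMinutes);
retryOffsets.forEach((offset, index) => {
  console.log(`Retry ${index + 1}: about ${formatMinutes(offset)} after the first attempt`);
});
```

Under that assumption the final retry lands about 26h 36m after the first attempt, which is a useful lower bound for how long to keep deduplication records.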
