Your First Scrape
Learn how to extract data from a website using Scrpy. We'll walk through a complete example from start to finish.
What We'll Build
In this tutorial, we'll scrape a product page and extract:
- Product title
- Price
- Description
- Image URL
- Availability status
Prerequisites
- A Scrpy account with an API key
- Basic understanding of CSS selectors
- Python, Node.js, or cURL installed
Step 1: Inspect the Target Page
Before writing any code, use your browser's Developer Tools (F12) to inspect the page structure. Identify the CSS selectors for each piece of data you want to extract.
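Pro Tip: In the Elements panel, right-click an element and choose Copy > Copy selector to get a CSS selector automatically. Generated selectors are often long and brittle, so shorten them to a stable class or ID where you can.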
Step 2: Define Your Selectors
Map out the selectors for each data point:
{
  "selectors": {
    "title": "h1.product-title",
    "price": ".price-current",
    "description": ".product-description p",
    "image": ".product-image img::attr(src)",
    "availability": ".stock-status::text"
  }
}
Step 3: Make the API Request
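Before sending anything, it can help to keep the selector map in one place and build the request body from it. The following is a minimal sketch, not part of Scrpy: `build_payload` is a hypothetical helper name, and it only assembles the `url` and `selectors` keys that appear in this tutorial's request examples.

```python
import json

# Selector map as used in the request examples in this tutorial.
SELECTORS = {
    "title": "h1.product-title",
    "price": ".price-current",
    "description": ".product-description p",
    "image": ".product-image img::attr(src)",
    "availability": ".stock-status",
}


def build_payload(url, selectors):
    """Return the JSON-serializable request body for one page.

    Hypothetical helper: it only assembles the `url` and `selectors`
    keys shown in this tutorial and rejects obviously bad input early.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    blank = [name for name, css in selectors.items() if not css.strip()]
    if blank:
        raise ValueError(f"selectors with no CSS expression: {blank}")
    return {"url": url, "selectors": dict(selectors)}


payload = build_payload("https://example-store.com/product/123", SELECTORS)
print(json.dumps(payload, indent=2))
```

The resulting `payload` can be passed as the `json=` argument of `requests.post`, in place of the inline dictionary.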
Python:
import requests

API_KEY = "sk_live_your_api_key"

response = requests.post(
    "https://api.scrpy.co/v1/scrape",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "url": "https://example-store.com/product/123",
        "selectors": {
            "title": "h1.product-title",
            "price": ".price-current",
            "description": ".product-description p",
            "image": ".product-image img::attr(src)",
            "availability": ".stock-status"
        }
    }
)
data = response.json()
print(data)
Node.js:
const response = await fetch('https://api.scrpy.co/v1/scrape', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_live_your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://example-store.com/product/123',
    selectors: {
      title: 'h1.product-title',
      price: '.price-current',
      description: '.product-description p',
      image: '.product-image img::attr(src)',
      availability: '.stock-status'
    }
  })
});
const data = await response.json();
console.log(data);
Step 4: Handle the Response
Scrpy returns a structured JSON response:
{
  "success": true,
  "data": {
    "title": "Wireless Bluetooth Headphones",
    "price": "$79.99",
    "description": "Premium sound quality with active noise cancellation.",
    "image": "https://example-store.com/images/headphones.jpg",
    "availability": "In Stock"
  },
  "metadata": {
    "url": "https://example-store.com/product/123",
    "statusCode": 200,
    "duration": 1234,
    "timestamp": "2024-01-15T10:30:00Z"
  }
}
Common Issues
Empty Data Fields
If a field comes back empty, re-check its selector against the live page in your browser's DevTools. The page structure may have changed, or the selector may no longer match anything.
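Empty fields are easier to catch in code than by reading the printed response. The sketch below checks a parsed response of the shape shown in Step 4 and lists the fields that need attention. The tutorial does not say how an empty field is represented, so this assumes it may be a missing key, `null`, an empty string, or an empty list; `is_empty` and `empty_fields` are hypothetical helper names.

```python
def is_empty(value):
    """Treat None, blank strings, and empty lists or dicts as empty."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, dict)):
        return len(value) == 0
    return False


def empty_fields(result, expected):
    """Return the expected field names that are missing or empty in result["data"]."""
    if not result.get("success"):
        raise RuntimeError("scrape did not succeed")
    data = result.get("data") or {}
    return [name for name in expected if is_empty(data.get(name))]


# Sample response: "price" is blank, "image" is absent, "availability" is null.
sample = {
    "success": True,
    "data": {
        "title": "Wireless Bluetooth Headphones",
        "price": "",
        "availability": None,
    },
}
missing = empty_fields(sample, ["title", "price", "image", "availability"])
print(missing)  # ['price', 'image', 'availability']
```

Each name in the returned list corresponds to a selector worth re-checking in DevTools.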
Dynamic Content Not Loading
If the data is rendered client-side, as on single-page apps and other dynamic pages, enable JavaScript rendering with "render": true.
Getting Blocked
Enable anti-bot bypass with "antiBot": true for protected sites.
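The two options above can be combined into a simple escalation: send a plain request first, and only turn on `render`, then `antiBot`, if fields still come back empty. This is a sketch under assumptions. The tutorial does not show where the flags go, so the code assumes they are top-level keys in the request body next to `url` and `selectors`. `send` is a hypothetical callable standing in for the `requests.post` call from Step 3, so the logic can be run without network access.

```python
def scrape_with_fallbacks(send, url, selectors):
    """Try progressively heavier options until every field has a value.

    `send` takes a request body (dict) and returns the parsed JSON response.
    Assumption: "render" and "antiBot" are top-level request-body keys.
    Returns the last response together with the options that produced it.
    """
    attempts = [{}, {"render": True}, {"render": True, "antiBot": True}]
    result = None
    for options in attempts:
        body = {"url": url, "selectors": selectors, **options}
        result = send(body)
        data = result.get("data") or {}
        if result.get("success") and all(data.get(name) for name in selectors):
            return result, options
    return result, attempts[-1]


# Stand-in for the real API call: pretends the page needs JavaScript rendering.
def fake_send(body):
    if body.get("render"):
        return {"success": True, "data": {"title": "Wireless Bluetooth Headphones"}}
    return {"success": True, "data": {"title": ""}}


result, used = scrape_with_fallbacks(
    fake_send,
    "https://example-store.com/product/123",
    {"title": "h1.product-title"},
)
print(used)  # {'render': True}
```

Starting with the plain request keeps the heavier options for pages that actually need them.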