Basic Selectors

Selector	Example	Selects
`element`	`h1`	All <h1> elements
`.class`	`.product-title`	Elements with class="product-title"
`#id`	`#main-content`	Element with id="main-content"
`[attribute]`	`[data-price]`	Elements with data-price attribute
`[attr=value]`	`[type="submit"]`	Elements where type="submit"

Combinators

Combinator	Example	Selects
Descendant (space)	`div p`	All p inside div
Child (>)	`ul > li`	Direct li children of ul
Adjacent (+)	`h1 + p`	p immediately following h1 (its next sibling)
Sibling (~)	`h1 ~ p`	All p siblings after h1

Pseudo-Selectors

Pseudo	Example	Selects
:first-child	`li:first-child`	li that is the first child of its parent
:last-child	`li:last-child`	li that is the last child of its parent
:nth-child(n)	`tr:nth-child(2)`	tr that is the second child of its parent
:not()	`p:not(.hidden)`	p without class hidden
:contains()	`a:contains("Buy")`	Links containing "Buy" (non-standard; browsers reject it in querySelectorAll)

Scrpy Selector Extensions

Scrpy extends standard CSS selectors with extraction modifiers that control what is returned for each match:

Extension	Example	Extracts
::text	`h1::text`	Text content of h1
::attr(name)	`a::attr(href)`	href attribute value
::html	`.content::html`	Inner HTML of element
::all	`li::all`	Array of all matches
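
To make the modifier syntax concrete, here is a minimal sketch of how such a selector string could be split into a plain CSS part and its extraction modifiers. The function `splitSelector` and the shape of its return value are hypothetical and written only for illustration; this is not Scrpy's actual parser.

```javascript
// Hypothetical helper, not part of Scrpy: splits a selector such as
// ".gallery img::attr(src)::all" into plain CSS plus extraction modifiers.
function splitSelector(selector) {
  const result = { css: selector, extract: null, attr: null, all: false };
  // Matches one trailing modifier: ::text, ::html, ::all or ::attr(name).
  const modifier = /::(text|html|all|attr\(([^)]+)\))$/;
  let match;
  // Peel modifiers off the end of the string, one at a time.
  while ((match = result.css.match(modifier)) !== null) {
    const name = match[1];
    if (name === "all") {
      result.all = true;
    } else if (name.startsWith("attr(")) {
      result.extract = "attr";
      result.attr = match[2];
    } else {
      result.extract = name; // "text" or "html"
    }
    result.css = result.css.slice(0, match.index);
  }
  return result;
}

console.log(splitSelector(".gallery img::attr(src)::all"));
// → { css: ".gallery img", extract: "attr", attr: "src", all: true }
```

The `css` part is what a standard CSS engine would understand; everything after it describes what to pull out of each matched element.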

Real-World Examples

E-commerce Product

{
  "selectors": {
    "title": "h1.product-name::text",
    "price": "[data-price]::attr(data-price)",
    "originalPrice": ".price-was::text",
    "images": ".gallery img::attr(src)::all",
    "description": ".description p::text",
    "rating": ".star-rating::attr(data-rating)",
    "reviewCount": ".review-count::text"
  }
}

Job Listing

{
  "selectors": {
    "title": "h1.job-title::text",
    "company": ".company-name a::text",
    "location": ".job-location::text",
    "salary": ".salary-range::text",
    "description": ".job-description::html",
    "requirements": ".requirements li::text::all",
    "postedDate": "time::attr(datetime)"
  }
}

News Article

{
  "selectors": {
    "headline": "article h1::text",
    "author": "[rel='author']::text",
    "publishDate": "article time::attr(datetime)",
    "content": "article .body p::text::all",
    "tags": ".tags a::text::all",
    "image": "article figure img::attr(src)"
  }
}

Tips for Better Selectors

Be Specific, Not Fragile

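Prefer semantic class names to deeply nested structural selectors. `.product-price` keeps working when the page layout changes, while `div > div > span:nth-child(2)` breaks as soon as a wrapper element is added or reordered.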
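
One rough way to spot fragile selectors is to count how many structural hints they contain. The sketch below is an assumed heuristic, not a Scrpy feature: the function `fragilityScore` is hypothetical, and it counts characters naively, so a `>` inside a quoted attribute value would be miscounted.

```javascript
// Hypothetical heuristic: counts child combinators and positional
// pseudo-classes, which tend to break when page markup is rearranged.
function fragilityScore(selector) {
  const childCombinators = (selector.match(/>/g) || []).length;
  const positional = (selector.match(/:nth-(child|of-type)\(/g) || []).length;
  return childCombinators + positional;
}

console.log(fragilityScore(".product-price"));                // 0
console.log(fragilityScore("div > div > span:nth-child(2)")); // 3
```

A higher score only suggests that a selector depends on document structure; it says nothing about whether the selector is correct.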

Use Data Attributes

Sites often store structured data in attributes like `data-price` or `data-sku`. These values are usually more reliable than visible text, whose formatting varies with currency symbols, separators, and locale.
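
The difference shows up as soon as the extracted value has to be turned into a number. The values below are made up for illustration, and the selector names in the comments are assumptions about what a page might expose.

```javascript
// Illustrative values only, standing in for what two selectors might return.
const fromText = "$1,299.00"; // e.g. from a selector like ".price::text"
const fromAttr = "1299.00";   // e.g. from "[data-price]::attr(data-price)"

// Visible text needs cleanup that silently depends on the number format.
function parsePriceText(text) {
  return Number(text.replace(/[^0-9.]/g, ""));
}

// An attribute value of this form is already machine-readable.
function parsePriceAttr(value) {
  return Number(value);
}

console.log(parsePriceText(fromText));     // 1299
console.log(parsePriceAttr(fromAttr));     // 1299
console.log(parsePriceText("1.299,00 €")); // 1.299 (wrong: European format)
```

The last line shows how a text-based cleanup rule that works for one locale can return a plausible but wrong number for another.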

Test in DevTools First

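Use `document.querySelectorAll('your-selector')` in the browser console to test selectors before using them with Scrpy. Remove Scrpy-specific extensions such as `::text`, `::attr(...)`, and `::all` first: they are not standard CSS, so the browser throws a syntax error when it sees them.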
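
As a convenience, the extensions can be stripped automatically before testing. The sketch below assumes the extension syntax listed in this guide; `toBrowserSelector` and `countMatches` are hypothetical helper names, not part of Scrpy or the browser.

```javascript
// Hypothetical helper: removes trailing Scrpy extraction modifiers so the
// remaining selector is plain CSS that querySelectorAll accepts.
function toBrowserSelector(selector) {
  return selector.replace(/(::(text|html|all|attr\([^)]*\)))+$/, "");
}

// Counts matches under any object that exposes querySelectorAll,
// such as `document` or a single element.
function countMatches(root, scrpySelector) {
  return root.querySelectorAll(toBrowserSelector(scrpySelector)).length;
}

console.log(toBrowserSelector(".gallery img::attr(src)::all")); // .gallery img
// In the browser console:
//   countMatches(document, ".gallery img::attr(src)::all")
```

A count of zero usually means the selector needs adjusting, and a count above one on a field expected to be single-valued suggests the selector is too broad.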

Understanding Selectors