Parsel is a Python library for extracting data from HTML and XML documents using XPath and CSS selectors.


  • Extract text using CSS or XPath selectors
  • Remove elements using CSS or XPath selectors
  • Regular expression helper methods

Example:

>>> from parsel import Selector
>>> sel = Selector(text="""<html>
            <h1>Hello, Parsel!</h1>
            <ul>
                <li><a href="">Link 1</a></li>
                <li><a href="">Link 2</a></li>
            </ul>
        </html>""")
>>> sel.css('h1::text').get()
'Hello, Parsel!'
>>> sel.css('h1::text').re(r'\w+')
['Hello', 'Parsel']
>>> for e in sel.css('ul > li'):
...     print(e.xpath('.//a/@href').get())
