Parsel is a BSD-licensed Python library to extract and remove data from HTML and XML using XPath and CSS selectors, optionally combined with regular expressions.

Find the Parsel online documentation at

Example:

>>> from parsel import Selector
>>> selector = Selector(text=u"""<html>
        <body>
            <h1>Hello, Parsel!</h1>
            <ul>
                <li><a href="">Link 1</a></li>
                <li><a href="">Link 2</a></li>
            </ul>
        </body>
        </html>""")
>>> selector.css('h1::text').get()
'Hello, Parsel!'
>>> selector.xpath('//h1/text()').re(r'\w+')
['Hello', 'Parsel']
>>> for li in selector.css('ul > li'):
...     print(li.xpath('.//@href').get())

