Parsel is a BSD-licensed Python library to extract data from HTML, JSON, and XML documents.

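It supports CSS and XPath expressions for HTML and XML documents, JMESPath expressions for JSON documents, and regular expressions.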

Find the Parsel online documentation at

Example:

>>> from parsel import Selector
>>> text = """
... <html>
...     <body>
...         <h1>Hello, Parsel!</h1>
...         <ul>
...             <li><a href="">Link 1</a></li>
...             <li><a href="">Link 2</a></li>
...         </ul>
...         <script type="application/json">{"a": ["b", "c"]}</script>
...     </body>
... </html>"""
>>> selector = Selector(text=text)
>>> selector.css('h1::text').get()
'Hello, Parsel!'
>>> selector.xpath('//h1/text()').re(r'\w+')
['Hello', 'Parsel']
>>> for li in selector.css('ul > li'):
...     print(li.xpath('.//@href').get())
...
>>> selector.css('script::text').jmespath("a").get()
'b'
>>> selector.css('script::text').jmespath("a").getall()
['b', 'c']

