Saturday, February 13, 2010

Python and CSS Selector

Python has a lot of awesome modules for working with HTML documents. My favorite one is lxml. Last time I wrote about XPath using that module.
Now it is time for CSS selectors, using lxml.cssselect.

Straight to an example of using it:

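import lxml.html
from lxml.cssselect import CSSSelector

# some_web_html_string holds the HTML of the page, fetched beforehand
css_sel = CSSSelector('table.tableSchedule tr')
html_doc = lxml.html.fromstring(some_web_html_string)
row_els = css_sel(html_doc)  # list of matching <tr> elements
print(row_els)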


Sweet and simple.

I am still playing around with it, and so far I have not been able to get a selector that uses the direct-child combinator (">") to work. Something like the following:

css_sel = CSSSelector('table.tableSchedule > tbody > tr > td')


Please leave a comment if you are able to use ">" in a selector.
