remove tags from HTML
July 13, 2016
Problem
You have an HTML string and you want to remove all the tags from it.
Solution
Install the “bleach” package with pip (pip install bleach). Then:
>>> import bleach
>>> html = "Her <h1>name</h1> was <i>Jane</i>."
>>> cleaned = bleach.clean(html, tags=[], attributes={}, strip=True)
>>> html
'Her <h1>name</h1> was <i>Jane</i>.'
>>> cleaned
'Her name was Jane.'
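If you would rather not install a third-party package, here is a rough sketch of the same idea using only the standard library's html.parser module. The names TagStripper and strip_tags are made up for this example and are not part of bleach or of the standard library. Note that this is not a sanitizer: it only collects text nodes, so the contents of <script> and <style> elements come through as plain text, and character references such as &amp; are decoded rather than escaped.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps the text nodes and drops every tag."""

    def __init__(self):
        # convert_charrefs=True decodes references such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of an HTML string with all tags removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


print(strip_tags("Her <h1>name</h1> was <i>Jane</i>."))
```

For the example string above this should print the same result as the bleach version. For untrusted input that will be rendered in a browser, a dedicated sanitizer like bleach is the safer choice.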
Tip from here.
