If you only want the text part of a document or tag, you can use the get_text() method. It returns all the text in a document or beneath a tag, as a single Unicode string:
    markup = '<a href="http://example.com/">\nI linked to <i>example.com</i>\n</a>'
    soup = BeautifulSoup(markup)

    soup.get_text()
    # u'\nI linked to example.com\n'

    soup.i.get_text()
    # u'example.com'
You can specify a string to be used to join the bits of text together:
    soup.get_text("|")
    # u'\nI linked to |example.com|\n'
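To make the effect of the separator easier to see, here is a minimal, self-contained sketch of the calls above. It assumes Python 3 with the `bs4` package installed and names the built-in `"html.parser"` explicitly; both choices are assumptions for illustration and are not taken from the examples above. Under Python 3 the results are ordinary `str` values, so their reprs carry no `u` prefix. The variable names `plain`, `joined`, and `pieces` are hypothetical.

```python
from bs4 import BeautifulSoup

markup = '<a href="http://example.com/">\nI linked to <i>example.com</i>\n</a>'

# Assumption: use the standard-library parser so no extra parser is needed.
soup = BeautifulSoup(markup, "html.parser")

# With no argument, the bits of text are concatenated directly.
plain = soup.get_text()
print(repr(plain))    # '\nI linked to example.com\n'

# With a separator, the same bits are joined by that string.
joined = soup.get_text("|")
print(repr(joined))   # '\nI linked to |example.com|\n'

# Splitting on the separator recovers the individual bits of text,
# including the whitespace-only string after the <i> tag.
pieces = joined.split("|")
print(pieces)         # ['\nI linked to ', 'example.com', '\n']
```

Splitting on the separator only round-trips cleanly when the separator does not occur in the document text itself, so a character that is unlikely to appear in the content is the safer choice.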
You can tell Beautiful Soup to strip whitespace from the beginning and end of each bit of text: