BeautifulSoup: getting contents[] as a single string

Question

Does anyone know an elegant way to get the entire contents of a soup object as a single string?

At the moment I get contents, which is of course a list, and then iterate over it:

    notices = soup.find("div", {"class" : "middlecontent"})
    con = ""
    for content in notices.contents:
        con += str(content)
    print con

Thanks!

Fábio Diniz · Accepted Answer

How about contents = str(notices)?

Or contents = notices.renderContents(), which leaves out the enclosing div tag.
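To make the difference between the two calls concrete, here is a minimal sketch. It assumes Python 3 and the bs4 package (BeautifulSoup 4) rather than the BeautifulSoup 3 module the thread was written against, and the sample HTML is made up. In bs4, decode_contents() returns the children as a str; as far as I know, renderContents() survives there only as a legacy method that returns bytes.

```python
from bs4 import BeautifulSoup

# Made-up sample markup standing in for the real page.
html = '<div class="middlecontent"><p>First</p><p>Second</p></div>'
soup = BeautifulSoup(html, "html.parser")
notices = soup.find("div", {"class": "middlecontent"})

# str() serializes the tag itself, so the enclosing <div> is included.
with_wrapper = str(notices)

# decode_contents() serializes only the children, without the <div>.
inner_only = notices.decode_contents()

print(with_wrapper)
print(inner_only)
```

Which one to pick depends on whether the wrapper element is wanted in the output.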

Frédéric Hamidi · Answer

You can use the join() method:

    notices = soup.find("div", {"class": "middlecontent"})
    contents = "".join([str(item) for item in notices.contents])

Or use a generator expression:

contents = "".join(str(item) for item in notices.contents)
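A small runnable sketch of the join approach, again assuming Python 3 with bs4 and an invented HTML snippet. It also shows what .contents holds: the direct children only, with text nodes and tags mixed together.

```python
from bs4 import BeautifulSoup

# Made-up sample markup: a text node, a <b> tag, and another text node.
html = '<div class="middlecontent">Hello <b>world</b>!</div>'
soup = BeautifulSoup(html, "html.parser")
notices = soup.find("div", {"class": "middlecontent"})

# .contents lists the direct children; nested tags stay inside their parent.
children = notices.contents

# str() on each child yields its markup (or its text, for a text node).
contents = "".join(str(item) for item in children)

print(len(children))
print(contents)
```

One caveat worth knowing: calling str() on a text node gives the unescaped text, so characters such as & are not re-escaped the way a serializer would escape them.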

Spouk · Answer

    #!/usr/bin/env python
    # coding: utf-8
    __author__ = 'spouk'

    import BeautifulSoup
    import requests


    def parse_contents_href(url, url_args=None, check_content_find=None, tag='a'):
        """
        parse href contents url and find some text in href contents [ for example ]
        """
        html = requests.get(url, params=url_args)
        page = BeautifulSoup.BeautifulSoup(html.text)
        alllinks = page.findAll(tag, href=True)
        result = check_content_find and filter(
            lambda x: check_content_find in x['href'], alllinks) or alllinks
        return result and "".join(map(str, result)) or False

    url = 'https://vk.com/postnauka'
    print parse_contents_href(url)
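For readers who want to try the same idea without a network request, here is a rough offline sketch. It assumes Python 3 and bs4; the function name join_links, its needle parameter, and the sample markup are all hypothetical and not part of the answer above. Unlike the original, it returns an empty string rather than False when nothing matches.

```python
from bs4 import BeautifulSoup


def join_links(html, needle=None, tag="a"):
    """Concatenate all `tag` elements that carry an href into one string.

    If `needle` is given, keep only elements whose href contains it.
    """
    page = BeautifulSoup(html, "html.parser")
    links = page.find_all(tag, href=True)
    if needle is not None:
        links = [link for link in links if needle in link["href"]]
    return "".join(str(link) for link in links)


# Made-up markup: two links with an href and one anchor without.
sample = '<p><a href="/postnauka/a">A</a> <a href="/other">B</a> <a name="x">C</a></p>'
all_links = join_links(sample)
filtered = join_links(sample, needle="postnauka")

print(all_links)
print(filtered)
```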

zjk · Answer

But the contents list is recursive (child tags have contents of their own), so ... I think this will do the job.
I'm new to Python, so the code may look a bit odd.

    getString = lambda x: \
        x if type(x).__name__ == 'NavigableString' \
        else "".join( \
            getString(t) for t in x)
    contents = getString(notices)
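Note that this recursion keeps only the text nodes, so the tags themselves are dropped from the result. A sketch of the same idea under the assumption of Python 3 and bs4, using isinstance instead of comparing class names (the helper name get_string and the sample HTML are made up):

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString


def get_string(node):
    """Recursively concatenate the text nodes under `node`, dropping tags."""
    if isinstance(node, NavigableString):
        return str(node)
    # A tag: descend into each direct child and join the pieces.
    return "".join(get_string(child) for child in node.children)


# Made-up sample markup with two levels of nesting.
html = '<div class="middlecontent">Hello <b>big <i>wide</i></b> world</div>'
soup = BeautifulSoup(html, "html.parser")
notices = soup.find("div", {"class": "middlecontent"})

text = get_string(notices)
print(text)
```

For this sample the result matches what bs4's built-in notices.get_text() returns, which is probably the simpler choice when only the text is needed.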