BeautifulSoup: object of type 'Response' has no len()

Question

Problem: when I try to run my script, BeautifulSoup(html, ...) fails with the error message "TypeError: object of type 'Response' has no len()" and the script does not work.

import requests
from bs4 import BeautifulSoup  # import needed for the BeautifulSoup call below
url = 'http://vineoftheday.com/?order_by=rating'
response = requests.get(url)
html = response.content
soup = BeautifulSoup(html, "html.parser")

Matvei Nazaruk · Accepted Answer

You are reading response.content, which returns the response body as bytes (see the Requests docs). The BeautifulSoup constructor, however, should be given a str (see the BeautifulSoup docs). So use response.text instead of response.content.
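To make the difference concrete, here is a minimal offline sketch. FakeResponse is a hypothetical stand-in written only for illustration; it is not the real requests.Response class and merely mimics its content (bytes) and text (str) attributes. The sketch also shows that calling len() on such an object produces the same wording as the error in the question, which suggests that the parser was handed an object whose length it could not take.

```python
class FakeResponse:
    """Hypothetical stand-in for requests.Response (illustration only)."""

    def __init__(self, body: bytes, encoding: str = "utf-8"):
        self.content = body          # raw body as bytes
        self.encoding = encoding     # encoding used to decode the body

    @property
    def text(self) -> str:
        # Decoded body as str, similar in spirit to Response.text
        return self.content.decode(self.encoding)


response = FakeResponse(b"<html><body><p>hello</p></body></html>")

# The object itself defines no __len__, so len() raises TypeError.
try:
    len(response)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The body attributes are ordinary bytes / str values and do have a length.
content_type = type(response.content).__name__
text_type = type(response.text).__name__
text_length = len(response.text)

print(error_message)
print(content_type, text_type, text_length)
```

With a real Response object the same idea applies: pass the decoded body to the parser, not the Response object itself.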

Jorge · Answer

Try passing the HTML text directly:

soup = BeautifulSoup(html.text)

Moshe G · Answer

If you are fetching the HTML with requests.get('https://example.com'), you should use requests.get('https://example.com').text.
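A slightly fuller sketch of the same idea, assuming the requests and beautifulsoup4 packages are installed. The helper names parse_html and fetch_soup, and the 10-second timeout, are assumptions made for this example and are not part of either library. Parsing is kept separate from downloading so that the parsing step can be tried without network access.

```python
import requests
from bs4 import BeautifulSoup


def parse_html(html_text: str) -> BeautifulSoup:
    """Parse an HTML string with the built-in html.parser backend."""
    return BeautifulSoup(html_text, "html.parser")


def fetch_soup(url: str, timeout: float = 10.0) -> BeautifulSoup:
    """Download a page and parse it (hypothetical helper for this example)."""
    response = requests.get(url, timeout=timeout)
    # Raise on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    # Pass the decoded body (str), not the Response object itself.
    return parse_html(response.text)


# Offline check of the parsing step with a literal HTML string.
sample = parse_html(
    "<html><head><title>Example</title></head><body><p>hi</p></body></html>"
)
title_text = sample.title.string
paragraph_text = sample.find("p").get_text()
print(title_text, paragraph_text)
```

Calling fetch_soup('https://example.com') would then return a parsed document, provided the network request succeeds.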

Atul · Answer

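In 'response' you are only getting the response code. Also, always send browser headers for security; otherwise you will run into many problems.

You can find the header in the browser's debugger console: Network section, 'Headers', User-Agent.

Try this: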

import requests
from bs4 import BeautifulSoup
url = 'http://www.google.com'
# Browser-like User-Agent string, copied from the browser's network inspector
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
# .text returns the decoded response body as a str
response = requests.get(url, headers=headers).text
soup = BeautifulSoup(response, 'html.parser')
print(soup.prettify())

Ozcar Nguyen · Answer

This worked for me:

soup = BeautifulSoup(requests.get("your_url").text)

Nowadays, the code below is better (it uses the lxml parser):

import requests
from bs4 import BeautifulSoup
# The 'lxml' parser requires the lxml package to be installed
soup = BeautifulSoup(requests.get("your_url").text, 'lxml')