Save the HTML of some website in a txt file with Python

Question

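I need to save the HTML code of a website in a txt file. This is a very simple exercise, but I have doubts about it because I already have a function that is supposed to do this: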

    import urllib.request

    def get_html(url):
        f = open('htmlcode.txt', 'w')
        page = urllib.request.urlopen(url)
        pagetext = page.read()
        ## Save the html and later save in the file
        f.write(pagetext)
        f.close()

But this does not work.

elyase · Accepted Answer

The easiest way is to use urlretrieve:

    import urllib
    urllib.urlretrieve("http://www.example.com/test.html", "test.txt")

For Python 3.x, the code is:

    import urllib.request
    urllib.request.urlretrieve("http://www.example.com/test.html", "test.txt")
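As an aside, the function from the question can probably be kept with a small change. The sketch below assumes Python 3 and assumes the failure was the TypeError raised when bytes are written to a file opened in text mode ('w'), since urlopen(...).read() returns bytes. The question does not show the actual error, so treat this as one likely explanation. The path parameter is a hypothetical addition for illustration; the original hard-codes 'htmlcode.txt'.

```python
import urllib.request


def get_html(url, path="htmlcode.txt"):
    """Download the resource at `url` and save its raw bytes to `path`.

    `path` is a hypothetical parameter added for this sketch; the
    question's version always writes to 'htmlcode.txt'.
    """
    # In Python 3, read() on the response returns bytes, not str.
    with urllib.request.urlopen(url) as page:
        pagetext = page.read()

    # Binary mode ("wb") accepts bytes directly, so no decoding is needed
    # and the page is stored exactly as the server sent it.
    with open(path, "wb") as f:
        f.write(pagetext)
```

Writing the bytes unchanged avoids having to guess the page's character encoding. The with blocks also close the response and the file even if an exception is raised.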

Serhii · Answer

I use Python 3.
After installing the requests library with pip install requests, you can save a web page to a txt file:

    import requests

    url = "https://stackoverflow.com/questions/24297257/save-html-of-some-website-in-a-txt-file-with-python"
    r = requests.get(url)
    with open('file.txt', 'w') as file:
        file.write(r.text)
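One caveat with the requests approach: open('file.txt', 'w') without an encoding argument uses the platform's default encoding, which on some systems cannot represent every character in r.text. A minimal sketch of a more defensive variant follows. The function name save_page, the UTF-8 default, and the 10-second timeout are assumptions made for illustration and are not part of the answer above.

```python
import requests


def save_page(url, path, encoding="utf-8"):
    """Fetch `url` with requests and write the decoded text to `path`.

    `save_page`, the UTF-8 default, and the timeout value are
    illustrative choices, not something the original answer specifies.
    """
    r = requests.get(url, timeout=10)
    # Raise an exception on 4xx/5xx responses instead of saving an error page.
    r.raise_for_status()

    # An explicit encoding keeps the output independent of the platform default.
    with open(path, "w", encoding=encoding) as f:
        f.write(r.text)
```

It would be called as save_page(url, 'file.txt'). If the exact bytes sent by the server are wanted instead, writing r.content to a file opened with 'wb' is another option.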