How to download a PDF file over HTTPS using Python

Question

I am writing a Python script that saves a PDF file locally, following the format given in the URL:

https://Hostname/saveReport/file_name.pdf #saves the content in PDF file.

I am opening this URL from a Python script:

    import webbrowser
    webbrowser.open("https://Hostname/saveReport/file_name.pdf")

The URL contains a lot of images and text. Once this URL is opened, I want to save the file in PDF format using a Python script.

This is what I have done so far.
Code 1:

    import requests
    url = "https://Hostname/saveReport/file_name.pdf"  # Note: It's https
    r = requests.get(url, auth=('usrname', 'password'), verify=False)
    file = open("file_name.pdf", 'w')
    file.write(r.read())
    file.close()

Code 2:

    import urllib2
    import ssl
    url = "https://Hostname/saveReport/file_name.pdf"
    context = ssl._create_unverified_context()
    response = urllib2.urlopen(url, context=context)  # How should I pass authorization details here?
    html = response.read()

With the code above I get: urllib2.HTTPError: HTTP Error 401: Unauthorized

If I use Code 2, how can I pass the authorization details?

Joran Beasley · Accepted Answer

I think this will work:

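    import shutil
    import requests
    url = "https://Hostname/saveReport/file_name.pdf"  # Note: It's https
    # stream=True defers fetching the body so it can be copied in chunks
    r = requests.get(url, auth=('usrname', 'password'), verify=False, stream=True)
    r.raw.decode_content = True  # decode gzip/deflate content encoding, if any
    with open("file_name.pdf", 'wb') as f:
        shutil.copyfileobj(r.raw, f)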
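The two details that matter in the snippet above are the binary mode ('wb') and the chunked copy via shutil.copyfileobj, which avoids holding the whole PDF in memory. Here is a minimal offline sketch of the same pattern, with an in-memory io.BytesIO standing in for r.raw. The helper name save_stream and the sample bytes are made up for illustration and are not part of the answer.

```python
import io
import os
import shutil
import tempfile


def save_stream(stream, path):
    """Copy a binary file-like object to path in chunks; return bytes written."""
    # 'wb' matters: the payload is bytes, and text mode ('w') can fail or corrupt it.
    with open(path, "wb") as f:
        shutil.copyfileobj(stream, f)
    return os.path.getsize(path)


# Stand-in for r.raw: any object whose read() returns bytes works here.
fake_pdf = io.BytesIO(b"%PDF-1.4\nillustrative bytes, not a real PDF\n")
target = os.path.join(tempfile.mkdtemp(), "file_name.pdf")
size = save_stream(fake_pdf, target)
```

With a real response, r.raw would be passed in place of fake_pdf.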

baji · Answer

Here is one way to do it:

    import urllib3
    urllib3.disable_warnings()
    url = r"https://websitewithfile.com/file.pdf"
    fileName = r"file.pdf"
    with urllib3.PoolManager() as http:
        r = http.request('GET', url)
        with open(fileName, 'wb') as fout:
            fout.write(r.data)
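The question also asked how to pass credentials with the urllib approach from Code 2. Here is a rough sketch using only the standard library, under two assumptions: Python 3 (where urllib2 became urllib.request), and a server that accepts HTTP Basic authentication, which is what the auth=(user, password) tuple in requests sends. The function names are hypothetical, and the host and credentials are the placeholders from the question.

```python
import base64
import ssl
import urllib.request


def build_authenticated_request(url, username, password):
    """Return a Request carrying an HTTP Basic Authorization header."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


def download(url, username, password, path):
    """Fetch url with Basic auth and write the response body to path as bytes."""
    # Mirrors the question's unverified context; this disables certificate checks.
    context = ssl._create_unverified_context()
    request = build_authenticated_request(url, username, password)
    with urllib.request.urlopen(request, context=context) as response:
        with open(path, "wb") as f:
            f.write(response.read())


# Building the request does not touch the network; download() would.
request = build_authenticated_request(
    "https://Hostname/saveReport/file_name.pdf", "usrname", "password"
)
```

If the server uses a different scheme (Digest, a session cookie, a token), a 401 will still come back and the header would need to change accordingly.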

Peter Zagubisalo · Answer

For some files, at least tar archives (and possibly any other file), you can use pip:

    import sys
    from subprocess import call, run, PIPE
    url = "https://blabla.bla/foo.tar.gz"
    call([sys.executable, "-m", "pip", "download", url], stdout=PIPE, stderr=PIPE)

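However, pip raises an error for any file that is not an archive containing setup.py, so you should confirm by some other means that the download succeeded. That is the reason for stderr=PIPE (alternatively, you may be able to tell whether the download succeeded by parsing the subprocess's error message).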
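One way to act on that advice is to keep the exit code and the stderr text instead of discarding them. The sketch below assumes Python 3.7+; run_command and pip_download are hypothetical helper names. As noted above, a nonzero exit code from pip does not by itself prove the file was not fetched, so how to interpret the result is left to the caller.

```python
import subprocess
import sys


def run_command(cmd):
    """Run cmd and return (succeeded, stderr_text) without raising on failure."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    return result.returncode == 0, result.stderr


def pip_download(url):
    """Invoke `pip download <url>` with the current interpreter."""
    return run_command([sys.executable, "-m", "pip", "download", url])


# Offline demonstration of the helper with a command that fails on purpose.
ok, err = run_command(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
)
```

A caller could then inspect err for pip's message, or check the working directory for the expected file name, before deciding whether the download worked.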