Python equivalent of a given wget command

Question

I'm trying to create a Python function that does the same thing as this wget command:

wget -c --read-timeout=5 --tries=0 "$URL"

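-c: If the download is interrupted, continue from where it left off.

--read-timeout=5: If no new data arrives for more than 5 seconds, give up and try again. Combined with -c, this means the retry resumes from where the download left off.

--tries=0: Retry forever.

Used in tandem, these three arguments produce a download that cannot fail.

I want to replicate these features in a Python script, but I don't know where to begin...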

Eugene K · Accepted Answer

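urllib.request should work. Set it up in a while (not done) loop and check whether a local file already exists. If it does, send a GET with a Range header specifying how far the local file has already been downloaded. Be sure to use read() to append to the local file until an error occurs.

This is also potentially a duplicate of "Python urllib2 resume download doesn't work when network reconnects".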
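The loop described above could be sketched roughly as follows. This is a minimal sketch using only the standard library, not a tested drop-in replacement for wget; the function name resume_download and its parameters (dest, chunk_size, retry_delay) are made up for illustration, and it assumes Python 3.9 or later and a server that answers Range requests with 206 Partial Content.

```python
import http.client
import os
import time
import urllib.error
import urllib.request


def resume_download(url, dest, read_timeout=5, chunk_size=64 * 1024, retry_delay=1.0):
    """Fetch url into dest, resuming from the bytes already on disk until done."""
    while True:
        # Whatever is already on disk decides where the next request starts.
        offset = os.path.getsize(dest) if os.path.exists(dest) else 0
        request = urllib.request.Request(url)
        if offset:
            request.add_header("Range", f"bytes={offset}-")
        try:
            # The timeout covers blocking socket operations (connect and read),
            # so a stalled transfer raises after read_timeout seconds.
            with urllib.request.urlopen(request, timeout=read_timeout) as response:
                # 206 means the server honoured Range; anything else means it
                # is sending the whole body, so the local file is rewritten.
                resumed = response.status == 206
                length = response.headers.get("Content-Length")
                expected = None
                if length is not None:
                    expected = int(length) + (offset if resumed else 0)
                with open(dest, "ab" if resumed else "wb") as out:
                    while True:
                        chunk = response.read(chunk_size)
                        if not chunk:
                            break
                        out.write(chunk)
            # Without a Content-Length there is nothing to compare against,
            # so a clean end of stream is treated as completion.
            if expected is None or os.path.getsize(dest) >= expected:
                return dest
        except urllib.error.HTTPError as err:
            if err.code == 416:
                # Range starts past the end: assume the file is already complete.
                return dest
            if 400 <= err.code < 500:
                raise  # e.g. 404: retrying will not help
        except (OSError, http.client.HTTPException):
            pass  # timeout, dropped connection, DNS failure: try again
        time.sleep(retry_delay)
```

Calling resume_download(url, "file.bin") then loops until the file is complete, which roughly mirrors --tries=0. Unlike wget, this sketch treats 4xx responses other than 416 as fatal and re-raises them.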

Blairg23 · Answer

There is also a nice Python module named wget that is very easy to use. It can be found on PyPI.

This demonstrates the simplicity of the design:

    >>> import wget
    >>> url = 'http://www.futurecrew.com/skaven/song_files/mp3/razorback.mp3'
    >>> filename = wget.download(url)
    100% [................................................] 3841532 / 3841532
    >>> filename
    'razorback.mp3'

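Enjoy.

However, if wget doesn't work (I've had trouble with certain PDF files), try this solution.

Edit: You can also use the out parameter to specify a custom output directory instead of the current working directory.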

    >>> output_directory = <directory_name>
    >>> filename = wget.download(url, out=output_directory)
    >>> filename
    'razorback.mp3'

Pujan Srivastava · Answer

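    import urllib.error
    import urllib.request

    attempts = 0
    while attempts < 3:
        try:
            # Give up on a stalled connection after 5 seconds.
            response = urllib.request.urlopen("http://example.com", timeout=5)
            content = response.read()
            # response.read() returns bytes, so write in binary mode.
            with open("local/index.html", "wb") as f:
                f.write(content)
            break
        except urllib.error.URLError as e:
            # Count the failed attempt and try again, up to three times.
            attempts += 1
            print(type(e))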

Will Charlton · Answer

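I had to do something like this on a version of Linux whose wget did not have the right options compiled in. This example downloads the memory analysis tool "guppy". I'm not sure whether it matters, but I kept the target file's name the same as the URL target name...

Here's what I came up with: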

python -c "import requests; r = requests.get('https://pypi.python.org/packages/source/g/guppy/guppy-0.1.10.tar.gz') ; open('guppy-0.1.10.tar.gz' , 'wb').write(r.content)"

That's the one-liner. Here it is in a slightly more readable form:

    import requests

    fname = 'guppy-0.1.10.tar.gz'
    url = 'https://pypi.python.org/packages/source/g/guppy/' + fname
    r = requests.get(url)
    open(fname, 'wb').write(r.content)

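This worked for downloading a tarball. I was able to extract the package after downloading it.

Edit:

To address a question, here is an implementation with a progress bar printed to STDOUT. There is probably a more portable way to do this without the clint package, but this was tested on my machine and works fine: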

    #!/usr/bin/env python

    from clint.textui import progress
    import requests

    fname = 'guppy-0.1.10.tar.gz'
    url = 'https://pypi.python.org/packages/source/g/guppy/' + fname

    r = requests.get(url, stream=True)
    with open(fname, 'wb') as f:
        total_length = int(r.headers.get('content-length'))
        for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length/1024) + 1):
            if chunk:
                f.write(chunk)
                f.flush()
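As for a more portable way without clint, one possibility is to draw the bar by hand with the standard library. The following is only a sketch: format_progress and write_with_progress are hypothetical helper names, and the bar style is arbitrary.

```python
import sys


def format_progress(done, total, width=40):
    """Render a text bar such as '[#####.....] 50/100 bytes'."""
    fraction = min(done / total, 1.0) if total else 0.0
    filled = int(width * fraction)
    return f"[{'#' * filled}{'.' * (width - filled)}] {done}/{total} bytes"


def write_with_progress(chunks, out, total, stream=sys.stdout):
    """Write each chunk to out and redraw the bar on one line of stream."""
    done = 0
    for chunk in chunks:
        out.write(chunk)
        done += len(chunk)
        # The carriage return moves the cursor back so the bar overwrites itself.
        stream.write("\r" + format_progress(done, total))
        stream.flush()
    stream.write("\n")
    return done
```

With the r, f and total_length from the snippet above, the inner loop could then become write_with_progress(r.iter_content(chunk_size=1024), f, total_length).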

Yohan Obadia · Answer

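A solution that I often find simpler and more robust is to run the terminal command from within Python. In your case:

    import os

    url = 'https://www.someurl.com'
    # Single-quoted f-string so the double quotes around the URL reach the shell intact.
    os.system(f'wget -c --read-timeout=5 --tries=0 "{url}"')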
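A variant of the same idea, sketched with subprocess.run and an argument list instead of a shell string, avoids quoting problems when the URL contains characters the shell would interpret. The helper names build_wget_command and wget_download are made up for illustration, and the sketch assumes wget is installed and on the PATH.

```python
import subprocess


def build_wget_command(url, read_timeout=5, tries=0):
    """Return the argv list for the wget invocation from the question."""
    return ["wget", "-c", f"--read-timeout={read_timeout}", f"--tries={tries}", url]


def wget_download(url):
    """Run wget without a shell; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_wget_command(url), check=True)
```

Because no shell is involved, the URL is passed to wget as a single argument exactly as given.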

pd shah · Answer

Easy as py:

    import subprocess


    class Downloder:
        def download_manager(self, url, destination='Files/DownloderApp/',
                             try_number="10", time_out="60"):
            # To download in the background instead:
            # threading.Thread(target=self._wget_dl,
            #                  args=(url, destination, try_number, time_out)).start()
            # wget exits with status 0 on a successful download.
            return self._wget_dl(url, destination, try_number, time_out) == 0

        def _wget_dl(self, url, destination, try_number, time_out):
            command = ["wget", "-c", "-P", destination,
                       "-t", try_number, "-T", time_out, url]
            try:
                return subprocess.call(command)
            except Exception as e:
                # e.g. wget is not installed; report it as a failed download.
                print(e)
                return 1

Te ENe Te · Answer

Let me improve on the example by adding threads, in case you want to download many files.

    import math
    import random
    import threading

    import requests
    from clint.textui import progress

    # You must define a proxy list.
    # I suggest https://free-proxy-list.net/
    proxies = {
        0: {'http': 'http://34.208.47.183:80'},
        1: {'http': 'http://40.69.191.149:3128'},
        2: {'http': 'http://104.154.205.214:1080'},
        3: {'http': 'http://52.11.190.64:3128'}
    }

    # You must define the list of files you want to download.
    videos = [
        "https://i.stack.imgur.com/g2BHi.jpg",
        "https://i.stack.imgur.com/NURaP.jpg"
    ]

    downloaderses = list()


    def downloaders(video, selected_proxy):
        print("Downloading file named {} by proxy {}...".format(video, selected_proxy))
        r = requests.get(video, stream=True, proxies=selected_proxy)
        nombre_video = video.split("/")[3]
        with open(nombre_video, 'wb') as f:
            total_length = int(r.headers.get('content-length'))
            for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length / 1024) + 1):
                if chunk:
                    f.write(chunk)
                    f.flush()


    for video in videos:
        selected_proxy = proxies[math.floor(random.random() * len(proxies))]
        t = threading.Thread(target=downloaders, args=(video, selected_proxy))
        downloaderses.append(t)

    for _downloaders in downloaderses:
        _downloaders.start()
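The example above starts one thread per file and never waits for them or collects errors. If that matters, a possible alternative is a bounded thread pool from the standard library. This is only a sketch: download_all is a hypothetical helper, and fetch stands for any callable that downloads a single URL (for instance a wrapper around the downloaders function above).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, fetch, max_workers=4):
    """Run fetch(url) for every URL on a bounded pool and collect the outcomes."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Keep the exception so one failed download does not hide the rest.
                results[url] = exc
    return results
```

The returned dictionary maps each URL either to whatever fetch returned or to the exception it raised, so failed downloads can be retried afterwards.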