Download and decompress a gzipped file in memory?

Question

I would like to download a file using urllib and decompress it in memory before saving it.

This is what I have right now:

    response = urllib2.urlopen(baseURL + filename)
    compressedFile = StringIO.StringIO()
    compressedFile.write(response.read())
    decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
    outfile = open(outFilePath, 'w')
    outfile.write(decompressedFile.read())

This ends up writing an empty file. How can I achieve what I'm after?

Updated answer:

    #! /usr/bin/env python2
    import urllib2
    import StringIO
    import gzip

    baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
    # check filename: it may change over time, due to new updates
    filename = "man-pages-5.00.tar.gz"
    outFilePath = filename[:-3]

    response = urllib2.urlopen(baseURL + filename)
    # Passing the data to the constructor leaves the position at the start.
    compressedFile = StringIO.StringIO(response.read())
    decompressedFile = gzip.GzipFile(fileobj=compressedFile)

    # 'wb' so the decompressed bytes are written unchanged on every platform.
    with open(outFilePath, 'wb') as outfile:
        outfile.write(decompressedFile.read())

crayzeewulf · Accepted Answer

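You need to seek to the beginning of compressedFile after writing to it but before passing it to gzip.GzipFile(). Otherwise the gzip module starts reading at the end of the buffer, where there is nothing left to read, so it sees an empty file. See below: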

    #! /usr/bin/env python
    import urllib2
    import StringIO
    import gzip

    baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
    filename = "man-pages-3.34.tar.gz"
    outFilePath = "man-pages-3.34.tar"

    response = urllib2.urlopen(baseURL + filename)
    compressedFile = StringIO.StringIO()
    compressedFile.write(response.read())
    #
    # Set the file's current position to the beginning
    # of the file so that gzip.GzipFile can read
    # its contents from the top.
    #
    compressedFile.seek(0)

    decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')

    # 'wb' so the decompressed bytes are written unchanged on every platform.
    with open(outFilePath, 'wb') as outfile:
        outfile.write(decompressedFile.read())
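
To see why the seek matters without touching the network, here is a minimal Python 3 sketch. It uses io.BytesIO in place of the Python 2 StringIO.StringIO, and the payload bytes are made up for illustration. It shows that a buffer you have just written to is positioned at its end, so gzip reads nothing until you rewind, whereas a buffer built from initial bytes starts at position 0.

```python
import gzip
import io

# Made-up sample data standing in for the downloaded file.
payload = b"hello, gzip\n" * 3
compressed = gzip.compress(payload)

# Writing into an empty buffer leaves the position at the end.
buf = io.BytesIO()
buf.write(compressed)
position_after_write = buf.tell()

# Reading from the end: gzip finds no data and returns empty bytes.
at_end = gzip.GzipFile(fileobj=buf, mode='rb').read()

# After rewinding, the same buffer decompresses correctly.
buf.seek(0)
rewound = gzip.GzipFile(fileobj=buf, mode='rb').read()

# A buffer created from initial bytes starts at position 0,
# so no explicit seek is needed in that case.
prefilled = gzip.GzipFile(fileobj=io.BytesIO(compressed)).read()

print(position_after_write == len(compressed))
print(len(at_end), rewound == payload, prefilled == payload)
```

This is also why the variants that pass response.read() straight to the buffer's constructor work without calling seek(0).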

lyschoening · Answer

If you are using Python 3, the equivalent answer is:

    import urllib.request
    import io
    import gzip

    response = urllib.request.urlopen(FILE_URL)
    compressed_file = io.BytesIO(response.read())
    decompressed_file = gzip.GzipFile(fileobj=compressed_file)

    with open(OUTFILE_PATH, 'wb') as outfile:
        outfile.write(decompressed_file.read())

Chih-Hsuan Yen · Answer

With Python 3.2 or later, life is much easier:

    #!/usr/bin/env python3
    import gzip
    import urllib.request

    baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
    filename = "man-pages-4.03.tar.gz"
    outFilePath = filename[:-3]

    response = urllib.request.urlopen(baseURL + filename)
    with open(outFilePath, 'wb') as outfile:
        outfile.write(gzip.decompress(response.read()))

For those interested in the history, see https://bugs.python.org/issue3488 and https://hg.python.org/cpython/rev/3fa0a9553402.
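
One thing to keep in mind with gzip.decompress(response.read()): both the compressed and the decompressed data sit in memory at the same time. If the file is large, a streaming variant may be preferable. The following is only a sketch under assumptions: gunzip_stream and its chunk_size parameter are hypothetical names, not part of any library, and the example feeds it an in-memory buffer rather than a real download. I'd expect the response object returned by urllib.request.urlopen to work as src in Python 3, but treat that as an assumption to verify.

```python
import gzip
import io
import shutil


def gunzip_stream(src, dst, chunk_size=64 * 1024):
    """Decompress gzip data from src into dst, chunk_size bytes at a time.

    src and dst are binary file-like objects. This helper is hypothetical,
    written only to illustrate the streaming approach.
    """
    with gzip.GzipFile(fileobj=src, mode='rb') as decompressed:
        shutil.copyfileobj(decompressed, dst, chunk_size)


# Self-contained demonstration with made-up data instead of a download.
original = b"some text that compresses well\n" * 1000
src = io.BytesIO(gzip.compress(original))
dst = io.BytesIO()
gunzip_stream(src, dst)
print(dst.getvalue() == original)
```

In real use, dst would be a file opened with mode 'wb', as in the answers above.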