Better way to convert file sizes in Python

Question

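I am using a library that reads a file and returns its size in bytes.

This file size is then displayed to the end user. To make it easier for them to understand, I explicitly convert the file size to MB by dividing it by 1024.0 * 1024.0. Of course this works, but I am wondering whether there is a better way to do this in Python.

By better, I mean perhaps a stdlib function that can manipulate sizes according to the unit I want. For example, if I specify MB, it automatically divides the size by 1024.0 * 1024.0. Something along these lines.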

Lennart Regebro · Accepted Answer

There is hurry.filesize, which takes a size in bytes and makes a nice string out of it.

    >>> from hurry.filesize import size
    >>> size(11000)
    '10K'
    >>> size(198283722)
    '189M'

Or, if you want 1K == 1000 (which is what most users assume):

    >>> from hurry.filesize import size, si
    >>> size(11000, system=si)
    '11K'
    >>> size(198283722, system=si)
    '198M'
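To see where the two sets of results come from, the arithmetic can be reproduced without the library. This is only a sketch of the division involved, not a description of how hurry.filesize is implemented; the helper name whole_units is made up for illustration.

```python
def whole_units(size_in_bytes, base, exponent):
    """Return how many whole units of base ** exponent fit into size_in_bytes."""
    return size_in_bytes // base ** exponent

n = 198283722
print(whole_units(n, 1024, 2))  # megabytes when 1K == 1024
print(whole_units(n, 1000, 2))  # megabytes when 1K == 1000
```

The two printed values correspond to the '189M' and '198M' results shown above.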

It has IEC support as well (although that was not documented):

    >>> from hurry.filesize import size, iec
    >>> size(11000, system=iec)
    '10Ki'
    >>> size(198283722, system=iec)
    '189Mi'

Because it is written by the Awesome Martijn Faassen, the code is small, clear and extensible. Writing your own systems is dead easy.

Here is one:

    mysystem = [
        (1024 ** 5, ' Megamanys'),
        (1024 ** 4, ' Lotses'),
        (1024 ** 3, ' Tons'),
        (1024 ** 2, ' Heaps'),
        (1024 ** 1, ' Bunches'),
        (1024 ** 0, ' Thingies'),
    ]

Used like so:

    >>> from hurry.filesize import size
    >>> size(11000, system=mysystem)
    '10 Bunches'
    >>> size(198283722, system=mysystem)
    '189 Heaps'

James Sapam · Answer

Here is what I use:

    import math

    def convert_size(size_bytes):
        if size_bytes == 0:
            return "0B"
        size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
        # i is the largest power of 1024 that fits into size_bytes
        i = int(math.floor(math.log(size_bytes, 1024)))
        p = math.pow(1024, i)
        s = round(size_bytes / p, 2)
        return "%s %s" % (s, size_name[i])

Note: the size must be passed in bytes.
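The index i in convert_size is the largest power of 1024 that fits into the size. As a sketch, the same index can be derived with integer arithmetic only, which sidesteps any floating-point rounding in math.log. The helper unit_index is hypothetical and not part of the answer above; it assumes a positive int.

```python
UNITS = ("B", "KB", "MB", "GB", "TB")

def unit_index(size_bytes):
    """Largest i such that 1024 ** i <= size_bytes (size_bytes must be a positive int)."""
    # bit_length() - 1 is floor(log2(n)); each unit step is worth 10 bits.
    return (size_bytes.bit_length() - 1) // 10

for size in (500, 1024, 5 * 1024 ** 2):
    i = unit_index(size)
    print(size, "->", round(size / 1024 ** i, 2), UNITS[i])
```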

ccpizza · Answer

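Instead of a size divisor of 1024 * 1024 you could use the << bitwise shift operator, i.e. 1<<20 to get megabytes, 1<<30 to get gigabytes, etc.

In the simplest scenario you can have, for example, a constant MBFACTOR = float(1<<20), which can then be used with bytes, i.e. megas = size_in_bytes/MBFACTOR.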
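As a quick illustration of the shift-based factors, here is a minimal sketch. The constant names KB, MB and GB and the sample size are chosen only for this example.

```python
KB = 1 << 10  # 1024
MB = 1 << 20  # 1024 * 1024
GB = 1 << 30  # 1024 * 1024 * 1024

size_in_bytes = 5 * 1024 * 1024

megas = size_in_bytes / MB         # true division gives a float in Python 3
whole_megas = size_in_bytes >> 20  # shifting right gives whole megabytes as an int
print(megas, whole_megas)
```

Dividing by the factor keeps the fractional part, while shifting the size itself to the right discards it.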

Megabytes are usually all that you need; otherwise something like this can be used:

    # bytes pretty-printing
    UNITS_MAPPING = [
        (1<<50, ' PB'),
        (1<<40, ' TB'),
        (1<<30, ' GB'),
        (1<<20, ' MB'),
        (1<<10, ' KB'),
        (1, (' byte', ' bytes')),
    ]

    def pretty_size(bytes, units=UNITS_MAPPING):
        """Get human-readable file sizes.
        simplified version of https://pypi.python.org/pypi/hurry.filesize/
        """
        # pick the first (largest) unit that fits into the value
        for factor, suffix in units:
            if bytes >= factor:
                break
        amount = int(bytes / factor)

        # a tuple suffix holds the (singular, plural) forms
        if isinstance(suffix, tuple):
            singular, multiple = suffix
            if amount == 1:
                suffix = singular
            else:
                suffix = multiple
        return str(amount) + suffix

    print(pretty_size(1))
    print(pretty_size(42))
    print(pretty_size(4096))
    print(pretty_size(238048577))
    print(pretty_size(334073741824))
    print(pretty_size(96995116277763))
    print(pretty_size(3125899904842624))

    ## [Out] ###########################
    1 byte
    42 bytes
    4 KB
    227 MB
    311 GB
    88 TB
    2 PB

Pavan Gupta · Answer

Here is a compact function to calculate the size:

    def GetHumanReadable(size, precision=2):
        suffixes = ['B', 'KB', 'MB', 'GB', 'TB']
        suffixIndex = 0
        # >= so that exactly 1024 bytes is reported as 1.00KB rather than 1024.00B
        while size >= 1024 and suffixIndex < 4:
            suffixIndex += 1      # increment the index of the suffix
            size = size / 1024.0  # apply the division
        return "%.*f%s" % (precision, size, suffixes[suffixIndex])

For more detailed output, and for the reverse operation, please refer to: http://code.activestate.com/recipes/578019-bytes-to-human-human-to-bytes-converter/

Peter F · Answer

If you already know what you want, see below for a quick and relatively easy-to-read way of printing file sizes in a single line of code. These one-liners combine the great answer by @ccpizza above with some handy formatting tricks I read here: How to print number with commas as thousands separators?.

Bytes

print ('{:,.0f}'.format(os.path.getsize(filepath))+" B")

Kilobits

print ('{:,.0f}'.format(os.path.getsize(filepath)/float(1<<7))+" kb")

Kilobytes

print ('{:,.0f}'.format(os.path.getsize(filepath)/float(1<<10))+" KB")

Megabits

print ('{:,.0f}'.format(os.path.getsize(filepath)/float(1<<17))+" mb")

Megabytes

print ('{:,.0f}'.format(os.path.getsize(filepath)/float(1<<20))+" MB")

Gigabits

print ('{:,.0f}'.format(os.path.getsize(filepath)/float(1<<27))+" gb")

Gigabytes

print ('{:,.0f}'.format(os.path.getsize(filepath)/float(1<<30))+" GB")

Terabytes

print ('{:,.0f}'.format(os.path.getsize(filepath)/float(1<<40))+" TB")

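Obviously these assume you know roughly what size you are going to be dealing with at the outset, which in my case (video editor at South West London TV) is MB and occasionally GB for video clips.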
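The one-liners above rely on an os import and an existing filepath. A self-contained sketch, assuming Python 3.6+ for f-strings and using a throwaway temporary file of 3 MB created only for this example, could look like this:

```python
import os
import tempfile

# Create a throwaway 3 MB file (3 * 2**20 bytes) so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (3 * (1 << 20)))
    filepath = tmp.name

try:
    size = os.path.getsize(filepath)
    line_bytes = f"{size:,} B"               # thousands separators, whole bytes
    line_mb = f"{size / (1 << 20):,.0f} MB"  # same idea as the MB one-liner
    print(line_bytes)
    print(line_mb)
finally:
    os.remove(filepath)  # clean up the temporary file
```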

UPDATE USING PATHLIB

In reply to Hildy's comment, here is my suggestion for a compact pair of functions (keeping things "atomic" rather than merging them) using just the Python standard library:

    from pathlib import Path

    def get_size(path=Path('.')):
        """Get file size, or total size of the files directly inside a directory."""
        path = Path(path)  # also accept plain strings
        size = 0           # fallback if path is neither a file nor a directory
        if path.is_file():
            size = path.stat().st_size
        elif path.is_dir():
            # Note: '*.*' only matches names containing a dot and does not recurse.
            size = sum(file.stat().st_size for file in path.glob('*.*'))
        return size

    def format_size(path, unit="MB"):
        """Format the size of path in one of the common units used in computing."""
        bit_shift = {"B": 0, "kb": 7, "KB": 10, "mb": 17, "MB": 20,
                     "gb": 27, "GB": 30, "TB": 40}
        return "{:,.0f}".format(get_size(path) / float(1 << bit_shift[unit])) + " " + unit

    # Tests and test results
    >>> format_size(r"d:\media\bags of fun.avi")
    '38 MB'
    >>> format_size(r"d:\media\bags of fun.avi", "KB")
    '38,763 KB'
    >>> format_size(r"d:\media\bags of fun.avi", "kb")
    '310,104 kb'

Romeo Mihalcea · Answer

In case anyone is searching for the reverse of this problem (as I sure did), here is what works for me:

    def get_bytes(size, suffix):
        size = int(float(size))  # note: any fractional part is dropped here
        suffix = suffix.lower()

        if suffix == 'kb' or suffix == 'kib':
            return size << 10
        elif suffix == 'mb' or suffix == 'mib':
            return size << 20
        elif suffix == 'gb' or suffix == 'gib':
            return size << 30

        return False  # unknown suffix
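Because get_bytes truncates the size to an integer before shifting, an input such as '1.5' with 'MB' comes out as exactly 1 MB. If fractional inputs matter, one possible variant multiplies first and truncates afterwards. This is only a sketch: to_bytes and _FACTORS are hypothetical names, and raising ValueError for an unknown suffix is a design choice that differs from the answer above.

```python
# Hypothetical lookup table; extend it as needed.
_FACTORS = {
    "kb": 1 << 10, "kib": 1 << 10,
    "mb": 1 << 20, "mib": 1 << 20,
    "gb": 1 << 30, "gib": 1 << 30,
}

def to_bytes(size, suffix):
    """Convert a size such as ('1.5', 'MB') to a whole number of bytes."""
    factor = _FACTORS.get(suffix.lower())
    if factor is None:
        raise ValueError("unknown suffix: %r" % suffix)
    # Multiply before truncating so fractional sizes are preserved.
    return int(float(size) * factor)

print(to_bytes("1.5", "MB"))
```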

WesternGun · Answer

Here are my two cents, which permit converting both up and down and add customizable precision:

    def convertFloatToDecimal(f=0.0, precision=2):
        '''
        Convert a float to string of decimal.
        precision: by default 2.
        If no arg provided, return "0.00".
        '''
        return ("%." + str(precision) + "f") % f

    def formatFileSize(size, sizeIn, sizeOut, precision=0):
        '''
        Convert file size to a string representing its value in B, KB, MB and GB.
        The convention is based on sizeIn as original unit and sizeOut as final unit.
        '''
        assert sizeIn.upper() in {"B", "KB", "MB", "GB"}, "sizeIn type error"
        assert sizeOut.upper() in {"B", "KB", "MB", "GB"}, "sizeOut type error"
        # normalize case so the comparisons below agree with the asserts above
        sizeIn, sizeOut = sizeIn.upper(), sizeOut.upper()
        if sizeIn == "B":
            if sizeOut == "KB":
                return convertFloatToDecimal((size/1024.0), precision)
            elif sizeOut == "MB":
                return convertFloatToDecimal((size/1024.0**2), precision)
            elif sizeOut == "GB":
                return convertFloatToDecimal((size/1024.0**3), precision)
        elif sizeIn == "KB":
            if sizeOut == "B":
                return convertFloatToDecimal((size*1024.0), precision)
            elif sizeOut == "MB":
                return convertFloatToDecimal((size/1024.0), precision)
            elif sizeOut == "GB":
                return convertFloatToDecimal((size/1024.0**2), precision)
        elif sizeIn == "MB":
            if sizeOut == "B":
                return convertFloatToDecimal((size*1024.0**2), precision)
            elif sizeOut == "KB":
                return convertFloatToDecimal((size*1024.0), precision)
            elif sizeOut == "GB":
                return convertFloatToDecimal((size/1024.0), precision)
        elif sizeIn == "GB":
            if sizeOut == "B":
                return convertFloatToDecimal((size*1024.0**3), precision)
            elif sizeOut == "KB":
                return convertFloatToDecimal((size*1024.0**2), precision)
            elif sizeOut == "MB":
                return convertFloatToDecimal((size*1024.0), precision)
        # same unit in and out: only apply the formatting
        return convertFloatToDecimal(size, precision)

Add TB, etc., as you wish.
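If more units are needed, the branches can be replaced by a small table of exponents, so that adding a unit means adding one entry. The following is a sketch of that idea rather than the author's code; convert_units and _EXPONENTS are names invented for this example.

```python
# Exponent of 1024 for each unit; add "PB": 5 and so on as needed.
_EXPONENTS = {"B": 0, "KB": 1, "MB": 2, "GB": 3, "TB": 4}

def convert_units(size, unit_in, unit_out, precision=0):
    """Convert size from unit_in to unit_out and format it with the given precision."""
    # A positive difference scales up (e.g. GB -> MB), a negative one scales down.
    shift = _EXPONENTS[unit_in.upper()] - _EXPONENTS[unit_out.upper()]
    return "%.*f" % (precision, size * 1024.0 ** shift)

print(convert_units(2048, "B", "KB"))
print(convert_units(1.5, "GB", "MB", 1))
```

An unknown unit raises KeyError in this sketch instead of failing an assert.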

Keith · Answer

Here is a version that matches the output of ls -lh.

    def human_size(num: int) -> str:
        base = 1
        for unit in ['B', 'K', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y']:
            n = num / base
            if n < 9.95 and unit != 'B':
                # Less than 10 then keep 1 decimal place
                value = "{:.1f}{}".format(n, unit)
                return value
            if round(n) < 1000:
                # Less than 4 digits so use this
                value = "{}{}".format(round(n), unit)
                return value
            base *= 1024
        value = "{}{}".format(round(n), unit)
        return value

sleblanc · Answer

Here is my implementation.

    from bisect import bisect

    def to_filesize(bytes_num, si=True):
        decade = 1000 if si else 1024
        partitions = tuple(decade ** n for n in range(1, 6))
        suffixes = tuple('BKMGTP')

        # the number of thresholds at or below bytes_num selects the suffix
        i = bisect(partitions, bytes_num)
        s = suffixes[i]

        for n in range(i):
            bytes_num /= decade

        f = '{:.3f}'.format(bytes_num)

        return '{}{}'.format(f.rstrip('0').rstrip('.'), s)

It prints up to three decimals and strips trailing zeros and periods. The boolean parameter si toggles between 10-based and 2-based size magnitudes.
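The suffix lookup relies on bisect returning how many thresholds are less than or equal to the value. A minimal sketch of just that step, with sample values picked for illustration:

```python
from bisect import bisect

partitions = tuple(1000 ** n for n in range(1, 6))  # 10**3, 10**6, ..., 10**15
suffixes = "BKMGTP"

for value in (999, 1000, 1500000):
    i = bisect(partitions, value)  # number of thresholds <= value
    print(value, "->", suffixes[i])
```

Since bisect is an alias for bisect_right, a value that sits exactly on a threshold, such as 1000, already moves to the next suffix.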

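from_filesize is the counterpart of to_filesize. It allows you to write clean configuration files such as {'maximum_filesize': from_filesize('10M')}. It returns an integer that approximates the intended file size. I am not using bit shifting because the source value is a floating-point number (it will accept from_filesize('2.15M') just fine). Converting it to an integer or decimal would work, but it would make the code more complicated, and it already works as it is.

    def from_filesize(spec, si=True):
        decade = 1000 if si else 1024
        suffixes = tuple('BKMGTP')

        num = float(spec[:-1])  # numeric part, e.g. '2.15'
        s = spec[-1]            # one-letter suffix, e.g. 'M'

        i = suffixes.index(s)

        for n in range(i):
            num *= decade

        return int(num)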