How do I profile memory usage in Python?

Question

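I've recently become interested in algorithms and have begun exploring them by writing a naive implementation and then optimizing it in various ways.

I'm already familiar with the standard Python module for profiling runtime (for most things I've found the timeit magic function in IPython is sufficient), but I'm also interested in memory usage so I can explore those tradeoffs as well (e.g. the cost of caching a table of previously computed values versus recomputing them as needed). Is there a module that will profile the memory usage of a given function for me?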

Hubert · Accepted Answer

This one has already been answered here: Python memory profiler

Basically you do something like this (quoted from Guppy-PE):

    >>> from guppy import hpy; h=hpy()
    >>> h.heap()
    Partition of a set of 48477 objects. Total size = 3265516 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0  25773  53  1612820  49   1612820  49 str
         1  11699  24   483960  15   2096780  64 tuple
         2    174   0   241584   7   2338364  72 dict of module
         3   3478   7   222592   7   2560956  78 types.CodeType
         4   3296   7   184576   6   2745532  84 function
         5    401   1   175112   5   2920644  89 dict of class
         6    108   0    81888   3   3002532  92 dict (no owner)
         7    114   0    79632   2   3082164  94 dict of type
         8    117   0    51336   2   3133500  96 type
         9    667   1    24012   1   3157512  97 __builtin__.wrapper_descriptor
    <76 more rows. Type e.g. '_.more' to view.>
    >>> h.iso(1,[],{})
    Partition of a set of 3 objects. Total size = 176 bytes.
     Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
         0      1  33      136  77       136  77 dict (no owner)
         1      1  33       28  16       164  93 list
         2      1  33       12   7       176 100 int
    >>> x=[]
    >>> h.iso(x).sp
     0: h.Root.i0_modules['__main__'].__dict__['x']
    >>>

Don Kirkby · Answer

Python 3.4 includes a new module: tracemalloc. It provides detailed statistics about which code is allocating the most memory. Here's an example that displays the top three lines allocating memory.

    from collections import Counter
    import linecache
    import os
    import tracemalloc

    def display_top(snapshot, key_type='lineno', limit=3):
        # Hide allocations made by the import machinery and unknown frames.
        snapshot = snapshot.filter_traces((
            tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
            tracemalloc.Filter(False, "<unknown>"),
        ))
        top_stats = snapshot.statistics(key_type)

        print("Top %s lines" % limit)
        for index, stat in enumerate(top_stats[:limit], 1):
            frame = stat.traceback[0]
            # replace "/path/to/module/file.py" with "module/file.py"
            filename = os.sep.join(frame.filename.split(os.sep)[-2:])
            print("#%s: %s:%s: %.1f KiB"
                  % (index, filename, frame.lineno, stat.size / 1024))
            line = linecache.getline(frame.filename, frame.lineno).strip()
            if line:
                print('    %s' % line)

        other = top_stats[limit:]
        if other:
            size = sum(stat.size for stat in other)
            print("%s other: %.1f KiB" % (len(other), size / 1024))
        total = sum(stat.size for stat in top_stats)
        print("Total allocated size: %.1f KiB" % (total / 1024))


    tracemalloc.start()

    counts = Counter()
    fname = '/usr/share/dict/american-english'
    with open(fname) as words:
        words = list(words)
        for word in words:
            prefix = word[:3]
            counts[prefix] += 1
    print('Top prefixes:', counts.most_common(3))

    snapshot = tracemalloc.take_snapshot()
    display_top(snapshot)

And here are the results:

    Top prefixes: [('con', 1220), ('dis', 1002), ('pro', 809)]
    Top 3 lines
    #1: scratches/memory_test.py:37: 6527.1 KiB
        words = list(words)
    #2: scratches/memory_test.py:39: 247.7 KiB
        prefix = word[:3]
    #3: scratches/memory_test.py:40: 193.0 KiB
        counts[prefix] += 1
    4 other: 4.3 KiB
    Total allocated size: 6972.1 KiB
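For the caching trade-off mentioned in the question, tracemalloc can also compare two snapshots. The following is a minimal sketch, not part of the original answer: `build_table` is a hypothetical stand-in for a cache of precomputed values, and `Snapshot.compare_to` reports how much each source line's allocations grew between the two snapshots.

```python
import tracemalloc


def build_table(n):
    # Hypothetical cache: precompute the squares of 0..n-1.
    return {i: i * i for i in range(n)}


tracemalloc.start()
before = tracemalloc.take_snapshot()
table = build_table(100_000)
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Each StatisticDiff describes how one source line's allocations changed.
diffs = after.compare_to(before, "lineno")
growth = sum(diff.size_diff for diff in diffs)
for diff in diffs[:3]:
    print(diff)
print("Net growth: %.1f KiB" % (growth / 1024))
```

The net growth is the memory price of keeping the table around, which can then be weighed against the time saved by not recomputing.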

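When is a memory leak not a leak?

That example is great when the memory is still being held at the end of the calculation, but sometimes you have code that allocates a lot of memory and then releases it all. It's technically not a memory leak, but it's using more memory than you think it should. How can you track memory usage when it all gets released? If it's your code, you can probably add some debugging code to take snapshots while it's running. If not, you can start a background thread to monitor memory usage while the main thread runs.

Here's the previous example where the code has all been moved into the count_prefixes() function. When that function returns, all the memory is released. I also added some sleep() calls to simulate a long-running calculation.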

    from collections import Counter
    import linecache
    import os
    import tracemalloc
    from time import sleep


    def count_prefixes():
        sleep(2)  # Start up time.
        counts = Counter()
        fname = '/usr/share/dict/american-english'
        with open(fname) as words:
            words = list(words)
            for word in words:
                prefix = word[:3]
                counts[prefix] += 1
                sleep(0.0001)
        most_common = counts.most_common(3)
        sleep(3)  # Shut down time.
        return most_common


    def main():
        tracemalloc.start()

        most_common = count_prefixes()
        print('Top prefixes:', most_common)

        snapshot = tracemalloc.take_snapshot()
        display_top(snapshot)


    def display_top(snapshot, key_type='lineno', limit=3):
        snapshot = snapshot.filter_traces((
            tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
            tracemalloc.Filter(False, "<unknown>"),
        ))
        top_stats = snapshot.statistics(key_type)

        print("Top %s lines" % limit)
        for index, stat in enumerate(top_stats[:limit], 1):
            frame = stat.traceback[0]
            # replace "/path/to/module/file.py" with "module/file.py"
            filename = os.sep.join(frame.filename.split(os.sep)[-2:])
            print("#%s: %s:%s: %.1f KiB"
                  % (index, filename, frame.lineno, stat.size / 1024))
            line = linecache.getline(frame.filename, frame.lineno).strip()
            if line:
                print('    %s' % line)

        other = top_stats[limit:]
        if other:
            size = sum(stat.size for stat in other)
            print("%s other: %.1f KiB" % (len(other), size / 1024))
        total = sum(stat.size for stat in top_stats)
        print("Total allocated size: %.1f KiB" % (total / 1024))


    main()

When I run that version, the memory usage has gone from 6MB down to 4KB, because the function released all its memory when it finished.

    Top prefixes: [('con', 1220), ('dis', 1002), ('pro', 809)]
    Top 3 lines
    #1: collections/__init__.py:537: 0.7 KiB
        self.update(*args, **kwds)
    #2: collections/__init__.py:555: 0.6 KiB
        return _heapq.nlargest(n, self.items(), key=_itemgetter(1))
    #3: python3.6/heapq.py:569: 0.5 KiB
        result = [(key(elem), i, elem) for i, elem in zip(range(0, -n, -1), it)]
    10 other: 2.2 KiB
    Total allocated size: 4.0 KiB
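When only the high-water mark matters, and not which line caused it, `tracemalloc.get_traced_memory()` returns both the current and the peak traced size. This is a minimal sketch under that assumption; `count_lengths` is a hypothetical stand-in for a function that frees its working memory before returning.

```python
import tracemalloc


def count_lengths(n):
    # Stand-in workload: build a large temporary list, then discard it.
    words = ["word%d" % i for i in range(n)]
    return sum(len(w) for w in words)


tracemalloc.start()
total = count_lengths(100_000)
# current: bytes still allocated now; peak: the maximum since start().
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("current: %.1f KiB, peak: %.1f KiB" % (current / 1024, peak / 1024))
```

The peak stays high even though the list is gone by the time the function returns, which is the information a snapshot taken afterwards cannot show.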

Now, here's a version inspired by another answer that starts a second thread to monitor memory usage.

    from collections import Counter
    import linecache
    import os
    import tracemalloc
    from datetime import datetime
    from queue import Queue, Empty
    from resource import getrusage, RUSAGE_SELF
    from threading import Thread
    from time import sleep

    def memory_monitor(command_queue: Queue, poll_interval=1):
        tracemalloc.start()
        old_max = 0
        snapshot = None
        while True:
            try:
                # Any message on the queue means: report and shut down.
                command_queue.get(timeout=poll_interval)
                if snapshot is not None:
                    print(datetime.now())
                    display_top(snapshot)

                return
            except Empty:
                # No message yet: keep a snapshot whenever peak RSS grows.
                max_rss = getrusage(RUSAGE_SELF).ru_maxrss
                if max_rss > old_max:
                    old_max = max_rss
                    snapshot = tracemalloc.take_snapshot()
                    print(datetime.now(), 'max RSS', max_rss)


    def count_prefixes():
        sleep(2)  # Start up time.
        counts = Counter()
        fname = '/usr/share/dict/american-english'
        with open(fname) as words:
            words = list(words)
            for word in words:
                prefix = word[:3]
                counts[prefix] += 1
                sleep(0.0001)
        most_common = counts.most_common(3)
        sleep(3)  # Shut down time.
        return most_common


    def main():
        queue = Queue()
        poll_interval = 0.1
        monitor_thread = Thread(target=memory_monitor, args=(queue, poll_interval))
        monitor_thread.start()
        try:
            most_common = count_prefixes()
            print('Top prefixes:', most_common)
        finally:
            queue.put('stop')
            monitor_thread.join()


    def display_top(snapshot, key_type='lineno', limit=3):
        snapshot = snapshot.filter_traces((
            tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
            tracemalloc.Filter(False, "<unknown>"),
        ))
        top_stats = snapshot.statistics(key_type)

        print("Top %s lines" % limit)
        for index, stat in enumerate(top_stats[:limit], 1):
            frame = stat.traceback[0]
            # replace "/path/to/module/file.py" with "module/file.py"
            filename = os.sep.join(frame.filename.split(os.sep)[-2:])
            print("#%s: %s:%s: %.1f KiB"
                  % (index, filename, frame.lineno, stat.size / 1024))
            line = linecache.getline(frame.filename, frame.lineno).strip()
            if line:
                print('    %s' % line)

        other = top_stats[limit:]
        if other:
            size = sum(stat.size for stat in other)
            print("%s other: %.1f KiB" % (len(other), size / 1024))
        total = sum(stat.size for stat in top_stats)
        print("Total allocated size: %.1f KiB" % (total / 1024))


    main()

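The resource module reports the process's peak resident set size so far (ru_maxrss), so the monitor saves a snapshot each time that peak grows. The queue lets the main thread tell the memory monitor thread when to print its report and shut down. When it runs, it shows the memory being used by the list() call: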

    2018-05-29 10:34:34.441334 max RSS 10188
    2018-05-29 10:34:36.475707 max RSS 23588
    2018-05-29 10:34:36.616524 max RSS 38104
    2018-05-29 10:34:36.772978 max RSS 45924
    2018-05-29 10:34:36.929688 max RSS 46824
    2018-05-29 10:34:37.087554 max RSS 46852
    Top prefixes: [('con', 1220), ('dis', 1002), ('pro', 809)]
    2018-05-29 10:34:56.281262
    Top 3 lines
    #1: scratches/scratch.py:36: 6527.0 KiB
        words = list(words)
    #2: scratches/scratch.py:38: 16.4 KiB
        prefix = word[:3]
    #3: scratches/scratch.py:39: 10.1 KiB
        counts[prefix] += 1
    19 other: 10.8 KiB
    Total allocated size: 6564.3 KiB

If you're on Linux, you may find /proc/self/statm more useful than the resource module.
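Reading /proc/self/statm could look roughly like the sketch below. It is an illustration rather than code from the answer: the helper names `parse_statm` and `current_memory` are made up here, and it relies on the first three statm fields being total program size, resident set size, and shared pages, all counted in pages.

```python
import os


def parse_statm(text, page_size):
    """Convert the first three fields of a statm line from pages to bytes."""
    size, resident, shared = (int(field) for field in text.split()[:3])
    return {
        "vms": size * page_size,
        "rss": resident * page_size,
        "shared": shared * page_size,
    }


def current_memory():
    # Linux only: /proc/self/statm does not exist on macOS or Windows.
    with open("/proc/self/statm") as statm:
        return parse_statm(statm.read(), os.sysconf("SC_PAGE_SIZE"))
```

Unlike ru_maxrss, `current_memory()["rss"]` is the resident size right now, so it goes back down when memory is returned to the operating system.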

anon · Answer

For a really simple approach try:

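    import resource

    def using(point=""):
        usage = resource.getrusage(resource.RUSAGE_SELF)
        # usage[0] is user CPU time, usage[1] is system CPU time.
        # usage[2] is ru_maxrss: peak RSS in kilobytes on Linux
        # (bytes on macOS), not pages, so no page-size factor is needed.
        return '%s: usertime=%s systime=%s mem=%s mb\n' % (
            point, usage[0], usage[1], usage[2] / 1024.0)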

Just insert using("Label") where you want to see what's going on.
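One caveat with this approach is that the unit of ru_maxrss depends on the platform: Linux reports kilobytes and macOS reports bytes. A small sketch of normalizing it follows; the helper names are hypothetical, and the resource module itself is only available on Unix-like systems.

```python
import resource
import sys


def max_rss_bytes(ru_maxrss, platform=sys.platform):
    """Normalize ru_maxrss to bytes: macOS reports bytes, Linux kilobytes."""
    if platform == "darwin":
        return ru_maxrss
    return ru_maxrss * 1024


def peak_memory_mib():
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return max_rss_bytes(usage.ru_maxrss) / (1024 * 1024)


print("peak RSS: %.1f MiB" % peak_memory_mib())
```

Because ru_maxrss is a peak, the reported value never decreases during the life of the process.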

serv-inc · Answer

If you only want to look at the memory usage of an object (answer to other question):

There is a module called Pympler which contains the asizeof module.

Use it as follows:

    from pympler import asizeof
    asizeof.asizeof(my_object)

Unlike sys.getsizeof, it works for your self-created objects.

    >>> asizeof.asizeof(tuple('bcd'))
    200
    >>> asizeof.asizeof({'foo': 'bar', 'baz': 'bar'})
    400
    >>> asizeof.asizeof({})
    280
    >>> asizeof.asizeof({'foo':'bar'})
    360
    >>> asizeof.asizeof('foo')
    40
    >>> asizeof.asizeof(Bar())
    352
    >>> asizeof.asizeof(Bar().__dict__)
    280

    >>> help(asizeof.asizeof)
    Help on function asizeof in module pympler.asizeof:

    asizeof(*objs, **opts)
        Return the combined size in bytes of all objects passed as positional arguments.
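To see why sys.getsizeof alone falls short, here is a rough standard-library sketch of a recursive size. It is not Pympler's algorithm, `deep_sizeof` is a name invented for this illustration, and it only follows dicts, common containers, and instance `__dict__` attributes.

```python
import sys


def deep_sizeof(obj, seen=None):
    """Roughly sum sys.getsizeof over an object and what it contains."""
    seen = set() if seen is None else seen
    if id(obj) in seen:  # Count shared or cyclic objects only once.
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    elif hasattr(obj, "__dict__"):
        size += deep_sizeof(vars(obj), seen)
    return size


data = {"foo": "bar" * 1000, "baz": list(range(100))}
# getsizeof counts only the dict itself; deep_sizeof adds its contents.
print(sys.getsizeof(data), deep_sizeof(data))
```

For real measurements asizeof is the better choice, since it also handles slots, arrays, and many other cases this sketch ignores.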

robguinness · Answer

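Since the accepted answer and also the next-highest-voted answer have, in my opinion, some problems, I'd like to offer one more answer that is based closely on Ihor B.'s answer.

This solution allows you to run profiling either by wrapping a function call with the profile function and calling it, or by decorating your function/method with the @profile decorator.

The first technique is useful when you want to profile some third-party code without messing with its source, whereas the second technique is a bit "cleaner" and works better when you don't mind modifying the source of the function/method you want to profile.

I've also modified the output so that you get RSS, VMS, and shared memory. I don't care much about the "before" and "after" values, only the delta, so I removed those (if you're comparing to Ihor B.'s answer).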

Profiling code

    # profile.py
    import time
    import os
    import psutil
    import inspect


    def elapsed_since(start):
        #return time.strftime("%H:%M:%S", time.gmtime(time.time() - start))
        elapsed = time.time() - start
        if elapsed < 1:
            return str(round(elapsed*1000,2)) + "ms"
        if elapsed < 60:
            return str(round(elapsed, 2)) + "s"
        if elapsed < 3600:
            return str(round(elapsed/60, 2)) + "min"
        else:
            return str(round(elapsed / 3600, 2)) + "hrs"


    def get_process_memory():
        process = psutil.Process(os.getpid())
        mi = process.memory_info()
        # Note: the shared field is only provided by psutil on Linux.
        return mi.rss, mi.vms, mi.shared


    def format_bytes(bytes):
        if abs(bytes) < 1000:
            return str(bytes)+"B"
        elif abs(bytes) < 1e6:
            return str(round(bytes/1e3,2)) + "kB"
        elif abs(bytes) < 1e9:
            return str(round(bytes / 1e6, 2)) + "MB"
        else:
            return str(round(bytes / 1e9, 2)) + "GB"


    def profile(func, *args, **kwargs):
        def wrapper(*args, **kwargs):
            rss_before, vms_before, shared_before = get_process_memory()
            start = time.time()
            result = func(*args, **kwargs)
            elapsed_time = elapsed_since(start)
            rss_after, vms_after, shared_after = get_process_memory()
            print("Profiling: {:>20}  RSS: {:>8} | VMS: {:>8} | SHR {:>8} | time: {:>8}"
                .format("<" + func.__name__ + ">",
                        format_bytes(rss_after - rss_before),
                        format_bytes(vms_after - vms_before),
                        format_bytes(shared_after - shared_before),
                        elapsed_time))
            return result
        if inspect.isfunction(func):
            # Plain function: return the wrapper so it can be called later.
            return wrapper
        elif inspect.ismethod(func):
            # Bound method: profile it immediately with the given arguments.
            return wrapper(*args,**kwargs)

Example usage, assuming the above code is saved as `profile.py`:

    from profile import profile
    from time import sleep
    from sklearn import datasets # Just an example of 3rd party function call


    # Method 1
    run_profiling = profile(datasets.load_digits)
    data = run_profiling()

    # Method 2
    @profile
    def my_function():
        # do some stuff
        a_list = []
        for i in range(1,100000):
            a_list.append(i)
        return a_list


    res = my_function()

This should result in output similar to the below:

    Profiling:        <load_digits>  RSS:   5.07MB | VMS:   4.91MB | SHR  73.73kB | time:  89.99ms
    Profiling:        <my_function>  RSS:   1.06MB | VMS:   1.35MB | SHR      0B | time:   8.43ms

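A couple of important final notes:

Keep in mind that this method of profiling is only approximate, since lots of other things might be happening on the machine. Due to garbage collection and other factors, the deltas might even be zero.
For some unknown reason, very short function calls (e.g. 1 or 2 ms) show up with zero memory usage. I suspect this is some limitation of the hardware/OS (tested on a basic laptop with Linux) on how often memory statistics are updated.
To keep the examples simple, I didn't use any function arguments, but they should work as one would expect. For a plain function, the arguments go to the wrapper that profile returns, so profile(my_function)(arg) profiles my_function(arg); for a bound method, profile(obj.method, arg) profiles obj.method(arg) immediately.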
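Because process-level deltas can come out as zero for short calls, a decorator built on tracemalloc is one possible alternative: it counts the allocations made through Python's allocator during the call instead of sampling the process. This is a sketch of a different technique than the psutil-based code above; `trace_allocations` and its `last_stats` attribute are hypothetical names, and it assumes tracing is not already running elsewhere.

```python
import functools
import tracemalloc


def trace_allocations(func):
    """Report net and peak Python allocations made while func runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        try:
            result = func(*args, **kwargs)
            # Measured while result is still alive, so it counts as current.
            current, peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        wrapper.last_stats = {"current": current, "peak": peak}
        print("%s: net %.1f KiB, peak %.1f KiB"
              % (func.__name__, current / 1024, peak / 1024))
        return result
    wrapper.last_stats = None
    return wrapper


@trace_allocations
def build_list(n):
    return list(range(n))


res = build_list(100_000)
```

Tracing slows the call down noticeably, so the timing of a traced call should not be compared with an untraced one.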

madjardi · Answer

Maybe it helps:
<see additional>

    pip install gprof2dot
    sudo apt-get install graphviz

    gprof2dot -f pstats profile_for_func1_001 | dot -Tpng -o profile.png

    import cProfile

    def profileit(name):
        """
        @profileit("profile_for_func1_001")
        """
        def inner(func):
            def wrapper(*args, **kwargs):
                prof = cProfile.Profile()
                retval = prof.runcall(func, *args, **kwargs)
                # Note use of name from outer scope
                prof.dump_stats(name)
                return retval
            return wrapper
        return inner

    @profileit("profile_for_func1_001")
    def func1(...)

Ihor B. · Answer

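Below is a simple function decorator which lets you track how much memory the process consumed before the function call, after the function call, and what the difference is: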

    import time
    import os
    import psutil


    def elapsed_since(start):
        return time.strftime("%H:%M:%S", time.gmtime(time.time() - start))


    def get_process_memory():
        process = psutil.Process(os.getpid())
        # memory_info() replaces get_memory_info(), removed in psutil 3.0.
        return process.memory_info().rss


    def profile(func):
        def wrapper(*args, **kwargs):
            mem_before = get_process_memory()
            start = time.time()
            result = func(*args, **kwargs)
            elapsed_time = elapsed_since(start)
            mem_after = get_process_memory()
            print("{}: memory before: {:,}, after: {:,}, consumed: {:,}; exec time: {}".format(
                func.__name__,
                mem_before, mem_after, mem_after - mem_before,
                elapsed_time))
            return result
        return wrapper

Here is my blog, which describes all the details. (archived link)
