How to combine Python asyncio with threads?

Question

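I have built a RESTful microservice with Python asyncio and aiohttp that listens for POST events in order to collect real-time events from various feeders.

It then builds an in-memory structure that caches the last 24 hours of events in a nested defaultdict/deque structure.

Now I would like to periodically checkpoint that structure to disk, preferably using pickle.

Since the in-memory structure can exceed 100 MB, I would like to avoid holding up incoming event processing for the time it takes to checkpoint the structure.

I'd rather create a snapshot copy of the structure (e.g. with deepcopy), take my time writing it to disk, and repeat that at a preset time interval.

I have been searching for examples of how to combine threads with asyncio for that purpose (and is a thread even the best solution here?), but could not find anything that helped.

Any pointers to get started are much appreciated!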

dano · Accepted Answer

It's pretty simple to delegate a method to a thread or sub-process using BaseEventLoop.run_in_executor:

    import asyncio
    import time
    from concurrent.futures import ProcessPoolExecutor


    def cpu_bound_operation(x):
        time.sleep(x)  # This is some operation that is CPU-bound


    # async def/await replaces @asyncio.coroutine/yield from,
    # which were removed in Python 3.11
    async def main():
        loop = asyncio.get_running_loop()
        # Create a ProcessPool with 2 processes
        with ProcessPoolExecutor(2) as p:
            # Run cpu_bound_operation in the ProcessPoolExecutor
            # This will make your coroutine block, but won't block
            # the event loop; other coroutines can run in meantime.
            await loop.run_in_executor(p, cpu_bound_operation, 5)


    # The guard keeps worker processes from re-running this on import
    if __name__ == "__main__":
        asyncio.run(main())

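As for whether to use a ProcessPoolExecutor or a ThreadPoolExecutor, that's kind of hard to say. Pickling a large object will definitely eat some CPU cycles, which initially would make you think ProcessPoolExecutor is the way to go. However, passing your 100 MB object to a Process in the pool would require pickling the instance in your main process, sending the bytes to the child process via IPC, unpickling it in the child, and then pickling it again so you can write it to disk. Given that, my guess is that the pickling/unpickling overhead will be large enough that you're better off using a ThreadPoolExecutor, even though you're going to take a performance hit because of the GIL.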
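To make the thread-based option concrete, here is a minimal sketch of the periodic checkpoint described in the question, assuming a ThreadPoolExecutor. The names `new_cache`, `write_snapshot`, `checkpoint_once`, `checkpoint_forever` and the `path`/`interval` parameters are hypothetical and not from the original post.

```python
import asyncio
import copy
import os
import pickle
import tempfile
from collections import defaultdict, deque
from functools import partial


def new_cache():
    # Hypothetical cache layout: feeder -> event type -> deque of events.
    # partial() is used instead of a lambda so the structure stays picklable.
    return defaultdict(partial(defaultdict, deque))


def write_snapshot(snapshot, path):
    """Blocking: pickle the snapshot to a temp file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(snapshot, f, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp_path, path)  # readers never see a half-written file


async def checkpoint_once(cache, path, executor):
    # deepcopy runs on the event loop thread, so no coroutine can mutate the
    # cache mid-copy; only pickling and disk I/O go to the worker thread.
    snapshot = copy.deepcopy(cache)
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(executor, write_snapshot, snapshot, path)


async def checkpoint_forever(cache, path, interval, executor):
    # Intended to run as a background task next to the request handlers.
    while True:
        await asyncio.sleep(interval)
        await checkpoint_once(cache, path, executor)
```

Note that in this sketch the deepcopy itself still blocks the event loop for as long as the copy takes; only the pickling and the write are moved off the loop.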

That said, it's very simple to test both ways and find out for sure, so you might as well do that.
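One rough way to run that comparison is sketched below. It assumes that pickling a stand-in payload is representative of the real checkpoint work; `pickle_size`, `time_executor`, `compare` and the sample `data` are hypothetical names. The process-pool timing includes worker start-up and the cost of shipping the object to the child, which is part of what is being measured here.

```python
import asyncio
import pickle
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def pickle_size(obj):
    """Stand-in for the checkpoint work: pickle obj and return the byte count."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


async def time_executor(executor, obj):
    # Returns (elapsed seconds, pickled size) for one run in the given executor.
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    size = await loop.run_in_executor(executor, pickle_size, obj)
    return time.perf_counter() - start, size


async def compare(obj):
    results = {}
    for name, factory in (("thread", ThreadPoolExecutor),
                          ("process", ProcessPoolExecutor)):
        with factory(1) as executor:
            results[name] = await time_executor(executor, obj)
    return results


if __name__ == "__main__":
    # Hypothetical payload; substitute a realistic copy of the cache.
    data = {i: list(range(100)) for i in range(10_000)}
    for name, (seconds, size) in asyncio.run(compare(data)).items():
        print(f"{name}: {seconds:.3f}s for {size} bytes")
```

A single run is noisy, so repeating the measurement with a payload close to the real structure's size gives a more trustworthy answer.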

enigmaticPhysicist · Answer

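I also used run_in_executor, but I found this function kind of gross under most circumstances, since it requires partial() for keyword arguments and I'm never calling it with anything other than a single executor and the default event loop. So I made a convenience wrapper around it with sensible defaults and automatic keyword argument handling.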

    from time import sleep
    import asyncio as aio

    # new_event_loop() instead of get_event_loop(): recent Python versions
    # no longer create a loop implicitly outside of a running coroutine
    loop = aio.new_event_loop()


    class Executor:
        """In most cases, you can just use the 'execute' instance as a
        function, i.e. y = await execute(f, a, b, k=c) => run f(a, b, k=c) in
        the executor, assign result to y. The defaults can be changed, though,
        with your own instantiation of Executor, i.e. execute =
        Executor(nthreads=4)"""

        def __init__(self, loop=loop, nthreads=1):
            from concurrent.futures import ThreadPoolExecutor
            self._ex = ThreadPoolExecutor(nthreads)
            self._loop = loop

        def __call__(self, f, *args, **kw):
            from functools import partial
            # partial() binds the keyword arguments, which
            # run_in_executor cannot forward on its own
            return self._loop.run_in_executor(self._ex, partial(f, *args, **kw))


    execute = Executor()

    ...

    def cpu_bound_operation(t, alpha=30):
        sleep(t)
        return 20*alpha

    async def main():
        y = await execute(cpu_bound_operation, 5, alpha=-2)

    loop.run_until_complete(main())
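For context on the partial() point in this answer, here is a small sketch of what the bare call looks like without a wrapper. The `scale` function is a hypothetical placeholder. It also shows `asyncio.to_thread` (available from Python 3.9), which accepts keyword arguments directly and may be enough when the default executor is acceptable.

```python
import asyncio
from functools import partial


def scale(value, factor=1):
    return value * factor


async def main():
    loop = asyncio.get_running_loop()
    # run_in_executor(executor, func, *args) forwards positional arguments
    # only, so the keyword argument is bound with functools.partial first.
    # Passing None as the executor selects the loop's default executor.
    via_partial = await loop.run_in_executor(None, partial(scale, 5, factor=-2))
    # asyncio.to_thread forwards keyword arguments itself.
    via_to_thread = await asyncio.to_thread(scale, 5, factor=-2)
    return via_partial, via_to_thread


results = asyncio.run(main())
```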