Is it possible to multiprocess a function that returns something in Python?

Question

I have seen many examples where Python multiprocessing is used, but the target only prints something. My scenario is different: the target returns two variables, which I need to use later.

    import multiprocessing

    def foo(some_args):
        a = someObject
        b = someObject
        return a, b

    p1 = multiprocessing.Process(target=foo, args=(some_args,))
    p2 = multiprocessing.Process(target=foo, args=(some_args,))
    p3 = multiprocessing.Process(target=foo, args=(some_args,))

Now what? I can call .start and .join, but how do I retrieve the individual results? I need to capture the returned a, b for every job I run and then work with them.

Eli Bendersky · Accepted Answer

Yes, sure. There are several ways to do it. One of the easiest is a shared Queue. See an example here: http://eli.thegreenplace.net/2012/01/16/python-parallelizing-cpu-bound-tasks-with-multiprocessing/
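A minimal sketch of the shared-Queue approach, assuming a hypothetical worker `foo` that returns two values (the square and cube of its argument, chosen only for illustration). The helper names `run_foo` and `collect` are also assumptions, not taken from the linked article. Each process puts a tagged result on the queue and the parent collects the results before joining:

```python
from multiprocessing import Process, Queue


def foo(x):
    """Hypothetical worker: returns two values, like the (a, b) in the question."""
    a = x ** 2
    b = x ** 3
    return a, b


def run_foo(job_id, x, out_queue):
    # A Process discards its target's return value, so send it back explicitly.
    out_queue.put((job_id, foo(x)))


def collect(inputs):
    """Start one process per input and return {job_id: (a, b)}."""
    out_queue = Queue()
    procs = [Process(target=run_foo, args=(i, x, out_queue))
             for i, x in enumerate(inputs)]
    for p in procs:
        p.start()
    # Drain the queue before join(): a child may block on exit until the
    # data it queued has been consumed.
    results = dict(out_queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    results = collect([1, 2, 3])
    for job_id in sorted(results):
        a, b = results[job_id]
        print(job_id, a, b)
```

Tagging each result with a job id matters because the queue delivers results in completion order, not in start order.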

Mike McKerns · Answer

You are trying to do embarrassingly parallel work using multiple processes, so why not use a Pool? A Pool takes care of starting the processes, retrieving the results, and returning the results to you. Here I use pathos, which has a fork of multiprocessing, because it has much better serialization than the version the standard library provides.

    >>> from pathos.multiprocessing import ProcessingPool as Pool
    >>>
    >>> def foo(obj1, obj2):
    ...     a = obj1.x**2
    ...     b = obj2.x**2
    ...     return a,b
    ...
    >>> class Bar(object):
    ...     def __init__(self, x):
    ...         self.x = x
    ...
    >>> res = Pool().map(foo, [Bar(1),Bar(2),Bar(3)], [Bar(4),Bar(5),Bar(6)])
    >>> res
    [(1, 16), (4, 25), (9, 36)]

You can see that foo takes two arguments and returns a tuple of two objects. The map method of Pool submits foo to the underlying processes and returns the results as res.

You can get pathos here: https://github.com/uqfoundation
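If installing pathos is not an option, roughly the same pattern can be sketched with the standard library's multiprocessing.Pool. This is a variant under stated assumptions, not the code from the answer. The stdlib Pool.map takes a single iterable, so Pool.starmap (Python 3.3+) is used to unpack the argument pairs. Bar and foo are defined at module level so that the stdlib pickler can serialize them:

```python
from multiprocessing import Pool


class Bar(object):
    def __init__(self, x):
        self.x = x


def foo(obj1, obj2):
    a = obj1.x ** 2
    b = obj2.x ** 2
    return a, b


if __name__ == "__main__":
    firsts = [Bar(1), Bar(2), Bar(3)]
    seconds = [Bar(4), Bar(5), Bar(6)]
    with Pool(processes=2) as pool:
        # starmap unpacks each (obj1, obj2) pair into foo's two parameters
        # and returns the results in input order.
        res = pool.starmap(foo, zip(firsts, seconds))
    print(res)
```

The stdlib version cannot pickle functions or classes defined interactively or as lambdas, which is the serialization limitation that pathos is meant to lift.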

cha0site · Answer

I can't give you a direct link to it, so I'm copying this example straight from the docs. Note that it prints the results from done_queue, but you can do whatever you like with them.

    #
    # Simple example which uses a pool of workers to carry out some tasks.
    #
    # Notice that the results will probably not come out of the output
    # queue in the same order as the corresponding tasks were
    # put on the input queue.  If it is important to get the results back
    # in the original order then consider using `Pool.map()` or
    # `Pool.imap()` (which will save on the amount of code needed anyway).
    #
    # Copyright (c) 2006-2008, R Oudkerk
    # All rights reserved.
    #

    import time
    import random

    from multiprocessing import Process, Queue, current_process, freeze_support

    #
    # Function run by worker processes
    #

    def worker(input, output):
        for func, args in iter(input.get, 'STOP'):
            result = calculate(func, args)
            output.put(result)

    #
    # Function used to calculate result
    #

    def calculate(func, args):
        result = func(*args)
        return '%s says that %s%s = %s' % \
            (current_process().name, func.__name__, args, result)

    #
    # Functions referenced by tasks
    #

    def mul(a, b):
        time.sleep(0.5*random.random())
        return a * b

    def plus(a, b):
        time.sleep(0.5*random.random())
        return a + b

    #
    #
    #

    def test():
        NUMBER_OF_PROCESSES = 4
        TASKS1 = [(mul, (i, 7)) for i in range(20)]
        TASKS2 = [(plus, (i, 8)) for i in range(10)]

        # Create queues
        task_queue = Queue()
        done_queue = Queue()

        # Submit tasks
        for task in TASKS1:
            task_queue.put(task)

        # Start worker processes
        for i in range(NUMBER_OF_PROCESSES):
            Process(target=worker, args=(task_queue, done_queue)).start()

        # Get and print results
        print('Unordered results:')
        for i in range(len(TASKS1)):
            print('\t', done_queue.get())

        # Add more tasks using `put()`
        for task in TASKS2:
            task_queue.put(task)

        # Get and print some more results
        for i in range(len(TASKS2)):
            print('\t', done_queue.get())

        # Tell child processes to stop
        for i in range(NUMBER_OF_PROCESSES):
            task_queue.put('STOP')


    if __name__ == '__main__':
        freeze_support()
        test()

This is originally from the multiprocessing module docs.
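As the comment at the top of that example warns, results leave the output queue in completion order. One way to recover the submission order while still using queues is to tag each task with its index. The sketch below is not part of the documentation example; the task function `square` and the helper `run_ordered` are hypothetical names used only for illustration:

```python
from multiprocessing import Process, Queue


def square(x):
    """Hypothetical task function used only for illustration."""
    return x * x


def worker(task_queue, done_queue):
    # Each task carries its index so the parent can restore the order.
    for index, arg in iter(task_queue.get, "STOP"):
        done_queue.put((index, square(arg)))


def run_ordered(args, num_processes=2):
    args = list(args)
    task_queue, done_queue = Queue(), Queue()
    for item in enumerate(args):
        task_queue.put(item)
    procs = [Process(target=worker, args=(task_queue, done_queue))
             for _ in range(num_processes)]
    for p in procs:
        p.start()
    # Collect exactly one tagged result per submitted task.
    tagged = [done_queue.get() for _ in args]
    # Tell the workers to stop, then wait for them to exit.
    for _ in procs:
        task_queue.put("STOP")
    for p in procs:
        p.join()
    # Sorting by the index tag puts results back in submission order.
    return [value for _, value in sorted(tagged)]


if __name__ == "__main__":
    print(run_ordered([3, 1, 2]))
```

When ordering is all that is needed, `Pool.map()` or `Pool.imap()` does the same job with less code, as the original comment suggests.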

Kanan Rahimov · Answer

Why does nobody use the callback of multiprocessing.Pool?

Example:

    import time
    from multiprocessing import Pool
    from contextlib import contextmanager
    from pprint import pprint
    from requests import get as get_page

    # Placeholder values; set these for your own workload.
    WORKERS = 4
    URLS = ["https://example.com/"]

    @contextmanager
    def _terminating(thing):
        try:
            yield thing
        finally:
            thing.terminate()

    def _callback(*args, **kwargs):
        print("CALLBACK")
        pprint(args)
        pprint(kwargs)

    print("Processing...")
    with _terminating(Pool(processes=WORKERS)) as pool:
        results = pool.map_async(get_page, URLS, callback=_callback)

        start_time = time.time()
        results.wait()
        end_time = time.time()
        print("Time for Processing: %ssecs" % (end_time - start_time))

Here I print both args and kwargs, but you can replace the callback with something like this:

    def _callback2(responses):
        for r in responses:
            print(r.status_code)  # or do whatever with response...
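The callback mechanism can also be tried without network access. In this sketch, `foo` is a hypothetical stand-in for the question's function and `on_result` is an illustrative callback name. `apply_async` is used instead of `map_async` so that the callback fires once per job with that job's `(a, b)` tuple:

```python
from multiprocessing import Pool

collected = []  # filled in the parent process by the callback


def foo(x):
    """Hypothetical stand-in for the question's function; returns two values."""
    return x + 1, x * 2


def on_result(result):
    # Pool invokes the callback in the parent process with the worker's
    # return value as its single argument.
    collected.append(result)


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        pending = [pool.apply_async(foo, (x,), callback=on_result)
                   for x in range(4)]
        for job in pending:
            job.wait()  # block until each job has finished
    print(sorted(collected))
```

Callbacks should return quickly, because the pool's result handling is blocked while one is running.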

Jakob Bowyer · Answer

It doesn't work on Windows, but here is my multiprocessing decorator for functions. It returns a queue that you can poll to collect the returned data.

    import os
    from multiprocessing import Process, Queue

    def returning_wrapper(func, *args, **kwargs):
        # Pop the queue out of kwargs, then send func's return value through it.
        queue = kwargs.pop("multiprocess_returnable")
        queue.put(func(*args, **kwargs))

    class Multiprocess(object):
        """Cute decorator to run a function in multiple processes."""
        def __init__(self, func):
            self.func = func
            self.processes = []

        def __call__(self, *args, **kwargs):
            num_processes = kwargs.pop("multiprocess_num_processes", 2)  # default to two processes.
            # Default to a multiprocessing.Queue; a thread-only queue would
            # not carry data back from the child processes.
            return_obj = kwargs.get("multiprocess_returnable", Queue())
            kwargs["multiprocess_returnable"] = return_obj
            for i in range(num_processes):
                pro = Process(target=returning_wrapper, args=tuple([self.func] + list(args)), kwargs=kwargs)
                self.processes.append(pro)
                pro.start()
            return return_obj

    @Multiprocess
    def info():
        print('module name:', __name__)
        print('parent process:', os.getppid())
        print('process id:', os.getpid())
        return 4 * 22

    data = info()
    print(data.get())  # blocks until a worker has put its result

Farsheed · Answer

Here is an example of a multiprocess search in a huge file: