Making an asynchronous task in Flask

Question

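I am writing an application in Flask, which works really well except that WSGI is synchronous and blocking. I have one task in particular that calls out to a third-party API, and that task can take several minutes to complete. I would like to make that call (it is actually a series of calls) and let it run while control is returned to Flask.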

My view looks like this:

    @app.route('/render/<id>', methods=['POST'])
    def render_script(id=None):
        ...
        data = json.loads(request.data)
        text_list = data.get('text_list')
        final_file = audio_class.render_audio(data=text_list)
        # do stuff
        return Response(
            mimetype='application/json',
            status=200
        )

Now, what I want to do is have the line

    final_file = audio_class.render_audio()

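run and provide a callback to be executed when the method returns, while Flask continues to process requests. This is the only task I need Flask to run asynchronously, and I would like some advice on how best to implement it.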

I have looked at Twisted and Klein, but I am not sure whether they are overkill, since threading might be enough. Or would Celery be a good choice for this?

Connie · Accepted Answer

I would use Celery to handle the asynchronous task for you. You will need to install a broker to serve as your task queue (RabbitMQ and Redis are recommended).

app.py:

    import json

    from flask import Flask, Response, request
    from celery import Celery

    broker_url = 'amqp://guest@localhost'  # Broker URL for RabbitMQ task queue

    app = Flask(__name__)
    celery = Celery(app.name, broker=broker_url)
    celery.config_from_object('celeryconfig')  # Your celery configurations in a celeryconfig.py

    @celery.task(bind=True)
    def some_long_task(self, x, y):
        # Do some long task
        ...

    @app.route('/render/<id>', methods=['POST'])
    def render_script(id=None):
        ...
        data = json.loads(request.data)
        text_list = data.get('text_list')
        final_file = audio_class.render_audio(data=text_list)
        some_long_task.delay(x, y)  # Call your async task and pass whatever necessary variables
        return Response(
            mimetype='application/json',
            status=200
        )
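The `celeryconfig` module loaded by `config_from_object` is not shown in the answer. As a rough sketch of what such a file might contain, assuming Celery 4 or later (lowercase setting names) and JSON-serializable task arguments; every value here is illustrative and not taken from the answer:

```python
# celeryconfig.py (illustrative values, not the answer author's settings)

# Where task results are stored; 'rpc://' sends them back over AMQP.
# This choice is an assumption; a Redis or database backend also works.
result_backend = 'rpc://'

# Serialize task arguments and results as JSON only.
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']

# Keep all timestamps in UTC.
timezone = 'UTC'
enable_utc = True
```

The broker URL is already passed to `Celery(...)` in app.py, so it is left out here.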

Run your Flask app, and start another process to run your Celery worker.

    $ celery -A app.celery worker --loglevel=debug

I would also refer you to Miguel Grinberg's write up for a more in-depth guide to using Celery with Flask.

Jurgen Strydom · Answer

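Threading is another possible solution. The Celery-based solution is better for applications at scale, but if you are not expecting too much traffic on the endpoint in question, threading is a viable alternative.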
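To make the idea concrete before the full example, here is a minimal standard-library sketch of running a function in a thread and invoking a callback when it returns. The helper name `run_in_background` and the toy workload are hypothetical and only stand in for something like the question's `render_audio` call:

```python
import threading


def run_in_background(func, callback, *args, **kwargs):
    """Run func(*args, **kwargs) in a thread, then pass its result to callback."""
    def worker():
        result = func(*args, **kwargs)
        callback(result)

    # daemon=True so a stuck task does not keep the process alive on shutdown.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


# Toy usage: the "long task" upper-cases a string, the callback stores the result.
results = []
thread = run_in_background(lambda text: text.upper(), results.append, 'hello')
thread.join()  # only needed here so the script waits before printing
print(results)
```

In a view function you would return the response right after `run_in_background(...)` instead of joining the thread.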

This solution is based on Miguel Grinberg's PyCon 2016 Flask at Scale presentation, specifically slide 41 in his slide deck. His code is also available on GitHub for those interested in the original source.

From a user's perspective the code works as follows:

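1. You make a call to the endpoint that performs the long-running task.
2. This endpoint returns 202 Accepted with a link to check on the task status.
3. Calls to the status link return 202 while the task is still running, and return 200 (and the result) when the task is complete.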
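Seen from the client, the steps above amount to a polling loop. The sketch below is one way to write it; `wait_for_result` and `fetch` are hypothetical names, and `fetch` stands in for an HTTP GET on the status link that returns a `(status_code, body)` pair, so the loop can be tried without a running server:

```python
import time
from typing import Callable, Tuple


def wait_for_result(fetch: Callable[[], Tuple[int, str]],
                    interval: float = 1.0,
                    max_attempts: int = 30) -> str:
    """Poll the status link until it stops answering 202."""
    for _ in range(max_attempts):
        status, body = fetch()
        if status == 202:
            # Task still running: wait, then ask again.
            time.sleep(interval)
            continue
        if status == 200:
            return body
        raise RuntimeError(f"task failed with HTTP {status}")
    raise TimeoutError("task did not finish in time")


# Simulated status endpoint: two 202 answers, then the finished result.
_responses = iter([(202, ''), (202, ''), (200, 'The answer is: foo')])
result = wait_for_result(lambda: next(_responses), interval=0.01)
print(result)
```

With a real server, `fetch` would issue the GET against the URL from the `Location` header of the initial 202 response.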

To convert an API call to a background task, simply add the @async_api decorator.
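Stripped of Flask, the core of such a decorator is small: start the wrapped function in a thread, record it under a fresh id, and hand the id back immediately. The following is a simplified sketch of that idea, not the decorator from the answer; `background` and `slow_add` are hypothetical names:

```python
import threading
import uuid
from functools import wraps

tasks = {}  # task_id -> {'thread': Thread, 'return_value': result once done}


def background(func):
    """Flask-free stand-in for a decorator like async_api."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        task_id = uuid.uuid4().hex

        def run():
            # Store the result so a later status lookup can find it.
            tasks[task_id]['return_value'] = func(*args, **kwargs)

        # Register the task before starting it, so run() finds its entry.
        tasks[task_id] = {'thread': threading.Thread(target=run)}
        tasks[task_id]['thread'].start()
        return task_id  # the caller gets an id right away

    return wrapper


@background
def slow_add(x, y):
    return x + y


task_id = slow_add(2, 3)
tasks[task_id]['thread'].join()  # a real client would poll instead of joining
print(tasks[task_id]['return_value'])
```

A task counts as finished once the `'return_value'` key is present, which is the same check a status endpoint can use.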

Here is a fully contained example:

    from flask import Flask, g, abort, current_app, request, url_for
    from werkzeug.exceptions import HTTPException, InternalServerError
    from flask_restful import Resource, Api
    from datetime import datetime
    from functools import wraps
    import threading
    import time
    import uuid

    tasks = {}

    app = Flask(__name__)
    api = Api(app)


    # Note: before_first_request was removed in Flask 2.3, so this example
    # targets earlier Flask versions.
    @app.before_first_request
    def before_first_request():
        """Start a background thread that cleans up old tasks."""
        def clean_old_tasks():
            """
            This function cleans up old tasks from our in-memory data structure.
            """
            global tasks
            while True:
                # Only keep tasks that are running or that finished less than 5
                # minutes ago.
                five_min_ago = datetime.timestamp(datetime.utcnow()) - 5 * 60
                tasks = {task_id: task for task_id, task in tasks.items()
                         if 'completion_timestamp' not in task
                         or task['completion_timestamp'] > five_min_ago}
                time.sleep(60)

        if not current_app.config['TESTING']:
            thread = threading.Thread(target=clean_old_tasks)
            thread.start()


    def async_api(wrapped_function):
        @wraps(wrapped_function)
        def new_function(*args, **kwargs):
            def task_call(flask_app, environ):
                # Create a request context similar to that of the original request
                # so that the task can have access to flask.g, flask.request, etc.
                with flask_app.request_context(environ):
                    try:
                        tasks[task_id]['return_value'] = wrapped_function(*args, **kwargs)
                    except HTTPException as e:
                        tasks[task_id]['return_value'] = current_app.handle_http_exception(e)
                    except Exception as e:
                        # The function raised an exception, so we set a 500 error
                        tasks[task_id]['return_value'] = InternalServerError()
                        if current_app.debug:
                            # We want to find out if something happened so reraise
                            raise
                    finally:
                        # We record the time of the response, to help in garbage
                        # collecting old tasks
                        tasks[task_id]['completion_timestamp'] = datetime.timestamp(datetime.utcnow())

                        # close the database session (if any)

            # Assign an id to the asynchronous task
            task_id = uuid.uuid4().hex

            # Record the task, and then launch it
            tasks[task_id] = {'task_thread': threading.Thread(
                target=task_call, args=(current_app._get_current_object(),
                                        request.environ))}
            tasks[task_id]['task_thread'].start()

            # Return a 202 response, with a link that the client can use to
            # obtain task status
            print(url_for('gettaskstatus', task_id=task_id))
            return 'accepted', 202, {'Location': url_for('gettaskstatus', task_id=task_id)}
        return new_function


    class GetTaskStatus(Resource):
        def get(self, task_id):
            """
            Return status about an asynchronous task. If this request returns a 202
            status code, it means that task hasn't finished yet. Else, the response
            from the task is returned.
            """
            task = tasks.get(task_id)
            if task is None:
                abort(404)
            if 'return_value' not in task:
                return '', 202, {'Location': url_for('gettaskstatus', task_id=task_id)}
            return task['return_value']


    class CatchAll(Resource):
        @async_api
        def get(self, path=''):
            # perform some intensive processing
            print("starting processing task, path: '%s'" % path)
            time.sleep(10)
            print("completed processing task, path: '%s'" % path)
            return f'The answer is: {path}'


    api.add_resource(CatchAll, '/<path:path>', '/')
    api.add_resource(GetTaskStatus, '/status/<task_id>')


    if __name__ == '__main__':
        app.run(debug=True)