TensorFlow/Keras multi-threaded model fitting

Question

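I'm trying to train multiple Keras models with different parameter values using multiple threads (with the TensorFlow backend). I've seen a few examples of using the same model within multiple threads, but in this particular case I run into various errors about conflicting graphs and the like. Here is a simple example of what I'd like to be able to do: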

    from concurrent.futures import ThreadPoolExecutor

    import numpy as np
    import tensorflow as tf
    from keras import backend as K
    from keras.layers import Dense
    from keras.models import Sequential

    sess = tf.Session()


    def example_model(size):
        model = Sequential()
        model.add(Dense(size, input_shape=(5,)))
        model.add(Dense(1))
        model.compile(optimizer='sgd', loss='mse')
        return model


    if __name__ == '__main__':
        K.set_session(sess)
        X = np.random.random((10, 5))
        y = np.random.random((10, 1))
        models = [example_model(i) for i in range(5, 10)]

        e = ThreadPoolExecutor(4)
        res_list = [e.submit(model.fit, X, y) for model in models]

        for res in res_list:
            print(res.result())

The resulting error is:

    ValueError: Tensor("Variable:0", shape=(5, 5), dtype=float32_ref) must be from the same graph as Tensor("Variable_2/read:0", shape=(), dtype=float32).

I've also tried initializing the models inside the threads, which fails with a similar error.

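Any thoughts on the best way to go about this? I'm not at all attached to this exact structure, but I'd like to be able to use multiple threads rather than processes so that all the models are trained within the same GPU memory allocation.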

dkamm · Accepted Answer

TensorFlow graphs are not thread-safe (see https://www.tensorflow.org/api_docs/python/tf/Graph), and when you create a new TensorFlow session, it uses the default graph unless you pass it a different one.

You can get around this by creating a new session with a new graph inside your parallelized function and building your Keras model there.

Here is some code that creates and fits a model on each available GPU in parallel:

    import concurrent.futures

    import numpy as np
    import keras.backend as K
    from keras.layers import Dense
    from keras.models import Sequential
    import tensorflow as tf
    from tensorflow.python.client import device_lib


    def get_available_gpus():
        # Names of the GPU devices visible to TensorFlow.
        local_device_protos = device_lib.list_local_devices()
        return [x.name for x in local_device_protos if x.device_type == 'GPU']


    xdata = np.random.randn(100, 8)
    ytrue = np.random.randint(0, 2, 100)


    def fit(gpu):
        # Each call gets its own graph and session, so threads share no graph state.
        with tf.Session(graph=tf.Graph()) as sess:
            K.set_session(sess)
            with tf.device(gpu):
                model = Sequential()
                model.add(Dense(12, input_dim=8, activation='relu'))
                model.add(Dense(8, activation='relu'))
                model.add(Dense(1, activation='sigmoid'))

                model.compile(loss='binary_crossentropy', optimizer='adam')
                model.fit(xdata, ytrue, verbose=0)

                return model.evaluate(xdata, ytrue, verbose=0)


    gpus = get_available_gpus()
    # One worker thread per GPU; each thread runs fit() for one device.
    with concurrent.futures.ThreadPoolExecutor(len(gpus)) as executor:
        results = [x for x in executor.map(fit, gpus)]
    print('results: ', results)
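The code above maps over GPUs, while the question maps over layer sizes. As a rough sketch of how the same idea might be adapted to that case, the snippet below builds each model inside its own graph and session, one per call. The names fit_one and train_in_threads are hypothetical, and it assumes the same TensorFlow 1.x-era API (tf.Session) and standalone Keras used in the question; it has not been checked against other versions.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_one(size, X, y):
    """Build and fit one model in a graph and session private to this call."""
    # Imported inside the function (an illustrative choice) so that the
    # pooling helper below can be loaded without TensorFlow installed.
    # Assumes TF 1.x-style tf.Session and standalone Keras.
    import tensorflow as tf
    from keras import backend as K
    from keras.layers import Dense
    from keras.models import Sequential

    # A fresh graph per call, so no graph state is shared between threads.
    with tf.Session(graph=tf.Graph()) as sess:
        K.set_session(sess)
        model = Sequential()
        model.add(Dense(size, input_shape=(X.shape[1],)))
        model.add(Dense(1))
        model.compile(optimizer='sgd', loss='mse')
        model.fit(X, y, verbose=0)
        # The session is closed when the with block exits, so return a
        # plain value (the loss) rather than the model itself.
        return model.evaluate(X, y, verbose=0)


def train_in_threads(fit_fn, sizes, max_workers=4):
    """Call fit_fn(size) for every size on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers) as executor:
        return list(executor.map(fit_fn, sizes))
```

With X and y defined as in the question, a call such as train_in_threads(lambda size: fit_one(size, X, y), range(5, 10)) would be expected to return one loss value per layer size. The key difference from the question's code is that the models are created inside the worker function rather than in the main thread before being handed to the pool.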