出力が同じ次元を持たない調整可能な重みで2つの損失を組み合わせたkeras

Question

私の質問はここで提起されたものと似ています： 2つの損失と調整可能な重みを組み合わせたケラス

ただし、出力の次元が異なるため、出力を連結できません。したがって、解決策は適用できませんが、この問題を解決する別の方法はありますか？

質問：

出力x1とx2の2つのレイヤーを持つkeras関数モデルがあります。

x1 = Dense(1,activation='relu')(prev_inp1) x2 = Dense(2,activation='relu')(prev_inp2)

これらのx1とx2を使用する必要があります。添付の画像のように、加重損失関数で使用します。「同じ損失」を両方のブランチに伝播します。 Alphaは、反復によって変化する柔軟性があります。

Daniel M&#246;ller · Answer

この質問には、より複雑な解決策が必要です。トレーニング可能なウェイトを使用するため、カスタムレイヤーが必要になります。

また、私たちの損失はy_trueとy_predだけを取り、2つの異なる出力を結合することを検討している他の人のようには機能しないため、別の形式のトレーニングが必要になります。

したがって、同じモデルの2つのバージョンを作成します。1つは予測用、もう1つはトレーニング用です。トレーニングバージョンには、コンパイル時にダミーのkeras損失関数を使用して、損失自体が含まれます。

予測モデル

2つの出力と1つの入力を持つモデルの非常に基本的な例を使用してみましょう。

#any input your true model takes inp = Input((5,5,2)) #represents the localization output outImg = Conv2D(1,3,activation='sigmoid')(inp) #represents the classification output outClass = Flatten()(inp) outClass = Dense(2,activation='sigmoid')(outClass) #the model predictionModel = Model(inp, [outImg,outClass])

これは定期的に予測に使用します。これをコンパイルする必要はありません。

各支店の損失

それでは、ブランチごとにカスタム損失関数を作成しましょう。1つはLossCls用、もう1つはLossLoc用です。

ここでダミーの例を使用すると、必要に応じてこれらの損失をより詳しく説明できます。最も重要なのは、それらが（batch、1）または（batch、）のような形のバッチを出力することです。どちらも同じ形状を出力するため、後で合計できます。

def calcImgLoss(x): true,pred = x loss = binary_crossentropy(true,pred) return K.mean(loss, axis=[1,2]) def calcClassLoss(x): true,pred = x return binary_crossentropy(true,pred)

これらは、トレーニングモデルのLambdaレイヤーで使用されます。

損失加重層

それでは、トレーニング可能なアルファで損失を重み付けしましょう。トレーニング可能なパラメーターを実装するには、カスタムレイヤーが必要です。

class LossWeighter(Layer): def __init__(self, **kwargs): #kwargs can have 'name' and other things super(LossWeighter, self).__init__(**kwargs) #create the trainable weight here, notice the constraint between 0 and 1 def build(self, inputShape): self.weight = self.add_weight(name='loss_weight', shape=(1,), initializer=Constant(0.5), constraint=Between(0,1), trainable=True) super(LossWeighter,self).build(inputShape) def call(self,inputs): firstLoss, secondLoss = inputs return (self.weight * firstLoss) + ((1-self.weight)*secondLoss) def compute_output_shape(self,inputShape): return inputShape[0]

この重みを0から1の間に保つためのカスタム制約があることに注意してください。この制約は次のように実装されます。

class Between(Constraint): def __init__(self,min_value,max_value): self.min_value = min_value self.max_value = max_value def __call__(self,w): return K.clip(w,self.min_value, self.max_value) def get_config(self): return {'min_value': self.min_value, 'max_value': self.max_value}

トレーニングモデル

このモデルは、予測モデルをベースとして、最後に損失計算と損失加重を追加し、損失値のみを出力します。損失のみを出力するため、入力として真のターゲットを使用し、次のように定義されたダミーの損失関数を使用します。

def ignoreLoss(true,pred): return pred #this just tries to minimize the prediction without any extra computation

モデル入力：

#true targets trueImg = Input((3,3,1)) trueClass = Input((2,)) #predictions from the prediction model predImg = predictionModel.outputs[0] predClass = predictionModel.outputs[1]

モデル出力=損失：

imageLoss = Lambda(calcImgLoss, name='loss_loc')([trueImg, predImg]) classLoss = Lambda(calcClassLoss, name='loss_cls')([trueClass, predClass]) weightedLoss = LossWeighter(name='weighted_loss')([imageLoss,classLoss])

モデル：

trainingModel = Model([predictionModel.input, trueImg, trueClass], weightedLoss) trainingModel.compile(optimizer='sgd', loss=ignoreLoss)

ダミートレーニング

inputImages = np.zeros((7,5,5,2)) outputImages = np.ones((7,3,3,1)) outputClasses = np.ones((7,2)) dummyOut = np.zeros((7,)) trainingModel.fit([inputImages,outputImages,outputClasses], dummyOut, epochs = 50) predictionModel.predict(inputImages)

必要な輸入

from keras.layers import * from keras.models import Model from keras.constraints import Constraint from keras.initializers import Constant from keras.losses import binary_crossentropy #or another you need

rvinas · Answer

出力を連結する必要はありません。損失関数に複数の引数を渡すには、次のようにラップします。

def custom_loss(x1, x2, y1, y2, alpha): def loss(y_true, y_pred): return (1-alpha) * loss_cls(y1, x1) + alpha * loss_loc(y2, x2) return loss

次に、機能モデルを次のようにコンパイルします。

x1 = Dense(1, activation='relu')(prev_inp1) x2 = Dense(2, activation='relu')(prev_inp2) y1 = Input((1,)) y2 = Input((2,)) model.compile('sgd', loss=custom_loss(x1, x2, y1, y2, 0.5), target_tensors=[y1, y2])

注：テストされていません。