How is categorical_crossentropy implemented in Keras?

Question

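I'm trying to apply the concept of distillation: basically, to train a new, smaller network that does the same job as the original one, but with less computation.

I have the softmax outputs for every sample instead of the logits.

My question is: how is the categorical cross-entropy loss function implemented? Does it take the maximum value of the original labels and multiply it by the corresponding predicted value at the same index, or does it sum over all the classes (one-hot encoding), as the formula says?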


Nassim Ben · Accepted Answer

I see that you used the tensorflow tag, so I assume that is the backend you are using?

    def categorical_crossentropy(output, target, from_logits=False):
        """Categorical crossentropy between an output tensor and a target tensor.

        # Arguments
            output: A tensor resulting from a softmax
                (unless `from_logits` is True, in which
                case `output` is expected to be the logits).
            target: A tensor of the same shape as `output`.
            from_logits: Boolean, whether `output` is the
                result of a softmax, or is a tensor of logits.

        # Returns
            Output tensor.
        """

This code comes from the Keras source code. Looking directly at the code should answer all your questions :) If you need more information, just ask.

EDIT:

Here is the part of the code that interests you:

        # Note: tf.nn.softmax_cross_entropy_with_logits
        # expects logits, Keras expects probabilities.
        if not from_logits:
            # scale preds so that the class probas of each sample sum to 1
            output /= tf.reduce_sum(output,
                                    reduction_indices=len(output.get_shape()) - 1,
                                    keep_dims=True)
            # manual computation of crossentropy
            epsilon = _to_tensor(_EPSILON, output.dtype.base_dtype)
            output = tf.clip_by_value(output, epsilon, 1. - epsilon)
            return - tf.reduce_sum(target * tf.log(output),
                                   reduction_indices=len(output.get_shape()) - 1)

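If you look at the return statement, it sums target * log(output) over the last (class) axis, so nothing takes a maximum of the labels... :)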
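
To make the summation concrete, here is a minimal NumPy sketch of the non-logits branch. It is an illustration rather than the Keras code itself: the function name `categorical_crossentropy_np`, the sample arrays, and the `EPSILON` value of 1e-7 are assumptions made for this example.

```python
import numpy as np

EPSILON = 1e-7  # assumed fuzz factor for this sketch


def categorical_crossentropy_np(output, target, epsilon=EPSILON):
    """Illustrative NumPy version of the non-logits branch (not Keras code)."""
    output = np.asarray(output, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # Rescale so that each sample's class probabilities sum to 1.
    output = output / output.sum(axis=-1, keepdims=True)
    # Clip away exact 0 and 1 so the log stays finite.
    output = np.clip(output, epsilon, 1.0 - epsilon)
    # Sum over the class axis: every class with a nonzero target contributes.
    return -np.sum(target * np.log(output), axis=-1)


# Hypothetical values: one sample, three classes.
student = np.array([[0.7, 0.2, 0.1]])   # predictions of the small network
one_hot = np.array([[1.0, 0.0, 0.0]])   # hard label
teacher = np.array([[0.6, 0.3, 0.1]])   # soft targets from the large network

loss_hard = categorical_crossentropy_np(student, one_hot)
loss_soft = categorical_crossentropy_np(student, teacher)
print(loss_hard, loss_soft)
```

With a one-hot target the sum collapses to minus the log of the predicted probability for the true class. With soft targets, as in distillation, every class contributes in proportion to its target probability.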

dat09 · Answer

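In answer to "Do you happen to know what the epsilon and _tf.clip_by_value_ are doing?":
they guarantee that _output != 0_, because tf.log(0) evaluates to -inf, which would make the loss infinite or NaN.
(I don't have enough points to comment, but I wanted to contribute.)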
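
The effect of the clipping can be reproduced with a small NumPy sketch. NumPy stands in for TensorFlow here, and the arrays and the `EPSILON` value of 1e-7 are assumptions chosen for the example, not values taken from the Keras source.

```python
import numpy as np

EPSILON = 1e-7  # assumed fuzz factor for this sketch

target = np.array([0.0, 1.0])
output = np.array([0.0, 1.0])  # a saturated softmax output containing an exact 0

# Without clipping: log(0) is -inf, and 0 * -inf is nan, so the loss is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    raw_loss = -np.sum(target * np.log(output))

# With clipping: every probability lies in [EPSILON, 1 - EPSILON],
# so the log is finite and the loss is a small, well-defined number.
clipped = np.clip(output, EPSILON, 1.0 - EPSILON)
safe_loss = -np.sum(target * np.log(clipped))

print(raw_loss, safe_loss)
```

The upper bound of 1 - epsilon keeps the clipped values strictly inside the open interval (0, 1), mirroring the lower bound.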