InvalidAlgumentError：2ルートエラーが見つかりました。 Tensorflowテキスト分類モデルにおける互換性のない形状

Question

私は次のものからコードを取り入れようとしています Repo これはこれに基づいています紙。それはたくさんのエラーを持っていましたが、私は主にそれを働いていました。しかし、私は同じ問題を踏み入れ続けています、そして私は本当にこの問題を解決する方法を理解していません。

ステートメントの基準が満たされている場合、検証が2回目になったときにエラーが発生します。初めては常に機能し、次に2番目の停止します。私はそれが役立つかどうかを壊す前に印刷する出力を含めています。以下のエラーを参照してください。

step = 1, train_loss = 1204.7784423828125, train_accuracy = 0.13725490868091583 counter = 1, dev_loss = 1188.6639287274584, dev_accuacy = 0.2814199453625912 step = 2, train_loss = 1000.983154296875, train_accuracy = 0.26249998807907104 --------------------------------------------------------------------------- InvalidArgumentError Traceback (most recent call last) /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py in _do_call(self, fn, *args) 1364 try: -> 1365 return fn(*args) 1366 except errors.OpError as e: 7 frames InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Incompatible shapes: [2,185] vs. [2,229] [[{{node loss/cond/add_1}}]] [[viterbi_decode/cond/rnn_1/while/Switch_3/_541]] (1) Invalid argument: Incompatible shapes: [2,185] vs. [2,229] [[{{node loss/cond/add_1}}]] 0 successful operations. 0 derived errors ignored. During handling of the above exception, another exception occurred: InvalidArgumentError Traceback (most recent call last) /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py in _do_call(self, fn, *args) 1382 '
session_config.graph_options.rewrite_options.' 1383 'disable_meta_optimizer = True') -> 1384 raise type(e)(node_def, op, message) 1385 1386 def _extend_graph(self): InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: Incompatible shapes: [2,185] vs. [2,229] [[node loss/cond/add_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]] [[viterbi_decode/cond/rnn_1/while/Switch_3/_541]] (1) Invalid argument: Incompatible shapes: [2,185] vs. [2,229] [[node loss/cond/add_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]] 0 successful operations. 0 derived errors ignored. Original stack trace for 'loss/cond/add_1': File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "/usr/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals) File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module> app.launch_new_instance() File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 664, in launch_instance app.start() File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start ioloop.IOLoop.instance().start() File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start handler_func(fd_obj, events) File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper return fn(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events self._handle_recv() File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv self._run_callback(callback, msg) File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback callback(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper return fn(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher return self.dispatch_Shell(stream, msg) File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_Shell handler(stream, idents, msg) File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request user_expressions, allow_stdin) File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute res = Shell.run_cell(code, store_history=store_history, silent=silent) File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell interactivity=interactivity, compiler=compiler, result=result) File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes if self.run_code(code, result): File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "<ipython-input-11-90859dc83f76>", line 66, in <module> main() File "<ipython-input-11-90859dc83f76>", line 12, in main model = DAModel() File "<ipython-input-9-682db36e2a23>", line 148, in __init__ self.logits, self.labels, self.dialogue_lengths) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 257, in crf_log_likelihood transition_params) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 116, in crf_sequence_score false_fn=_multi_seq_fn) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/layers/utils.py", line 202, in smart_cond pred, true_fn=true_fn, false_fn=false_fn, name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/smart_cond.py", line 59, in smart_cond name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func return func(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/control_flow_ops.py", line 1235, in cond orig_res_f, res_f = context_f.BuildCondBranch(false_fn) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/control_flow_ops.py", line 1061, in BuildCondBranch original_result = fn() File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 104, in _multi_seq_fn unary_scores = crf_unary_score(tag_indices, sequence_lengths, inputs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/crf/python/ops/crf.py", line 287, in crf_unary_score flattened_tag_indices = array_ops.reshape(offsets + tag_indices, [-1]) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_ops.py", line 899, in binary_op_wrapper return func(x, y, name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_ops.py", line 1197, in _add_dispatch return gen_math_ops.add_v2(x, y, name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_math_ops.py", line 549, in add_v2 "AddV2", x=x, y=y, name=name) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func return func(*args, **kwargs) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op attrs, op_def, compute_device) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal op_def=op_def) File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__ self._traceback = tf_stack.extract_stack() _

これがコードです（Repoとは少し異なります。

バージョン：Python 3）

tensorflow == 1.15.0

pandas == 0.25.3

numpy == 1.17.5

import glob import pandas as pd import tensorflow as tf import pandas as pd import numpy as np # preprocess data file_list = [] for f in glob.glob('swda/*'): file_list.append(f) df_list = [] for i in file_list: df = pd.read_csv(i) df_list.append(df) text_list = [] label_list = [] for df in df_list: df['utterance_no_specialchar_'] = df.utterance_no_specialchar.astype(str) text = df.utterance_no_specialchar_.tolist() labels = df.da_category.tolist() text_list.append(text) label_list.append(labels) ### new preprocessing step text_list = [[[j] for j in i] for i in text_list] tok_data = [y[0] for x in text_list for y in x] tokenizer = tf.keras.preprocessing.text.Tokenizer() tokenizer.fit_on_texts(tok_data) sequences = [] for x in text_list: tmp = [] for y in x: tmp.append(tokenizer.texts_to_sequences(y)[0]) sequences.append(tmp) def _pad_sequences(sequences, pad_tok, max_length): """ Args: sequences: a generator of list or Tuple pad_tok: the char to pad with Returns: a list of list where each sublist has same length """ sequence_padded, sequence_length = [], [] for seq in sequences: seq = list(seq) seq_ = seq[:max_length] + [pad_tok]*max(max_length - len(seq), 0) sequence_padded += [seq_] sequence_length += [min(len(seq), max_length)] return sequence_padded, sequence_length def pad_sequences(sequences, pad_tok, nlevels=1): """ Args: sequences: a generator of list or Tuple pad_tok: the char to pad with nlevels: "depth" of padding, for the case where we have characters ids Returns: a list of list where each sublist has same length """ if nlevels == 1: max_length = max(map(lambda x : len(x), sequences)) sequence_padded, sequence_length = _pad_sequences(sequences, pad_tok, max_length) Elif nlevels == 2: max_length_Word = max([max(map(lambda x: len(x), seq)) for seq in sequences]) sequence_padded, sequence_length = [], [] for seq in sequences: # all words are same length now sp, sl = _pad_sequences(seq, pad_tok, max_length_Word) sequence_padded += [sp] sequence_length += [sl] max_length_sentence = max(map(lambda x : len(x), sequences)) sequence_padded, _ = _pad_sequences(sequence_padded, [pad_tok]*max_length_Word, max_length_sentence) sequence_length, _ = _pad_sequences(sequence_length, 0, max_length_sentence) return sequence_padded, sequence_length def minibatches(data, labels, batch_size): data_size = len(data) start_index = 0 num_batches_per_Epoch = int((len(data) + batch_size - 1) / batch_size) for batch_num in range(num_batches_per_Epoch): start_index = batch_num * batch_size end_index = min((batch_num + 1) * batch_size, data_size) yield data[start_index: end_index], labels[start_index: end_index] def select(parameters, length): """Select the last valid time step output as the sentence embedding :params parameters: [batch, seq_len, hidden_dims] :params length: [batch] :Returns : [batch, hidden_dims] """ shape = tf.shape(parameters) idx = tf.range(shape[0]) idx = tf.stack([idx, length - 1], axis = 1) return tf.gather_nd(parameters, idx) class DAModel(): def __init__(self): with tf.variable_scope("placeholder"): self.dialogue_lengths = tf.placeholder(tf.int32, shape = [None], name = "dialogue_lengths") self.Word_ids = tf.placeholder(tf.int32, shape = [None,None,None], name = "Word_ids") self.utterance_lengths = tf.placeholder(tf.int32, shape = [None, None], name = "utterance_lengths") self.labels = tf.placeholder(tf.int32, shape = [None, None], name = "labels") self.clip = tf.placeholder(tf.float32, shape = [], name = 'clip') ######################## EMBEDDINGS ########################################### with tf.variable_scope("embeddings"): _Word_embeddings = tf.get_variable( name = "_Word_embeddings", dtype = tf.float32, shape = [words, Word_dim], initializer = tf.random_uniform_initializer() ) Word_embeddings = tf.nn.embedding_lookup(_Word_embeddings,self.Word_ids, name="Word_embeddings") self.Word_embeddings = tf.nn.dropout(Word_embeddings, 0.8) with tf.variable_scope("utterance_encoder"): s = tf.shape(self.Word_embeddings) batch_size = s[0] * s[1] time_step = s[-2] Word_embeddings = tf.reshape(self.Word_embeddings, [batch_size, time_step, Word_dim]) length = tf.reshape(self.utterance_lengths, [batch_size]) fw = tf.nn.rnn_cell.LSTMCell(hidden_size_lstm_1, forget_bias=0.8, state_is_Tuple= True) bw = tf.nn.rnn_cell.LSTMCell(hidden_size_lstm_1, forget_bias=0.8, state_is_Tuple= True) output, _ = tf.nn.bidirectional_dynamic_rnn(fw, bw, Word_embeddings,sequence_length=length, dtype = tf.float32) output = tf.concat(output, axis = -1) # [batch_size, time_step, dim] # Select the last valid time step output as the utterance embedding, # this method is more concise than TensorArray with while_loop # output = select(output, self.utterance_lengths) # [batch_size, dim] output = select(output, length) # [batch_size, dim] # output = tf.reshape(output, s[0], s[1], 2 * hidden_size_lstm_1) output = tf.reshape(output, [s[0], s[1], 2 * hidden_size_lstm_1]) output = tf.nn.dropout(output, 0.8) with tf.variable_scope("bi-lstm"): cell_fw = tf.contrib.rnn.BasicLSTMCell(hidden_size_lstm_2, state_is_Tuple = True) cell_bw = tf.contrib.rnn.BasicLSTMCell(hidden_size_lstm_2, state_is_Tuple = True) (output_fw, output_bw), _ = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, output, sequence_length = self.dialogue_lengths, dtype = tf.float32) outputs = tf.concat([output_fw, output_bw], axis = -1) outputs = tf.nn.dropout(outputs, 0.8) with tf.variable_scope("proj1"): output = tf.reshape(outputs, [-1, 2 * hidden_size_lstm_2]) W = tf.get_variable("W", dtype = tf.float32, shape = [2 * hidden_size_lstm_2, proj1], initializer= tf.contrib.layers.xavier_initializer()) b = tf.get_variable("b", dtype = tf.float32, shape = [proj1], initializer=tf.zeros_initializer()) output = tf.nn.relu(tf.matmul(output, W) + b) with tf.variable_scope("proj2"): W = tf.get_variable("W", dtype = tf.float32, shape = [proj1, proj2], initializer= tf.contrib.layers.xavier_initializer()) b = tf.get_variable("b", dtype = tf.float32, shape = [proj2], initializer=tf.zeros_initializer()) output = tf.nn.relu(tf.matmul(output, W) + b) with tf.variable_scope("logits"): nstep = tf.shape(outputs)[1] W = tf.get_variable("W", dtype = tf.float32,shape=[proj2, tags], initializer = tf.random_uniform_initializer()) b = tf.get_variable("b", dtype = tf.float32,shape = [tags],initializer=tf.zeros_initializer()) pred = tf.matmul(output, W) + b self.logits = tf.reshape(pred, [-1, nstep, tags]) with tf.variable_scope("loss"): log_likelihood, self.trans_params = tf.contrib.crf.crf_log_likelihood( self.logits, self.labels, self.dialogue_lengths) self.loss = tf.reduce_mean(-log_likelihood) + tf.nn.l2_loss(W) + tf.nn.l2_loss(b) #tf.summary.scalar("loss", self.loss) with tf.variable_scope("viterbi_decode"): viterbi_sequence, _ = tf.contrib.crf.crf_decode(self.logits, self.trans_params, self.dialogue_lengths) batch_size = tf.shape(self.dialogue_lengths)[0] output_ta = tf.TensorArray(dtype = tf.float32, size = 1, dynamic_size = True) def body(time, output_ta_1): length = self.dialogue_lengths[time] vcode = viterbi_sequence[time][:length] true_labs = self.labels[time][:length] accurate = tf.reduce_sum(tf.cast(tf.equal(vcode, true_labs), tf.float32)) output_ta_1 = output_ta_1.write(time, accurate) return time + 1, output_ta_1 def condition(time, output_ta_1): return time < batch_size i = 0 [time, output_ta] = tf.while_loop(condition, body, loop_vars = [i, output_ta]) output_ta = output_ta.stack() accuracy = tf.reduce_sum(output_ta) self.accuracy = accuracy / tf.reduce_sum(tf.cast(self.dialogue_lengths, tf.float32)) #tf.summary.scalar("accuracy", self.accuracy) with tf.variable_scope("train_op"): optimizer = tf.train.AdagradOptimizer(0.1) #if tf.greater(self.clip , 0): grads, vs = Zip(*optimizer.compute_gradients(self.loss)) grads, gnorm = tf.clip_by_global_norm(grads, self.clip) self.train_op = optimizer.apply_gradients(Zip(grads, vs)) #else: # self.train_op = optimizer.minimize(self.loss) #self.merged = tf.summary.merge_all() ### Set model variables hidden_size_lstm_1 = 200 hidden_size_lstm_2 = 200 tags = 39 # assuming number of classes to predict? Word_dim = 300 proj1 = 200 proj2 = 100 words = 20001 # words = 8759 + 1 # max(num_unique_Word_tokens) batchSize = 2 log_dir = "train" model_dir = "DAModel" model_name = "ckpt" ### Run model def main(): # tokenize and vectorize text data to prepare for embedding train_data = sequences[:75] train_labels = label_list[:75] dev_data = sequences[75:] dev_labels = label_list[75:] config = tf.ConfigProto() config.gpu_options.per_process_gpu_memory_fraction = 0.4 with tf.Session(config = config) as sess: model = DAModel() sess.run(tf.global_variables_initializer()) clip = 2 saver = tf.train.Saver() #writer = tf.summary.FileWriter("D:\Experimemts\tensorflow\DA\train", sess.graph) writer = tf.summary.FileWriter("train", sess.graph) counter = 0 for Epoch in range(10): for dialogues, labels in minibatches(train_data, train_labels, batchSize): _, dialogue_lengthss = pad_sequences(dialogues, 0) Word_idss, utterance_lengthss = pad_sequences(dialogues, 0, nlevels = 2) true_labs = labels labs_t, _ = pad_sequences(true_labs, 0) counter += 1 train_loss, train_accuracy, _ = sess.run([model.loss, model.accuracy,model.train_op], feed_dict = {model.Word_ids: Word_idss, model.utterance_lengths: utterance_lengthss, model.dialogue_lengths: dialogue_lengthss, model.labels:labs_t, model.clip :clip} ) #writer.add_summary(summary, global_step = counter) print("step = {}, train_loss = {}, train_accuracy = {}".format(counter, train_loss, train_accuracy)) train_precision_summ = tf.Summary() train_precision_summ.value.add( tag='train_accuracy', simple_value=train_accuracy) writer.add_summary(train_precision_summ, counter) train_loss_summ = tf.Summary() train_loss_summ.value.add( tag='train_loss', simple_value=train_loss) writer.add_summary(train_loss_summ, counter) if counter % 1 == 0: loss_dev = [] acc_dev = [] for dev_dialogues, dev_labels in minibatches(dev_data, dev_labels, batchSize): _, dialogue_lengthss = pad_sequences(dev_dialogues, 0) Word_idss, utterance_lengthss = pad_sequences(dev_dialogues, 0, nlevels = 2) true_labs = dev_labels labs_t, _ = pad_sequences(true_labs, 0) dev_loss, dev_accuacy = sess.run([model.loss, model.accuracy], feed_dict = {model.Word_ids: Word_idss, model.utterance_lengths: utterance_lengthss, model.dialogue_lengths: dialogue_lengthss, model.labels:labs_t}) loss_dev.append(dev_loss) acc_dev.append(dev_accuacy) valid_loss = sum(loss_dev) / len(loss_dev) valid_accuracy = sum(acc_dev) / len(acc_dev) dev_precision_summ = tf.Summary() dev_precision_summ.value.add( tag='dev_accuracy', simple_value=valid_accuracy) writer.add_summary(dev_precision_summ, counter) dev_loss_summ = tf.Summary() dev_loss_summ.value.add( tag='dev_loss', simple_value=valid_loss) writer.add_summary(dev_loss_summ, counter) print("counter = {}, dev_loss = {}, dev_accuacy = {}".format(counter, valid_loss, valid_accuracy)) if __name__ == "__main__": tf.reset_default_graph() main() _

データはここでから来て、次のようになります。

[[['what '], ['do you want to start '], ['f uh laughter you hit you hit f uh '], ['it doesnt matter '], ['f um were discussing the capital punishment i believe '], ['right '], ['you are right '], ['yeah '], [' i i suppose i should have '], ['f uh which '], ['i am am pro capital punishment except that i dont like the way its done '], ['uhhuh '], ['f uh yeah '], ['f uh i f uh i guess i i hate to see anyone die f uh '] ... ]] _

モデルを訓練するデータセットはここで見つけることができます。 https：//github.com/cmeaton/hierarchical_bilstm-crf_encoder/tree/master/swda_parsed

私はこのエラーが何らかのエラーでさえ、それを理解する方法を理解しています。あらゆるアドバイスは大切になります。ありがとう。

EliadL · Answer

エラーに集中しましょう。

Invalid argument: Incompatible shapes: [2,185] vs. [2,229]

その形状は互換性がないため、2つのテンソル間の操作が失敗するという問題があるようです。

選択したtensorflowバージョンが、作成者によって使用されているものよりも許可が少ない可能性があります。

この問題、著者は彼がtensorflow==1.8を使用したと推測します。

だから私はあなたがこの以前のバージョンを使用しようとすることをお勧めします（1.7,1.9,1.10など）。

また、以前のバージョンでは、kerasパッケージが今日のように統合されていない可能性があるため、特定のkerasバージョンも使用することができます。

たとえば, この問題、keras==2.2.2にダウングレードするのに役立つものは何ですか。

それが役に立たないならば、たぶん、これらのうちの1つは: 1 2 、、 4 、 - 5 、 6