Stanford CoreNLP Java output

Question

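I'm a newbie with Java and the Stanford NLP toolkit, and I'm trying to use them for a project. Specifically, I'm trying to annotate a text with the Stanford CoreNLP toolkit (using NetBeans, not the command line), and I have tried the code provided at http://nlp.stanford.edu/software/corenlp.shtml#Usage (Using the Stanford CoreNLP API). The question is: can anybody tell me how to get the output into a file so that I can process it further?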

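I have tried printing the graphs and the sentences to the console, just to see the content. That works. What I basically need is to return the annotated document, so that I can call it from my main class and write out a text file (if that is possible). I've been looking through the Stanford CoreNLP API, but given my lack of experience I don't know the best way to return this kind of information.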

Here is the code:

    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // read some text in the text variable
    String text = "the quick fox jumps over the lazy dog";

    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);

    // run all Annotators on this text
    pipeline.annotate(document);

    // these are all the sentences in this document
    // a CoreMap is essentially a Map that uses class objects as keys and has values with custom types
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);

    for (CoreMap sentence : sentences) {
        // traversing the words in the current sentence
        // a CoreLabel is a CoreMap with additional token-specific methods
        for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
            // this is the text of the token
            String word = token.get(TextAnnotation.class);
            // this is the POS tag of the token
            String pos = token.get(PartOfSpeechAnnotation.class);
            // this is the NER label of the token
            String ne = token.get(NamedEntityTagAnnotation.class);
        }

        // this is the parse tree of the current sentence
        Tree tree = sentence.get(TreeAnnotation.class);

        // this is the Stanford dependency graph of the current sentence
        SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
    }

    // This is the coreference link graph
    // Each chain stores a set of mentions that link to each other,
    // along with a method for getting the most representative mention
    // Both sentence and token offsets start at 1!
    Map<Integer, CorefChain> graph = document.get(CorefChainAnnotation.class);

Christopher Manning · Accepted Answer

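Once you have any or all of the natural language analyses shown in your code example, all you need to do is send them to a file in the normal Java fashion, for example with a FileWriter for text-format output. Concretely, here is a simple, complete example that sends its output to files (if you give it the appropriate command-line arguments):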

    import java.io.*;
    import java.util.*;

    import edu.stanford.nlp.io.*;
    import edu.stanford.nlp.ling.*;
    import edu.stanford.nlp.pipeline.*;
    import edu.stanford.nlp.trees.*;
    import edu.stanford.nlp.util.*;

    public class StanfordCoreNlpDemo {

        public static void main(String[] args) throws IOException {
            // args[0]: input text file, args[1]: text output file, args[2]: XML output file (all optional)
            PrintWriter out;
            if (args.length > 1) {
                out = new PrintWriter(args[1]);
            } else {
                out = new PrintWriter(System.out);
            }
            PrintWriter xmlOut = null;
            if (args.length > 2) {
                xmlOut = new PrintWriter(args[2]);
            }

            StanfordCoreNLP pipeline = new StanfordCoreNLP();
            Annotation annotation;
            if (args.length > 0) {
                annotation = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
            } else {
                annotation = new Annotation("Kosgi Santosh sent an email to Stanford University. He didn't get a reply.");
            }

            pipeline.annotate(annotation);
            pipeline.prettyPrint(annotation, out);
            if (xmlOut != null) {
                pipeline.xmlPrint(annotation, xmlOut);
            }

            // An Annotation is a Map and you can get and use the various analyses individually.
            // For instance, this gets the parse tree of the first sentence in the text.
            List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
            if (sentences != null && sentences.size() > 0) {
                CoreMap sentence = sentences.get(0);
                Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
                out.println();
                out.println("The first sentence parsed is:");
                tree.pennPrint(out);
            }

            // Close the writers so that buffered output actually reaches the files or the console.
            out.close();
            if (xmlOut != null) {
                xmlOut.close();
            }
        }
    }
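If you would rather write your own format than use prettyPrint, the "normal Java fashion" mentioned above could look roughly like the sketch below. It uses only the standard library so that it runs on its own. TokenTableWriter and TokenRow are hypothetical names, not part of CoreNLP: TokenRow just stands in for the word, pos and ne strings read from each CoreLabel in the question's loop, and the sample rows in main are made-up values, not real pipeline output.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; it is not part of CoreNLP.
final class TokenTableWriter {

    // Stand-in for the three strings the question reads from each CoreLabel.
    static final class TokenRow {
        final String word;
        final String pos;
        final String ne;

        TokenRow(String word, String pos, String ne) {
            this.word = word;
            this.pos = pos;
            this.ne = ne;
        }
    }

    // Writes one tab-separated line per token: word, POS tag, NER label.
    static void write(List<TokenRow> rows, String path) throws IOException {
        // try-with-resources closes the writer, which also flushes it to disk.
        try (PrintWriter out = new PrintWriter(new FileWriter(path))) {
            for (TokenRow row : rows) {
                out.println(row.word + "\t" + row.pos + "\t" + row.ne);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample rows; in real use, add one row per token inside the token loop.
        List<TokenRow> rows = new ArrayList<>(Arrays.asList(
                new TokenRow("Stanford", "NNP", "ORGANIZATION"),
                new TokenRow("University", "NNP", "ORGANIZATION")));
        // Output path: first argument if given, otherwise a file in the working directory.
        String path = args.length > 0 ? args[0] : "tokens.tsv";
        write(rows, path);
    }
}
```

A method that runs the pipeline could build such a list and return it, which lets your main class decide where the file goes. One line per token with tab separators is only one possible layout; it is easy to read back in later with a line-by-line reader and a split on the tab character.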