"Merged" dependency and constituency trees

Question

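I am working on a research project in which I use CoreNLP to parse sentences with several of the available annotators (mainly constituency parsing and sentiment).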

I am now trying to build a "merged" tree that carries both constituency and dependency information, from which I will extract a grammar (think PCFG).

I am trying to arrive at something similar to the tree on the left in this picture:

(Image from Relational-Realizational Parsing (Tsarfaty and Sima’an, 2008))

Is there some "easy" way to get something similar from the provided parser outputs (in code)?

Alternatively, do you know of any implementation based on the Stanford NLP libraries?

Would GrammaticalStructure help here? Does it make sense to build a GS for each node and read its typedDependencies() at each constituency node?

Answer 1

Here is a sketch of some code for working with both the constituency parse and the dependency parse.

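import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.semgraph.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.util.CoreMap;

// The "parse" annotator produces both the constituency tree and the dependencies.
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
String sampleDoc = "This is the first sentence.  This is the second one.";
Annotation annotation = new Annotation(sampleDoc);
pipeline.annotate(annotation);
List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
for (CoreMap sentence : sentences) {
  // Constituency parse of this sentence.
  Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
  // Basic (uncollapsed) dependency parse of the same sentence.
  SemanticGraph deps = sentence.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class);
}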

For your project, you will want to study these two classes:

http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/Tree.html

http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/SemanticGraph.html

The Tree class represents the constituency parse, and the SemanticGraph class represents the dependency parse.
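To make the merging step itself more concrete, here is a minimal, library-free sketch of one way to combine the two parses once you have them. It is only an illustration under assumptions, not the representation from the paper and not anything provided by CoreNLP: the `Node` class, the `merge` method, the `CAT@relation` label format, and the `hd` marker for head children are all hypothetical names of mine. It assumes you already know the head token of every constituent (for example from a head finder) and the dependency relation of every token (for example read off the dependency parse). A constituent whose head token differs from its parent's head token is labeled with the relation of its head token; a constituent that shares its parent's head is marked `hd`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: relabel constituents with the dependency relation of their head token. */
final class MergedTreeSketch {

  /** Minimal stand-in for a constituency node (not the CoreNLP Tree class). */
  static final class Node {
    final String label;        // phrase category or POS tag
    final String word;         // non-null only for preterminals
    final int headToken;       // 1-based index of the lexical head token
    final List<Node> children;

    private Node(String label, String word, int headToken, List<Node> children) {
      this.label = label;
      this.word = word;
      this.headToken = headToken;
      this.children = children;
    }

    /** A POS tag node that directly covers one word. */
    static Node leaf(String tag, String word, int index) {
      return new Node(tag, word, index, new ArrayList<Node>());
    }

    /** A phrase whose head is the child at position headChild (assumed to be given). */
    static Node phrase(String label, int headChild, Node... children) {
      return new Node(label, null, children[headChild].headToken, Arrays.asList(children));
    }
  }

  /**
   * Returns a bracketed tree with labels of the form CAT@relation.
   * relationOf maps a token index to the relation that token bears to its governor.
   * Pass parentHead = 0 for the root, since token indices start at 1.
   */
  static String merge(Node node, int parentHead, Map<Integer, String> relationOf) {
    String function = node.headToken == parentHead
        ? "hd"                                              // same head as parent: head child
        : relationOf.getOrDefault(node.headToken, "dep");   // fallback label is an assumption
    StringBuilder sb = new StringBuilder("(").append(node.label).append('@').append(function);
    if (node.word != null) {
      sb.append(' ').append(node.word);
    }
    for (Node child : node.children) {
      sb.append(' ').append(merge(child, node.headToken, relationOf));
    }
    return sb.append(')').toString();
  }

  /** Builds "The dog barks" by hand and merges it with its dependency relations. */
  static String example() {
    Node np = Node.phrase("NP", 1, Node.leaf("DT", "The", 1), Node.leaf("NN", "dog", 2));
    Node vp = Node.phrase("VP", 0, Node.leaf("VBZ", "barks", 3));
    Node s = Node.phrase("S", 1, np, vp);
    Map<Integer, String> relationOf = new HashMap<Integer, String>();
    relationOf.put(1, "det");    // det(dog-2, The-1)
    relationOf.put(2, "nsubj");  // nsubj(barks-3, dog-2)
    relationOf.put(3, "root");   // root(ROOT-0, barks-3)
    return merge(s, 0, relationOf);
  }

  public static void main(String[] args) {
    // prints (S@root (NP@nsubj (DT@det The) (NN@hd dog)) (VP@hd (VBZ@hd barks)))
    System.out.println(example());
  }
}
```

With real parser output you would fill in the head tokens and the relation map from the Tree and the SemanticGraph instead of building them by hand. Grammar rules such as `S@root -> NP@nsubj VP@hd` can then be read off the relabeled tree in the usual way.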