Map CorefChain to CoreEntityMention in Stanford CoreNLP

I would like to build a lookup map of type Map<CorefChain, CoreEntityMention> using

Map<Integer, Integer> mapping = document.annotation().get(CoreAnnotations.CorefMentionToEntityMentionMappingAnnotation.class);

where document is a CoreDocument.

I tried to get the set of CorefChains and the list of CoreEntityMentions to build such a map, but the indices do not seem to match.

Map<Integer, CorefChain> chains = document.corefChains();
List<CoreEntityMention> entities = document.entityMentions();   

Example:

Text:

 "ʿAmrān is a small city in western central Yemen. It is the capital of the 'Amran 
  Governorate, and was formerly in the Sana'a Governorate. It is located 52.9 
  kilometres by road northwest of the Yemeni capital of Sana'a. According to the 
  2004 census it had a population of 76,863, and an estimated population of 
  90,792 in 2012."       

Chains:

{
    1=CHAIN1-["a small city in western central Yemen" in sentence 1, "It" in sentence 2, "It" in sentence 3], 
    2=CHAIN2-["western central Yemen" in sentence 1], 
    4=CHAIN4-["the capital of the ` Amran Governorate" in sentence 2], 
    5=CHAIN5-["the ` Amran Governorate" in sentence 2], 
    6=CHAIN6-["the Sana'a Governorate" in sentence 2, "Sana'a" in sentence 3], 
    7=CHAIN7-["52.9" in sentence 3], 
    10=CHAIN10-["52.9 kilometres" in sentence 3], 
    11=CHAIN11-["road northwest of the Yemeni capital of Sana'a" in sentence 3], 
    12=CHAIN12-["the Yemeni capital of Sana'a" in sentence 3], 
    13=CHAIN13-["76,863" in sentence 4], 
    14=CHAIN14-["90,792" in sentence 4], 
    15=CHAIN15-["the 2004 census" in sentence 4, "it" in sentence 4], 
    17=CHAIN17-["a population of 76,863 , and an estimated population of 90,792 in 2012" in sentence 4], 
    18=CHAIN18-["a population of 76,863" in sentence 4], 
    19=CHAIN19-["an estimated population of 90,792 in 2012" in sentence 4], 
    20=CHAIN20-["2012" in sentence 4]
}

Entities:

[Yemen, Sana'a Governorate, 52.9, Yemeni, Sana'a, 2004, 76,863, 90,792, 2012]

Mapping:

{16=8, 7=2, 8=4, 13=5, 14=6, 15=7}

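To clarify, there are two types of mentions:

coref mentions
entity mentions

Every entity mention should be a coref mention, but not every coref mention is an entity mention.

As you can see, there is a map from coref mentions to entity mentions.

You should be able to get the chain ID from the corefClusterID property of a coref mention. So you have a mapping from coref mention to entity mention, and you can convert a coref mention to a chain ID by reading that coref mention's corefClusterID.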
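
A minimal sketch of that join, using only the Java standard library so it runs without CoreNLP on the classpath. CorefMentionStub, CorefEntityLookup, and chainToEntityIndices are hypothetical names introduced for illustration, and the data in main is made up. The sketch assumes that the keys of the mapping are positions in the document-level list of coref mentions and that the values are positions in document.entityMentions(); it is worth checking this against your CoreNLP version before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CorefEntityLookup {

    // Hypothetical stand-in for a CoreNLP coref mention.
    // Only the cluster (chain) ID is needed for the join.
    static final class CorefMentionStub {
        final int corefClusterID;
        final String text;

        CorefMentionStub(int corefClusterID, String text) {
            this.corefClusterID = corefClusterID;
            this.text = text;
        }
    }

    // Joins the (coref mention index -> entity mention index) mapping with each
    // coref mention's cluster ID, giving chain ID -> entity mention indices.
    static Map<Integer, List<Integer>> chainToEntityIndices(
            Map<Integer, Integer> corefToEntity, List<CorefMentionStub> corefMentions) {
        Map<Integer, List<Integer>> result = new TreeMap<>();
        // Iterate in key order so the output is deterministic.
        for (Map.Entry<Integer, Integer> e : new TreeMap<>(corefToEntity).entrySet()) {
            int corefIndex = e.getKey();
            if (corefIndex < 0 || corefIndex >= corefMentions.size()) {
                continue; // assumption violated for this entry, skip it
            }
            int chainId = corefMentions.get(corefIndex).corefClusterID;
            result.computeIfAbsent(chainId, k -> new ArrayList<>()).add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up coref mentions: list position is the coref mention index.
        List<CorefMentionStub> mentions = Arrays.asList(
                new CorefMentionStub(1, "a small city"),
                new CorefMentionStub(6, "the Sana'a Governorate"),
                new CorefMentionStub(1, "It"),
                new CorefMentionStub(6, "Sana'a"));

        // Made-up mapping: coref mention 1 -> entity 0, coref mention 3 -> entity 1.
        Map<Integer, Integer> corefToEntity = new HashMap<>();
        corefToEntity.put(1, 0);
        corefToEntity.put(3, 1);

        System.out.println(chainToEntityIndices(corefToEntity, mentions)); // prints {6=[0, 1]}
    }
}
```

With real data, the chain ID keys can be looked up in document.corefChains() and the entity indices in document.entityMentions() to obtain the CorefChain and CoreEntityMention objects. A chain may map to several entity mentions, which is why the sketch collects a list per chain rather than a single value.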