Extracting displacy (spaCy) output dependency connections

Question

I am using spaCy's displacy visualizer to view the dependencies between the words in a sentence. It looks like this:

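import spacy
from spacy import displacy
nlp = spacy.load('en_core_web_sm')  # assumed model; the question does not say which one was loaded
text = 'European authorities fined Google a record $5.1 billion on Wednesday for abusing its power in the mobile phone market and ordered the company to alter its practices'
displacy.render(nlp(text), style='ent', jupyter=True)  # with jupyter=True, render displays inline and returns None
displacy.render(nlp(text), style='dep', jupyter=True, options={'distance': 120})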

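Is there a way to extract the connections established by the arrows by indexing the words in the string? For example, consider the connections in 'European Authorities fined Google'. Is there any way to build the following dataframe (each word in the word column, and every word it connects to in the connection column)?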

word       |   connection
---------------------------
European   |   
Authorities| European
fined      | Authorities, Google, record, ..., ...
Google     |

Answer 1

spaCy provides many token attributes you can use for this purpose, such as ancestors or children. Note that these attributes return generators, so you need to convert them to lists and then to strings.
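As a minimal sketch of the ancestors attribute mentioned above, the snippet below builds a tiny Doc by hand instead of loading a model. The words, heads, and deps values are hand-written assumptions for illustration, not parser output, and passing heads and deps to the Doc constructor assumes spaCy v3.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Hand-annotated toy parse (an assumption for illustration, not model output).
words = ["European", "authorities", "fined", "Google"]
heads = [1, 2, 2, 2]  # absolute index of each token's head; the root points to itself
deps = ["amod", "nsubj", "ROOT", "dobj"]
doc = Doc(Vocab(), words=words, heads=heads, deps=deps)

# ancestors is a lazy iterable, so materialize it into a list before joining.
# It walks up the tree from the token toward the root, nearest head first.
ancestor_texts = [t.text for t in doc[0].ancestors]

print(", ".join(ancestor_texts))  # authorities, fined
```

Where children lists the tokens directly below a word, ancestors lists everything above it, so the two together cover both directions of the arrows.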

Here is my example using the children attribute:

import pandas as pd
import spacy

nlp = spacy.load('en_core_web_sm')  # assumed model; any pipeline with a dependency parser should work
text = 'European authorities fined Google a record $5.1 billion on Wednesday for abusing its power in the mobile phone market and ordered the company to alter its practices'
doc = nlp(text)
words = []
a_network = []
for w in doc:
  words.append(w.text)
  # children yields the tokens that depend directly on w
  network = [t.text for t in w.children]
  a_network.append(", ".join(network))

df = pd.DataFrame({"word":words,"network":a_network})

print(df)

The output will be:

           word                               network
0      European                                      
1   authorities                              European
2         fined  authorities, Google, record, on, for
3        Google                                      
4             a                                      
5        record                            a, billion
6             $                                      
7           5.1                                      
8       billion                                $, 5.1
...
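If one row per arrow is more convenient than one row per word, a possible variant reads each token's head and dep_ attributes instead. The sketch below again uses a hand-built Doc so it runs without a model; the words, heads, and deps values and the column names are assumptions for illustration, and the constructor call assumes spaCy v3.

```python
import pandas as pd
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Hand-annotated toy parse (an assumption for illustration, not model output).
words = ["European", "authorities", "fined", "Google"]
heads = [1, 2, 2, 2]  # absolute index of each token's head; the root points to itself
deps = ["amod", "nsubj", "ROOT", "dobj"]
doc = Doc(Vocab(), words=words, heads=heads, deps=deps)

# One row per dependency arc: the word, the head it attaches to, and the arc label.
rows = [
    {"word": t.text, "head": t.head.text, "dep": t.dep_}
    for t in doc
    if t.head.i != t.i  # the root is its own head, so it has no incoming arc
]
edges = pd.DataFrame(rows)

print(edges)
```

With a real pipeline, replacing the hand-built Doc with doc = nlp(text) should give the same kind of table for the full sentence, with labels that depend on the model used.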

python

nlp

pandas

spacy