What is the meaning of heads in spaCy training data?

Question

I am trying to train a model on my own data using the spaCy library.

But I am confused by the "# index of token head" comment in the code example.

What exactly does heads mean here?

# training data: texts, heads and dependency labels
# for no relation, we simply chose an arbitrary dependency label, e.g. '-'
TRAIN_DATA = [
    (
        "find a cafe with great wifi",
        {
            "heads": [0, 2, 0, 5, 5, 2],  # index of token head
            "deps": ["ROOT", "-", "PLACE", "-", "QUALITY", "ATTRIBUTE"],
        },
    )
]

Answer 1

In your example, the task is to reconstruct a tree of syntactic dependencies. This tree shows, for each word, the "head" word to which it is attached and the type of attachment. One format commonly used to describe such trees is CoNLL-U.

For instance, "great" (token 4, counting from 0) is attached to "wifi" (token 5): "great" is a quality of "wifi". Therefore the entry at index 4 of heads equals 5, and the entry at index 4 of deps equals "QUALITY".
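To make the mapping concrete, here is a minimal sketch that pairs each token with its head using only the data from the question. It is plain Python and does not call spaCy. Splitting on whitespace is an assumption that happens to match the tokenization of this sentence, and the helper name describe is made up for illustration.

```python
# Data copied from the question.
text = "find a cafe with great wifi"
heads = [0, 2, 0, 5, 5, 2]  # heads[i] is the index of the token that token i attaches to
deps = ["ROOT", "-", "PLACE", "-", "QUALITY", "ATTRIBUTE"]

# Assumption: whitespace splitting stands in for spaCy's tokenizer here.
tokens = text.split()


def describe(tokens, heads, deps):
    """Return (index, token, dep, head_index, head_token) for every token."""
    rows = []
    for i, (token, head, dep) in enumerate(zip(tokens, heads, deps)):
        rows.append((i, token, dep, head, tokens[head]))
    return rows


rows = describe(tokens, heads, deps)
for i, token, dep, head, head_token in rows:
    print(f"{i} {token:<5} --{dep}--> {head} {head_token}")
```

Reading the rows: "great" (4) points to "wifi" (5) with label QUALITY, "wifi" (5) points to "cafe" (2) with label ATTRIBUTE, and "find" (0) points to itself, which is how the ROOT token is marked in this data.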
