Pointwise Operation with a List of Strings

Question

I want to perform a pointwise operation on a list of strings. The list contains words (nouns) and might look like this:

lst_words = ['car', 'vehicle', 'boat', 'ship']

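Now I want to apply the operation to every pair of words in this list and get a matrix containing the results. The size of the matrix depends on the size of the input list (4x4 values in this example). The operation is based on a function that compares the similarity of two words and returns a float.

The function looks like this: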

import nltk 
from nltk.corpus import wordnet 
# Compare words:
def get_synset(word_01, word_02):
    w1 = wordnet.synset(word_01 + '.n.01')
    w2 = wordnet.synset(word_02 + '.n.01') 
    return w1.wup_similarity(w2)

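So far I have not been able to find a solution on Google, but maybe someone can help me with this, since I do not know what the thing I am looking for is called.

Thanks for your help.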

Answer 1

I may not have understood the question correctly, but why not just use a nested list comprehension:

import numpy as np
np.array([[get_synset(x, y) for x in lst_words] for y in lst_words])
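
If the similarity score is symmetric (an assumption worth checking for your function), each unordered pair only needs to be evaluated once. Below is a minimal sketch of that idea. So that it runs without the WordNet corpus, it uses a hypothetical stand-in, char_jaccard, in place of get_synset; pairwise_matrix is likewise just an illustrative helper name, not part of any library.

```python
from itertools import combinations_with_replacement


def char_jaccard(word_01, word_02):
    """Hypothetical stand-in for get_synset: Jaccard overlap of the letters."""
    a, b = set(word_01), set(word_02)
    return len(a & b) / len(a | b)


def pairwise_matrix(words, func):
    """Return a len(words) x len(words) nested list of func(words[i], words[j]).

    Assumes func is symmetric, so each unordered pair is evaluated only once.
    """
    n = len(words)
    matrix = [[0.0] * n for _ in range(n)]
    # Visit every pair (i, j) with i <= j and mirror the value across the diagonal.
    for i, j in combinations_with_replacement(range(n), 2):
        matrix[i][j] = matrix[j][i] = func(words[i], words[j])
    return matrix


lst_words = ['car', 'vehicle', 'boat', 'ship']
matrix = pairwise_matrix(lst_words, char_jaccard)
```

Passing get_synset instead of char_jaccard should give the WordNet-based matrix, and np.array(matrix) converts the nested list to a NumPy array if you need one.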

Answer 2

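You can use numpy.fromfunction. The main change you need to make is to have your function take the indices of the words instead of the words themselves. Note that numpy.fromfunction calls the function once with whole index arrays rather than once per pair of indices, so the wrapper also has to be passed through numpy.vectorize, with dtype=int so that the indices are integers:

import numpy

# Placeholder list; replace with your own nouns.
WORDS = ["your", "list", "of", "words"]

def get_synset_by_index(i1, i2):
    return get_synset(WORDS[i1], WORDS[i2])

# numpy.vectorize applies the wrapper to one (i1, i2) pair at a time.
matrix = numpy.fromfunction(numpy.vectorize(get_synset_by_index),
                            (len(WORDS), len(WORDS)), dtype=int)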
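
For anyone who wants to try the index-based approach without downloading the WordNet corpus, here is a small self-contained sketch. It assumes a hypothetical stand-in, shared_letters, in place of get_synset, and it wraps the index function in np.vectorize because np.fromfunction hands the function full index arrays rather than single indices. The last lines show an equivalent call that uses broadcasting instead of np.fromfunction.

```python
import numpy as np

WORDS = ['car', 'vehicle', 'boat', 'ship']


def shared_letters(word_01, word_02):
    """Hypothetical stand-in for get_synset: count of distinct shared letters."""
    return float(len(set(word_01) & set(word_02)))


def score_by_index(i1, i2):
    # Look the words up by position so the function can be driven by indices.
    return shared_letters(WORDS[i1], WORDS[i2])


n = len(WORDS)

# np.vectorize calls score_by_index once per (i1, i2) pair;
# dtype=int makes np.fromfunction generate integer indices instead of floats.
matrix = np.fromfunction(np.vectorize(score_by_index), (n, n), dtype=int)

# Equivalent without np.fromfunction: broadcast a column of indices against a row.
idx = np.arange(n)
same = np.vectorize(score_by_index)(idx[:, None], idx[None, :])
```

Keep in mind that np.vectorize is a convenience wrapper around a Python-level loop, so this is mainly about readability, not speed; the cost is still one call to the similarity function per matrix entry.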

Tags: python, nlp, numpy, wordnet, pandas