How do I get a single vector for a single word using Word2Vec?

Question

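I am trying to solve a deep learning text classification problem, so I need to vectorize the text input with Word2Vec in order to feed it into a neural network.

So I downloaded a Google pre-trained Word2Vec model: https://github.com/3Top/word2vec-api

and loaded it with gensim: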

import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('Word2Vec.bin', binary=True)

When I try to print a specific word:

print(model['cat'])
# => expected output: 0.47385435 (or something)
# => actual output: array with hundreds of floats between -1 and 1

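Why don't I just get one vector for one word? Isn't that the point?

Bonus question: can I load the 3M word vectors from the Google pre-trained Word2Vec model into a MongoDB database? (Columns: id - word (string) - vector (float).) I ask because loading the model from a .bin or .txt file takes more than a minute.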
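
For the schema described above, one document per word might be shaped roughly as follows. This is only a sketch: `to_documents` and `toy_vectors` are hypothetical names, the numbers are invented, and the vector field is assumed to hold a list of floats rather than a single float, since each word maps to a whole array.

```python
def to_documents(vectors):
    """Yield one MongoDB-style document per word from a word -> vector mapping."""
    for i, (word, vec) in enumerate(vectors.items()):
        # Store the vector as a plain list of floats so it serializes cleanly.
        yield {"_id": i, "word": word, "vector": [float(x) for x in vec]}

# Invented 3-dimensional vectors standing in for the real model.
toy_vectors = {"cat": [0.12, -0.48, 0.33], "dog": [0.10, -0.40, 0.30]}
docs = list(to_documents(toy_vectors))
print(docs[0])

# With pymongo, collection.insert_many(docs) would then store them.
# That step assumes a running MongoDB server and is not executed here.
```

Whether a database lookup is actually faster than loading the file once is not something this sketch measures.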

Answer 1

When I try to print a specific word:

print(model['cat'])
# => expected output: 0.47385435 (or something)
# => actual output: array with hundreds of floats between -1 and 1
Why don't I just get one vector for one word? Isn't that the point?

The "array with hundreds of floats between -1 and 1" is the word vector.

Why would you expect a scalar (0.47385435) when you ask for a vector?

You need to read this: https://www.tensorflow.org/tutorials/word2vec
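
To make the vector versus scalar distinction concrete, here is a minimal sketch in plain Python. The dictionary `toy_vectors` is a made-up stand-in for the loaded model, with invented 4-dimensional values instead of the hundreds of dimensions in the real one, and `cosine_similarity` is an illustrative helper, not a gensim function.

```python
import math

# Toy stand-in for a loaded model: word -> vector (values are invented).
toy_vectors = {
    "cat": [0.12, -0.48, 0.33, 0.05],
    "dog": [0.10, -0.40, 0.30, 0.10],
    "car": [-0.50, 0.20, 0.01, 0.44],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# One word maps to one vector, and that vector is a list of floats.
cat = toy_vectors["cat"]
print(len(cat))  # number of dimensions in the vector

# A single number only appears when two vectors are compared.
sim_cat_dog = cosine_similarity(cat, toy_vectors["dog"])
sim_cat_car = cosine_similarity(cat, toy_vectors["car"])
print(sim_cat_dog > sim_cat_car)
```

With the real model, `model['cat']` plays the role of `cat` here, and gensim's `model.similarity('cat', 'dog')` is what returns a single float.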

python

word2vec

deep-learning