How to create a DataFrame with the word2vec vectors as data, and the terms as row labels?
I am trying to follow this document:
nbviewer.jupyter.org/github/skipgram/modern-nlp-in-python/blob/master/executable/Modern_NLP_in_Python.ipynb
where I have the following code snippet:
ordered_vocab = [(term, voc.index, voc.count)
                 for term, voc in food2vec.vocab.iteritems()]
ordered_vocab = sorted(ordered_vocab, key=lambda (term, index, count): -count)
ordered_terms, term_indices, term_counts = zip(*ordered_vocab)
word_vectors = pd.DataFrame(food2vec.syn0norm[term_indices, :],
                            index=ordered_terms)
To get it to run, I have changed it to the following:
ordered_vocab = [(term, voc.index, voc.count)
for term, voc in word2vecda.wv.vocab.items()]
ordered_vocab = sorted(ordered_vocab)
ordered_terms, term_indices, term_counts = zip(*ordered_vocab)
word_vectorsda = pd.DataFrame(word2vecda.wv.syn0norm[term_indices,],index=ordered_terms)
word_vectorsda [:20]
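But the last line before I print the DataFrame gives me an error that I cannot make sense of. It keeps reporting that a NoneType object is not subscriptable on this line. To me it looks like term_indices is what trips it, but I don't understand why?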
TypeError: 'NoneType' object is not subscriptable
Is there anything that can help me? Any input is welcome.
Best, Niels
Use the following code:
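import pandas as pd
# Collect (term, index, count) for every vocabulary entry.
ordered_vocab = [(term, voc.index, voc.count) for term, voc in model.wv.vocab.items()]
# Sort by count, ascending (pass reverse=True to get the most frequent terms first).
ordered_vocab = sorted(ordered_vocab, key=lambda k: k[2])
ordered_terms, term_indices, term_counts = zip(*ordered_vocab)
# Use the raw vectors in syn0; syn0norm is None until init_sims() has been called.
word_vectors = pd.DataFrame(model.wv.syn0[term_indices, :], index=ordered_terms)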
Replace model with food2vec.
Works on python 3.6.1, gensim '3.0.0'.
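A likely reason for the TypeError in the question: in gensim versions of this era, wv.syn0norm stays None until the normalized vectors have been computed (for example by calling init_sims()), so indexing it fails, while wv.syn0 is always populated. Here is a minimal sketch of that behaviour together with a Python 3 way to sort by descending count. It uses hypothetical stand-ins (Vocab, FakeKeyedVectors, rows_by_frequency) instead of a real gensim model, so it runs on its own and is not the library's actual implementation.

```python
from collections import namedtuple

# Hypothetical stand-ins for gensim objects, so the sketch runs without gensim.
Vocab = namedtuple("Vocab", ["index", "count"])


class FakeKeyedVectors:
    """Mimics the attributes discussed above; not the real gensim class."""

    def __init__(self, vocab, vectors):
        self.vocab = vocab
        self.syn0 = vectors    # raw vectors, always present
        self.syn0norm = None   # normalized vectors, not computed yet


def rows_by_frequency(wv, use_norm=False):
    """Return (terms, rows) ordered from most to least frequent term."""
    matrix = wv.syn0norm if use_norm else wv.syn0
    if matrix is None:
        # Indexing None is what raises "'NoneType' object is not subscriptable".
        raise ValueError("normalized vectors have not been computed yet")
    # Python 3 removed tuple parameters in lambdas, so index into the item instead.
    ordered = sorted(wv.vocab.items(), key=lambda item: -item[1].count)
    terms = [term for term, _ in ordered]
    rows = [matrix[voc.index] for _, voc in ordered]
    return terms, rows


wv = FakeKeyedVectors(
    vocab={"salt": Vocab(0, 5), "pizza": Vocab(1, 9), "kale": Vocab(2, 2)},
    vectors=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
)
terms, rows = rows_by_frequency(wv)
print(terms)
```

With pandas installed, pd.DataFrame(rows, index=terms) would then give one row per term, labelled by the term.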