How to fix LDA model coherence score runtime error?

Question

text = 'Alice is a student. She likes studying. Teachers are giving a lot of homework.'

I am trying to get a topic coherence score from a simple text (like the one above). This is my LDA model:

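import gensim
from gensim import corpora
from gensim.models import CoherenceModel
from pprint import pprint

# data_lemmatized: list of tokenized, lemmatized documents (built earlier, not shown)
id2word = corpora.Dictionary(data_lemmatized)
texts = data_lemmatized
corpus = [id2word.doc2bow(text) for text in texts]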

lda_model = gensim.models.ldamodel.LdaModel(corpus=corpus,
                                           id2word=id2word,
                                           num_topics=5, 
                                           random_state=100,
                                           update_every=1,
                                           chunksize=100,
                                           passes=10,
                                           alpha='auto',
                                           per_word_topics=True)
# Print the keywords of each topic
pprint(lda_model.print_topics())
doc_lda = lda_model[corpus]

When I try to run this coherence model:

coherence_model_lda = CoherenceModel(model=lda_model, texts=data_lemmatized, dictionary=id2word, 
coherence='c_v')
coherence_lda = coherence_model_lda.get_coherence()
print('\nCoherence Score: ', coherence_lda)

I should get output of this kind -> Coherence Score: 0.532947587081

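I get this error:

    raise RuntimeError('''
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.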

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

What should I do to fix this?

Answer 1

I had the same problem. Putting the CoherenceModel code inside if __name__ == "__main__": solved it for me.

if __name__ == "__main__":
    coherence_model_lda = CoherenceModel(model=lda_model, texts=data_lemmatized,
                                         dictionary=id2word, coherence='c_v')
    coherence_lda = coherence_model_lda.get_coherence()
    print('\nCoherence Score: ', coherence_lda)
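Why the guard helps, as far as I understand it: on platforms that start worker processes with spawn (Windows, for example), each child re-imports the main script under a different module name, so unguarded code that itself starts processes would run again inside the child. The sketch below starts no processes at all. It only simulates that re-import with exec, and the SCRIPT text and the run_as helper are made up for illustration.

```python
# A tiny stand-in for a script: one unguarded line, one guarded line.
SCRIPT = '''
events.append("module-level code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(module_name):
    """Execute SCRIPT with __name__ set to module_name; return what ran."""
    events = []
    namespace = {"__name__": module_name, "events": events}
    exec(compile(SCRIPT, "<script>", "exec"), namespace)
    return events


# Started directly, e.g. `python script.py`.
as_main = run_as("__main__")
# Re-imported by a spawned child (multiprocessing uses the name "__mp_main__").
as_child = run_as("__mp_main__")

print(as_main)   # ['module-level code ran', 'guarded code ran']
print(as_child)  # ['module-level code ran']
```

Only the unguarded line runs in the simulated child, which is why moving the CoherenceModel call under the guard keeps the workers from trying to start workers of their own.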

Answer 2

I ran into the same problem when running gensim Nmf. The fix was to change coherence='c_v' to coherence='u_mass'.
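Note that 'u_mass' is a different measure from 'c_v', so the scores are not comparable. To give a feel for what it measures, here is a hand-rolled sketch of the UMass idea: it scores a topic from how often its top words co-occur in the same documents. This is not gensim's implementation (gensim's smoothing and averaging may differ), the function name umass_coherence and the toy documents are made up, and it assumes every topic word occurs in at least one document. My guess, and it is only a guess, is that this document-level counting does not go through the worker processes that 'c_v' uses, which would explain why the error disappears.

```python
import math


def umass_coherence(topic_words, documents):
    """Sum of log((D(w_m, w_l) + 1) / D(w_l)) over word pairs with l < m.

    D(...) is the number of documents containing all the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_count(*words):
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for m in range(1, len(topic_words)):
        for l in range(m):
            joint = doc_count(topic_words[m], topic_words[l])
            score += math.log((joint + 1) / doc_count(topic_words[l]))
    return score


# Toy lemmatized documents, loosely based on the example text.
docs = [
    ["alice", "student"],
    ["alice", "like", "study"],
    ["teacher", "give", "homework"],
]

together = umass_coherence(["alice", "student"], docs)  # words share a document
apart = umass_coherence(["alice", "teacher"], docs)     # words never co-occur

print(together, apart)
```

Words that appear together score higher (closer to zero) than words that never share a document.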

Answer 3

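You can use coherence='c_v' without any problem. My answer is very similar to AKHILA's, but I call freeze_support() in the main process and start the method there, so that it also works on Windows.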

Consider the structure from the beginning:

# imports
from multiprocessing import Process, freeze_support
import ...

# general constants and variables
...

# functions definition
def ...
...

def ...
...

# main function
def principal(): # can be another name
    ...
    ...

if __name__ == '__main__':
    freeze_support()
    Process(target=principal).start()
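A filled-in version of that skeleton, using only the standard library, might look like the sketch below. The helper word_lengths and the body of principal are placeholders invented for illustration; in the real script principal would build the dictionary, train the LDA model and compute the coherence score.

```python
from multiprocessing import Process, freeze_support


# functions definition
def word_lengths(words):
    """Placeholder for real work such as training a model."""
    return [len(word) for word in words]


# main function (the name is arbitrary, but must match `target=` below)
def principal():
    result = word_lengths(["alice", "student", "homework"])
    print(result)
    return result


if __name__ == '__main__':
    # Only needed when the script is frozen into a Windows executable;
    # otherwise the call has no effect.
    freeze_support()
    worker = Process(target=principal)
    worker.start()
    worker.join()  # wait for the child process to finish
```

Everything that starts a process sits under the guard, so a child that re-imports this file only sees the function definitions.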


python

nlp

runtime-error

lda

topic-modeling
