Affinity propagation did not converge, this model will not have any cluster centers

When I try to cluster with affinity propagation, I get the following warning, and the result contains only one cluster.

"...\anaconda\lib\site-packages\sklearn\cluster\_affinity_propagation.py:246: ConvergenceWarning: Affinity propagation did not converge, this model will not have any cluster centers.
  warnings.warn("Affinity propagation did not converge, this model ""

Below is the code I tried.

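from collections import Counter

from sklearn.cluster import AffinityPropagation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def build_feature_matrix(documents, feature_type='frequency',
                         ngram_range=(1, 1), min_df=0.0, max_df=1.0):
    """Vectorize documents as binary, frequency, or TF-IDF features."""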
    feature_type = feature_type.lower().strip()  
    
    if feature_type == 'binary':
        vectorizer = CountVectorizer(binary=True, min_df=min_df,
                                     max_df=max_df, ngram_range=ngram_range)
    elif feature_type == 'frequency':
        vectorizer = CountVectorizer(binary=False, min_df=min_df,
                                     max_df=max_df, ngram_range=ngram_range)
    elif feature_type == 'tfidf':
        vectorizer = TfidfVectorizer(min_df=min_df, max_df=max_df, 
                                     ngram_range=ngram_range)
    else:
        raise Exception("Wrong feature type entered. Possible values: 'binary', 'frequency', 'tfidf'")

    feature_matrix = vectorizer.fit_transform(documents).astype(float)
    
    return vectorizer, feature_matrix

vectorizer, feature_matrix = build_feature_matrix(filtered_list_6,
                                                  feature_type='tfidf',
                                                  min_df=0.15, max_df=0.85,
                                                  ngram_range=(1, 2))

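def affinity_propagation(feature_matrix):
    # Pairwise document similarity: dot products between feature rows
    sim = feature_matrix * feature_matrix.T
    # toarray() returns a plain ndarray; todense() returns np.matrix,
    # which recent scikit-learn versions reject
    sim = sim.toarray()
    ap = AffinityPropagation()
    ap.fit(sim)
    clusters = ap.labels_
    return ap, clusters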

ap_obj, clusters = affinity_propagation(feature_matrix=feature_matrix)
df[len(df.columns)] = clusters

c = Counter(clusters)   
print(c.items())

total_clusters = len(c)
print('Total Clusters:', total_clusters)

Can anyone point out what I am doing wrong here?

Thanks in advance!

I was able to get rid of the problem by changing the damping, max_iter, and preference values. As a starting point, try damping = 0.9 and max_iter = 1000.
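A minimal sketch of those starting values, in case it helps. The synthetic `make_blobs` data and the helper name `fit_ap` are my own assumptions for illustration (the question uses a TF-IDF matrix instead), and `random_state` needs a reasonably recent scikit-learn. When the fit fails in the way the warning describes, every sample is labelled -1 and there are no exemplars, which is why the question's Counter shows a single "cluster".

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs


def fit_ap(X, damping=0.9, max_iter=1000, preference=None):
    """Fit AffinityPropagation and report whether any exemplars were found.

    fit_ap is a hypothetical helper, not part of scikit-learn.
    """
    ap = AffinityPropagation(damping=damping, max_iter=max_iter,
                             preference=preference, random_state=0)
    ap.fit(X)
    # With no cluster centers, cluster_centers_indices_ is empty
    # and all labels are -1.
    has_centers = len(ap.cluster_centers_indices_) > 0
    return ap, has_centers


# Synthetic stand-in data (an assumption, not the asker's documents).
X, _ = make_blobs(n_samples=300, centers=[[1, 1], [-1, -1], [1, -1]],
                  cluster_std=0.5, random_state=0)

ap, has_centers = fit_ap(X)
n_clusters = len(ap.cluster_centers_indices_)
print("Found cluster centers:", has_centers)
print("Total clusters:", n_clusters)
```

If `has_centers` is still False on your data, raising `max_iter` further or moving `damping` closer to 1 are the next things I would try.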

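You can adjust the preference value as needed; this changes the number of clusters the model produces. Lower values tend to give fewer clusters and higher values give more. If you do not set it, scikit-learn uses the median of the input similarities.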
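To see the effect of preference, a rough sketch like the one below fits the model with a few values and counts the clusters. The data is synthetic and the three preference values are arbitrary picks of mine, not recommendations. With the default `affinity='euclidean'`, similarities are negative squared distances, so sensible preference values are negative here.

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

# Synthetic stand-in data (an assumption, not the asker's documents).
X, _ = make_blobs(n_samples=300, centers=[[1, 1], [-1, -1], [1, -1]],
                  cluster_std=0.5, random_state=0)

cluster_counts = {}
for preference in (-200, -50, -5):  # arbitrary illustrative values
    ap = AffinityPropagation(damping=0.9, max_iter=1000,
                             preference=preference, random_state=0)
    ap.fit(X)
    # Number of exemplars found; 0 means the run produced no cluster centers.
    cluster_counts[preference] = len(ap.cluster_centers_indices_)

for preference, count in cluster_counts.items():
    print("preference =", preference, "-> clusters:", count)
```

On your own similarity matrix the useful range will be different, so it is worth looking at the spread of the similarity values before choosing candidates.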