Affinity propagation did not converge, this model will not have any cluster centers
When I try to cluster with affinity propagation, I get the following warning and end up with only one cluster.
...\anaconda\lib\site-packages\sklearn\cluster\_affinity_propagation.py:246: ConvergenceWarning: Affinity propagation did not converge, this model will not have any cluster centers.
  warnings.warn("Affinity propagation did not converge, this model "
Below is the code I tried.
from collections import Counter

from sklearn.cluster import AffinityPropagation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def build_feature_matrix(documents, feature_type='frequency',
                         ngram_range=(1, 1), min_df=0.0, max_df=1.0):
    # Build a document-term matrix with the requested weighting scheme.
    feature_type = feature_type.lower().strip()
    if feature_type == 'binary':
        vectorizer = CountVectorizer(binary=True, min_df=min_df,
                                     max_df=max_df, ngram_range=ngram_range)
    elif feature_type == 'frequency':
        vectorizer = CountVectorizer(binary=False, min_df=min_df,
                                     max_df=max_df, ngram_range=ngram_range)
    elif feature_type == 'tfidf':
        vectorizer = TfidfVectorizer(min_df=min_df, max_df=max_df,
                                     ngram_range=ngram_range)
    else:
        raise ValueError("Wrong feature type entered. "
                         "Possible values: 'binary', 'frequency', 'tfidf'")
    feature_matrix = vectorizer.fit_transform(documents).astype(float)
    return vectorizer, feature_matrix


# filtered_list_6 is the list of preprocessed documents (defined elsewhere).
vectorizer, feature_matrix = build_feature_matrix(filtered_list_6,
                                                  feature_type='tfidf',
                                                  min_df=0.15, max_df=0.85,
                                                  ngram_range=(1, 2))


def affinity_propagation(feature_matrix):
    # Document-by-document similarity matrix (sparse matrix product).
    sim = feature_matrix @ feature_matrix.T
    # Dense ndarray; todense() would return the discouraged np.matrix type.
    sim = sim.toarray()
    ap = AffinityPropagation()
    ap.fit(sim)
    clusters = ap.labels_
    return ap, clusters


ap_obj, clusters = affinity_propagation(feature_matrix=feature_matrix)
# df is the DataFrame holding the documents (defined elsewhere).
df[len(df.columns)] = clusters
c = Counter(clusters)
print(c.items())
total_clusters = len(c)
print('Total Clusters:', total_clusters)
Can someone point out what I am doing wrong here?
Thanks in advance!
I was able to get rid of the problem by changing the damping, max_iter, and preference values. As a starting point, you can try damping = 0.9 and max_iter = 1000.
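As a rough illustration of those starting values, here is a minimal sketch. It runs on synthetic blobs from `make_blobs` rather than the TF-IDF matrix in the question, so the data and `random_state=0` are assumptions made only to keep the example self-contained. Comparing `n_iter_` with `max_iter` is one heuristic way to see whether the run stopped before hitting the iteration limit.

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

# Synthetic stand-in data (an assumption, not the documents from the question).
X, _ = make_blobs(n_samples=150, centers=[[1, 1], [-1, -1], [1, -1]],
                  cluster_std=0.5, random_state=0)

max_iter = 1000
# Higher damping slows the message updates, which tends to reduce oscillation.
ap = AffinityPropagation(damping=0.9, max_iter=max_iter, random_state=0)
ap.fit(X)

n_clusters = len(ap.cluster_centers_indices_)
print("iterations used:", ap.n_iter_)
print("clusters found:", n_clusters)
# Stopping before max_iter suggests the exemplars stabilised.
print("stopped before max_iter:", ap.n_iter_ < max_iter)
```

If the run still uses all `max_iter` iterations on your own data, raising `damping` a little further (it must stay below 1) or increasing `max_iter` may be worth trying.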
You can then adjust the preference value as needed; this changes the number of clusters the model produces.
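To see how preference affects the cluster count, a small sweep like the one below can help. This is only a sketch: the data is synthetic, and the preference values are hypothetical numbers picked for this toy data, so you would need to choose values on the scale of your own similarities. In general, lower preference values tend to give fewer clusters; when preference is not set, scikit-learn uses the median of the input similarities.

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

# Synthetic stand-in data (an assumption, not the documents from the question).
X, _ = make_blobs(n_samples=150, centers=[[1, 1], [-1, -1], [1, -1]],
                  cluster_std=0.5, random_state=0)

# Hypothetical preference values chosen for this toy data only.
preferences = [-200, -50, -10]
cluster_counts = {}
for pref in preferences:
    ap = AffinityPropagation(damping=0.9, max_iter=1000,
                             preference=pref, random_state=0).fit(X)
    # One exemplar index per cluster found.
    cluster_counts[pref] = len(ap.cluster_centers_indices_)
    print(f"preference={pref}: {cluster_counts[pref]} clusters")
```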