AttributeError: 'TfidfVectorizer' object has no attribute 'get_feature_names_out'
Why do I keep getting this error? I have tried other code as well, but the error appears as soon as anything calls the get_feature_names_out function.
Here is my code:
from sklearn.datasets._twenty_newsgroups import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB  # fast to train and achieves a decent F-score
from sklearn import metrics
import numpy as np

def show_top10(classifier, vectorizer, categories):
    feature_names = vectorizer.get_feature_names_out()
    for i, category in enumerate(categories):
        top10 = np.argsort(classifier.coef_[i])[-10:]
        print("%s: %s" % (category, " ".join(feature_names[top10])))

newsgroups_train = fetch_20newsgroups(subset='train')
print(list(newsgroups_train.target_names))

cats = ['alt.atheism', 'sci.space', 'rec.sport.baseball', 'rec.sport.hockey']
newsgroups_train = fetch_20newsgroups(subset='train', categories=cats)
print(list(newsgroups_train.target_names))
print(newsgroups_train.filenames.shape)

vectorizer = TfidfVectorizer()
vectors = vectorizer.fit_transform(newsgroups_train.data)
print(vectors.shape)
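This is probably because the scikit-learn version you have installed is older than the one this code was written for.
get_feature_names_out has been a method of the class sklearn.feature_extraction.text.TfidfVectorizer since scikit-learn 1.0. Before that, there was a similar method called get_feature_names.
So you should either update your scikit-learn package or use the old method (not recommended).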
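If upgrading is not possible right away, one option is a small compatibility shim that works on either side of the 1.0 boundary. The sketch below is only an illustration: the helper name feature_names is made up for this answer and is not part of scikit-learn. It relies only on the two method names mentioned above, and on the fact that get_feature_names was deprecated in 1.0 and removed in 1.2.

```python
def feature_names(vectorizer):
    """Return the fitted vocabulary as a list on old and new scikit-learn.

    `feature_names` is a hypothetical helper, not a scikit-learn function.
    """
    if hasattr(vectorizer, "get_feature_names_out"):
        # scikit-learn >= 1.0: returns a NumPy array of strings
        return list(vectorizer.get_feature_names_out())
    # scikit-learn < 1.0: only the older method exists (returns a list)
    return list(vectorizer.get_feature_names())
```

Inside show_top10 you could then write feature_names = np.asarray(feature_names(vectorizer)), because indexing with the top10 index array needs a NumPy array rather than a plain list. To see which version is installed, print sklearn.__version__.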