Jupyter Notebook Multiple Cell Issue

Question

I currently have this code:

import pandas as pd
import re

import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score

####################################################################################

nltk.download('punkt')
nltk.download('stopwords')

dataset = pd.read_csv('car_reviews.csv')
ps = PorterStemmer()

####################################################################################

data = []

for i in range(dataset.shape[0]):
    text = dataset.iloc[i, 1]
    text = re.sub('[^A-Za-z]', ' ', text)
    text = text.lower()
    tokenized_text = word_tokenize(text)
    
    processed_text = [ps.stem(word) for word in tokenized_text if word not in set(stopwords.words('english'))]
            
    final_text = " ".join(processed_text)
    data.append(final_text)

####################################################################################

matrix = CountVectorizer()
X = matrix.fit_transform(data).toarray()
Y = dataset.iloc[:, 0]

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2)

print('The number of reviews in the training set is: ' + str(len(X_train)) + '.')
print('The number of reviews in the test set is: ' + str(len(X_test)) + '.')

####################################################################################

classifier = MultinomialNB()
classifier.fit(X_train, Y_train)

Y_pred = classifier.predict(X_test)

cf_matrix = confusion_matrix(Y_test, Y_pred)
classification_report = classification_report(Y_test, Y_pred)
accuracy = accuracy_score(Y_test, Y_pred)

print('Accuracy: %.2f%% ' % (accuracy * 100.0))

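Each line of # characters marks the start of a new cell, so there are 5 cells in total. When I restart the notebook and run everything, it all works and I get output. However, when I then re-run only the last cell (the Multinomial Naive Bayes one), I get an error saying that a numpy.ndarray object is not callable for my confusion matrix, and I don't know why. How can I fix this?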

Answer 1

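This was caused by my own poor programming habit: I had assigned the results to variables with the same names as the confusion_matrix and classification_report functions. The first run of the cell replaced each function with its result, so running the cell a second time tried to call a numpy.ndarray (or a string) instead of the function. After I changed the variable names, it worked fine.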
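The mechanism can be reproduced without scikit-learn. The sketch below is only an illustration under stated assumptions: `report` is a hypothetical stand-in for `classification_report`, and a plain dict with `exec` stands in for the notebook kernel's shared namespace (`run_cell`, `cell_1`, `cell_2_buggy`, and `cell_2_fixed` are made-up names, not Jupyter APIs).

```python
# Hypothetical stand-in for the kernel's global namespace, which persists
# between cell executions until the kernel is restarted.
kernel_namespace = {}

# "Cell 1": defines the function. `report` is a made-up stand-in for
# sklearn.metrics.classification_report.
cell_1 = """
def report(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return f"{correct}/{len(y_true)} correct"
"""

# "Cell 2", buggy: the result is stored under the function's own name.
cell_2_buggy = "report = report([1, 0, 1], [1, 1, 1])"

# "Cell 2", fixed: the result gets a distinct name.
cell_2_fixed = "report_text = report([1, 0, 1], [1, 1, 1])"


def run_cell(source, namespace):
    """Execute one cell's source in the shared namespace."""
    exec(source, namespace)


run_cell(cell_1, kernel_namespace)
run_cell(cell_2_buggy, kernel_namespace)  # first run succeeds
try:
    # Second run: `report` now refers to a str, not the function.
    run_cell(cell_2_buggy, kernel_namespace)
    second_run_error = None
except TypeError as exc:
    second_run_error = str(exc)
print("Re-running the buggy cell:", second_run_error)

# A fresh namespace plays the role of a restarted kernel.
fresh_namespace = {}
run_cell(cell_1, fresh_namespace)
run_cell(cell_2_fixed, fresh_namespace)
run_cell(cell_2_fixed, fresh_namespace)  # re-running is now safe
print("Fixed cell result:", fresh_namespace["report_text"])
```

In the real notebook, re-running the import cell should also restore the shadowed functions without a full restart, since the import statement rebinds the names to the functions again.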
