AttributeError: 'Series' object has no attribute 'lower' Pandas

Question

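I am trying to do category prediction. I have three columns, 'First Name', 'Last Name', and 'Gender', and my goal is to predict the category of the input variable 'test_x'. In the code below I pass 'Male' as the input and expect the 'Gender' category as the output, but I get this error: AttributeError: 'Series' object has no attribute 'lower'.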

import pandas as pd 
import nltk

class Employee_Category:
    FIRST_NAME = "FIRST_NAME"
    LAST_NAME = "LAST_NAME"
    GENDER = "GENDER"

data = pd.read_excel(r"C:\users\HP\Documents\Datascience task\Employee.xlsx")
data = data.drop(['Age','Experience (Years)','Salary'],axis='columns')

train_x = [data['First Name'],data['Last Name'],data['Gender']]
train_y = [Employee_Category.FIRST_NAME,Employee_Category.LAST_NAME,Employee_Category.GENDER]

from sklearn import svm
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer(binary=True)
vector = vectorizer.fit_transform(train_x)

# Train the model
clf_svm = svm.SVC(kernel='linear')
clf_svm.fit(vector,train_y)

# Predict
test_x = vectorizer.transform(['Male']) # Expected output: "GENDER"
clf_svm.predict(test_x)

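Here is the head of the dataset:

I have done several Google searches but could not resolve the error, and I do not even understand it in the first place, so please help explain why this happens!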

Answer 1

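The problem is that train_x is a list of three Series, while CountVectorizer expects an iterable of strings. It calls .lower() on each item, which fails on a Series. You have to flatten the input matrix so that each word is assigned its own label. The code below works for me: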

import pandas as pd 
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_extraction.text import CountVectorizer

class Employee_Category:
    FIRST_NAME = "FIRST_NAME"
    LAST_NAME = "LAST_NAME"
    GENDER = "GENDER"

data = pd.DataFrame(columns=['First Name','Last Name','Gender'])
data.loc[0,:] = ['Arnold','Carter','Male']
data.loc[1,:] = ['Arthur','Farrell','Male']
data.loc[2,:] = ['Richard','Perry','Male']
data.loc[3,:] = ['Ellia','Thomas','Female']

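# Flatten row by row: one string per cell (first name, last name, gender, ...)
train_x = data.to_numpy().flatten()
# Repeat the three labels once per row so they line up with train_x
train_y = len(data)*[Employee_Category.FIRST_NAME,Employee_Category.LAST_NAME,Employee_Category.GENDER]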

vectorizer = CountVectorizer(binary=True)
vector = vectorizer.fit_transform(train_x)

# Train the model
clf_svm = SVC(kernel='linear')
clf_svm.fit(vector,train_y)

# Predict
test_x = vectorizer.transform(['Male']) # Expected output: "GENDER"
print(clf_svm.predict(test_x))

returns: ['GENDER']
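
To see where the error comes from, the following is a minimal sketch that reproduces it next to the flattened input. The two-row DataFrame and the names bad_train_x, good_train_x, and error_message are made up for illustration and are not part of the original question or answer; it assumes pandas and scikit-learn are installed.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical two-row sample standing in for the Excel file
data = pd.DataFrame({
    "First Name": ["Arnold", "Arthur"],
    "Last Name": ["Carter", "Farrell"],
    "Gender": ["Male", "Male"],
})

vectorizer = CountVectorizer(binary=True)

# A list of three Series: each "document" is a whole column, not a string,
# so the vectorizer's lowercasing step fails on it.
bad_train_x = [data["First Name"], data["Last Name"], data["Gender"]]
try:
    vectorizer.fit_transform(bad_train_x)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# A flat sequence of strings: one document per cell, read row by row.
good_train_x = data.to_numpy().flatten().tolist()
matrix = vectorizer.fit_transform(good_train_x)
print(matrix.shape[0])  # number of documents, one per cell
```

The vectorizer treats every element of the iterable as one document, so the number of documents must equal the number of labels passed to the classifier. That is why the answer repeats the three labels once per row.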

python

pandas

scikit-learn