Converting dtype=object to binary values

I have a dataset in which one column contains two distinct text values (PAIDOFF and COLLECTION). I want to convert them to binary values, so I tried the following:

y = df['loan_status'].values
y[0:5]

Output:

array(['PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF', 'PAIDOFF'],
  dtype=object)

After defining the target column, I tried to convert it to binary values:

#Convert y to binary values
le_loan_status=preprocessing.LabelEncoder()
le_loan_status.fit(['PAIDOFF','COLLECTION'])
y[:,0]= le_loan_status.transform(y[:,0])

Output:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-10-917e44b54b88> in <module>
      2 le_loan_status=preprocessing.LabelEncoder()
      3 le_loan_status.fit(['PAIDOFF','COLLECTION'])
----> 4 y[:,0]= le_loan_status.transform(y[:,0])

IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Do you have a solution to this problem?

Convert the column to dummy variables:

dummies = pd.get_dummies(df["loan_status"],drop_first=True) 

new_data = pd.concat([df,dummies],axis=1)
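As a rough sketch of what the snippet above produces, here it is on a small made-up frame (the sample rows are an assumption, not the asker's data). With drop_first=True, pandas drops the first category in sorted order (COLLECTION), so a single PAIDOFF column remains. Passing dtype=int is an optional addition here so the result is 0/1 rather than booleans on recent pandas versions.

```python
import pandas as pd

# Hypothetical sample data standing in for the real df
df = pd.DataFrame({"loan_status": ["PAIDOFF", "COLLECTION", "PAIDOFF", "PAIDOFF"]})

# drop_first=True drops the first category in sorted order (COLLECTION),
# leaving one PAIDOFF column; dtype=int gives 0/1 instead of True/False
dummies = pd.get_dummies(df["loan_status"], drop_first=True, dtype=int)

# Append the indicator column to the original frame
new_data = pd.concat([df, dummies], axis=1)

# Binary target as a 1-D NumPy array: 1 = PAIDOFF, 0 = COLLECTION
y = new_data["PAIDOFF"].to_numpy()
print(y)
```

If you would rather have COLLECTION encoded as 1, one option is to omit drop_first and pick the COLLECTION column instead.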

Docs
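As for the IndexError itself: df['loan_status'].values is a 1-dimensional array, so y[:,0] asks for a column that does not exist. If you prefer to stay with LabelEncoder, passing the whole 1-D array to transform should be enough. A minimal sketch, assuming scikit-learn is installed and using made-up sample values:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for df['loan_status'].values (1-D, dtype=object)
y = np.array(["PAIDOFF", "PAIDOFF", "COLLECTION", "PAIDOFF"], dtype=object)

le_loan_status = LabelEncoder()
le_loan_status.fit(["PAIDOFF", "COLLECTION"])

# y has a single axis, so transform the whole array; no column index is needed
y_encoded = le_loan_status.transform(y)

print(le_loan_status.classes_)  # classes are stored in sorted order
print(y_encoded)
```

LabelEncoder assigns integers by sorted class order, so here COLLECTION maps to 0 and PAIDOFF to 1.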