How to implement KNN in Python?

I want to implement KNN in Python. So far, I have loaded my data into a Pandas DataFrame.

import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
train_df = pd.read_csv("creditlimit_train.csv") # train dataset
train_df.head()

The output of head is:

SNo      Salary      LoanAmt   Level
101      100000      10000     Low Level
102      108500      11176     Low Level
103      125500      13303     Low Level
104      134000      14606     Low Level
105      142500      15960     Low Level


test_df = pd.read_csv("creditlimit_test.csv")
test_df.head()

The output of head is:

SNo      Salary      LoanAmt   Level
101      100000      10000     Low Level
102      108500      11176     Low Level
103      125500      13303     Low Level
104      134000      14606     Low Level
105      142500      15960     Low Level

neigh = KNeighborsClassifier(n_neighbors=5,algorithm='auto')
predictor_features = ['Salary','LoanAmt']
dependent_features = ['Level']
neigh.fit(train_df[predictor_features],train_df[dependent_features])

How do I use the fitted model to predict the Level of my test_df, with Salary and LoanAmt as the predictors?

Update 1: There are 3 levels: Low, Medium, and High.

You can convert the DataFrame to NumPy arrays and pass them to fit as input:

# convert class labels to numeric codes, assuming you have two classes
# (map returns a new Series, so assign it back; labels not in the dict become NaN)
df['Level'] = df['Level'].map({'Low Level': 0, 'High Level': 1})

# extract the feature columns and the class labels
data = df[['Salary','LoanAmt']]
target = df['Level']

# convert df to numpy arrays
data = data.values
target = target.values

# ideally you would do a train/test split:
# train the model on the training data and check accuracy on the test data

# pass the arrays to fit
neigh = KNeighborsClassifier(n_neighbors=5,algorithm='auto')
neigh.fit(data,target)  # features first, then class labels
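
To get predictions for the test set, call predict on the fitted classifier with the same feature columns. The following is a minimal sketch, not a definitive implementation. The rows below are made-up stand-ins for creditlimit_train.csv and creditlimit_test.csv, the label strings 'Medium Level' and 'High Level' are assumed to follow the 'Low Level' pattern shown in the question, and the PredictedLevel column name is hypothetical. scikit-learn accepts string class labels directly, so the numeric conversion is skipped here.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in rows; in the question these come from
# creditlimit_train.csv and creditlimit_test.csv.
train_df = pd.DataFrame({
    "Salary":  [100000, 108500, 125500, 400000, 420000, 450000,
                900000, 950000, 990000],
    "LoanAmt": [10000, 11176, 13303, 40000, 43000, 47000,
                90000, 96000, 99000],
    # 'Medium Level' and 'High Level' are assumed label strings
    "Level":   ["Low Level"] * 3 + ["Medium Level"] * 3 + ["High Level"] * 3,
})
test_df = pd.DataFrame({
    "Salary":  [134000, 430000, 940000],
    "LoanAmt": [14606, 44000, 95000],
    "Level":   ["Low Level", "Medium Level", "High Level"],
})

predictor_features = ["Salary", "LoanAmt"]

# n_neighbors=3 only because the toy data has three rows per class
neigh = KNeighborsClassifier(n_neighbors=3)

# Passing the labels as a 1-D Series avoids the column-vector warning
# that a one-column DataFrame triggers
neigh.fit(train_df[predictor_features], train_df["Level"])

# Predict a Level for every row of the test set
test_df["PredictedLevel"] = neigh.predict(test_df[predictor_features])

# Mean accuracy against the known test labels
accuracy = neigh.score(test_df[predictor_features], test_df["Level"])

print(test_df)
print("accuracy:", accuracy)
```

If the test file has no Level column, skip the score call and use only the predictions.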

Some useful links:

Convert pandas dataframe to numpy array, preserving index

Replacing few values in a pandas dataframe column with another value

Selecting columns in a pandas dataframe