How to implement KNN in Python?
I want to implement KNN in Python. So far I have loaded my data into a Pandas DataFrame.
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
train_df = pd.read_csv("creditlimit_train.csv") # train dataset
train_df.head()
The output of head is
SNo Salary LoanAmt Level
101 100000 10000 Low Level
102 108500 11176 Low Level
103 125500 13303 Low Level
104 134000 14606 Low Level
105 142500 15960 Low Level
test_df = pd.read_csv("creditlimit_test.csv")
test_df.head()
The output of head is
SNo Salary LoanAmt Level
101 100000 10000 Low Level
102 108500 11176 Low Level
103 125500 13303 Low Level
104 134000 14606 Low Level
105 142500 15960 Low Level
neigh = KNeighborsClassifier(n_neighbors=5,algorithm='auto')
predictor_features = ['Salary','LoanAmt']
dependent_features = ['Level']
neigh.fit(train_df[predictor_features],train_df[dependent_features])
How can I use the fit function, with Salary and LoanAmt as predictors, to predict the Level for my test_df?
Update 1: There are 3 levels: Low, Medium, and High.
You can convert the DataFrame to a numpy array and pass it as input:
# work on a copy of the training DataFrame loaded in the question
df = train_df.copy()
# convert class labels to numeric codes (optional: scikit-learn also accepts string labels).
# The question's update lists three levels; the 'Medium Level' and 'High Level' strings are assumed.
df['Level'] = df['Level'].map({'Low Level': 0, 'Medium Level': 1, 'High Level': 2})
# extract data and class labels
data = df[['Salary', 'LoanAmt']]
target = df['Level']
# convert df to numpy arrays
data = data.values
target = target.values
# ideally you would do a train/test split:
# train the model on the training data and test on the test data for accuracy
# pass the arrays to the fit function
neigh = KNeighborsClassifier(n_neighbors=5, algorithm='auto')
neigh.fit(data, target)  # features (X) first, then labels (y)
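To get from fit to predictions on test_df, something along these lines should work. This is only a minimal sketch: the two small in-memory DataFrames are hypothetical stand-ins for creditlimit_train.csv and creditlimit_test.csv (only the first five training rows come from the question), and the label strings 'Medium Level' and 'High Level' are assumed from the update.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in for creditlimit_train.csv.
# Only the first five rows come from the question; the rest are made up.
train_df = pd.DataFrame({
    "Salary": [100000, 108500, 125500, 134000, 142500,
               300000, 310000, 320000, 330000, 340000,
               600000, 610000, 620000, 630000, 640000],
    "LoanAmt": [10000, 11176, 13303, 14606, 15960,
                30000, 31000, 32000, 33000, 34000,
                60000, 61000, 62000, 63000, 64000],
    "Level": ["Low Level"] * 5 + ["Medium Level"] * 5 + ["High Level"] * 5,
})

# Hypothetical stand-in for creditlimit_test.csv (values are made up).
test_df = pd.DataFrame({
    "Salary": [110000, 325000, 625000],
    "LoanAmt": [11500, 33000, 64000],
})

predictor_features = ["Salary", "LoanAmt"]

neigh = KNeighborsClassifier(n_neighbors=5, algorithm="auto")
# Pass the labels as a 1-D Series (train_df["Level"]),
# not as a one-column DataFrame (train_df[["Level"]]).
neigh.fit(train_df[predictor_features], train_df["Level"])

# predict() takes the same feature columns, in the same order, as fit().
test_df["PredictedLevel"] = neigh.predict(test_df[predictor_features])
print(test_df)
```

If test_df also has a Level column, as in the question, neigh.score(test_df[predictor_features], test_df["Level"]) returns the accuracy on the test set.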
Some useful links:
Convert pandas dataframe to numpy array, preserving index
Replacing few values in a pandas dataframe column with another value
Selecting columns in a pandas dataframe