ValueError: Found input variables with inconsistent numbers of samples: [143, 426]
How can I fix this error it throws? ValueError: Found input variables with inconsistent numbers of samples: [143, 426]
#split the data set into independent (X) and dependent (Y) data sets
X = df.iloc[:,2:31].values
Y = df.iloc[:,1].values
#split the data set into 75% training and 25% testing
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.25, random_state = 0)
#scale the data (feature scaling)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_train = sc.fit_transform(X_test)
#Using Logistic Regression Algorithm to the Training Set
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, Y_train)
And the shapes of X_train and Y_train:
X_train.shape
(143, 29)
Y_train.shape
(426,)
Error message:
ValueError                                Traceback (most recent call last)
 in ()
      2
      3 classifier = LogisticRegression(random_state = 0)
----> 4 classifier.fit(X_train, Y_train)
      5 #Using KNeighborsClassifier Method of neighbors class to use Nearest Neighbor algorithm
      6
2 frames
/usr/local/lib/python3.7/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
    210     if len(uniques) > 1:
    211         raise ValueError("Found input variables with inconsistent numbers of"
--> 212                          " samples: %r" % [int(l) for l in lengths])
    213
    214
ValueError: Found input variables with inconsistent numbers of samples: [143, 426]
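You have a bug on line 11: you assign the scaled test set to X_train instead of X_test. That overwrites X_train with the 143 test rows while Y_train still holds 426 labels, which is exactly the mismatch the error reports. See the corrected code below.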
#split the data set into independent (X) and dependent (Y) data sets
X = df.iloc[:,2:31].values
Y = df.iloc[:,1].values
#split the data set into 75% training and 25% testing
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.25, random_state = 0)
#scale the data (feature scaling)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
#Using Logistic Regression Algorithm to the Training Set
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, Y_train)
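Also, don't use fit_transform on X_test. You want to scale the test set with the same mean and standard deviation that were computed on X_train, so call transform on it instead.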
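The fix above can be checked end to end with a small sketch. The original `df` is not available here, so this assumes scikit-learn's bundled breast cancer dataset (`load_breast_cancer`) as a stand-in; it also has 569 rows and so splits into 426 and 143 samples, but it may not match the asker's columns. The `*_scaled` variable names are introduced only for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: stand-in data bundled with scikit-learn, not the asker's df.
X, Y = load_breast_cancer(return_X_y=True)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)  # learn mean/std from the training rows only
X_test_scaled = sc.transform(X_test)        # reuse those statistics on the test rows

# Features and labels must have the same number of rows in each split.
assert X_train_scaled.shape[0] == Y_train.shape[0]
assert X_test_scaled.shape[0] == Y_test.shape[0]

# The scaler stores the training mean, which is what transform() applies to X_test.
assert np.allclose(sc.mean_, X_train.mean(axis=0))

classifier = LogisticRegression(random_state=0)
classifier.fit(X_train_scaled, Y_train)
test_accuracy = classifier.score(X_test_scaled, Y_test)
```

Writing the scaled arrays to new names rather than overwriting X_train and X_test is a design choice that makes this kind of accidental reassignment easier to spot.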