Model fit after each "fold" in LOOCV? Multilabel LOOCV by hand
I'm wondering whether there is a problem (possibly data leakage?) with a hand-coded leave-one-out cross-validation if the model is fit again at every iteration, after it has already been tested on the previous fold. It seems that if the model is trained on all data except "X" and tested on "X", and then trained on all data except "Y" and tested on "Y", it has already seen "Y" during the first iteration. Is this actually a problem? Is my hand-coded LOOCV implementation correct?
Thank you for your time!
i = 0
j = 0
for i in range(0, 41):
    X_copy = X_orig[i:(i + 1)]  # slice the ith sample from the numpy array
    y_copy = y_orig[i:(i + 1)]
    X_model = np.delete(X_orig, i, axis=0)  # training set = every sample except the ith
    y_model = np.delete(y_orig, i, axis=0)
    model.fit(X_model, y_model, epochs=115, batch_size=28, verbose=0)  # verbose=0 hides the training log
    prediction = model.predict(X_copy)
    prediction[prediction >= 0.5] = 1  # threshold the sigmoid outputs to 0/1 labels
    prediction[prediction < 0.5] = 0
    print(prediction, y_copy)
    if np.array_equal(y_copy, prediction):  # count exact matches across all labels
        j = j + 1
print(j / 41)  # exact-match accuracy for the 41 samples in the dataset
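If it is a problem, I assume I could reset the model at the start of every fold. A minimal sketch of that idea, using Keras's model.get_weights() / model.set_weights() and assuming model has just been built and compiled:

initial_weights = model.get_weights()  # snapshot of the untrained weights

j = 0
for i in range(0, 41):
    model.set_weights(initial_weights)  # start every fold from the same untrained state
    # ... same slicing, fitting, prediction and scoring as above ...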
Why don't you use this instead?
from sklearn.model_selection import LeaveOneOut

loo = LeaveOneOut()
model = ...

test_fold_predictions = []
for train_index, test_index in loo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    model.fit(X_train, y_train)
    test_fold_predictions.append(model.predict(X_test))
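Once the loop finishes, test_fold_predictions holds one held-out prediction per sample, so you can score the whole LOOCV run in one pass. For a multilabel target you could, for example, threshold at 0.5 and compute the exact-match (subset) accuracy with scikit-learn. A minimal sketch, assuming y is the binary indicator matrix used above:

import numpy as np
from sklearn.metrics import accuracy_score

y_pred = (np.vstack(test_fold_predictions) >= 0.5).astype(int)  # stack the per-fold outputs and threshold them
print(accuracy_score(y, y_pred))  # subset accuracy: every label of a sample must match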
EDIT
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(5000, activation='relu', input_dim=X_train.shape[1]))
model.add(Dropout(0.1))
model.add(Dense(600, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(y_train.shape[1], activation='sigmoid'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)

from sklearn.model_selection import LeaveOneOut

loo = LeaveOneOut()
test_fold_predictions = []
for train_index, test_index in loo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    model.fit(X_train, y_train, epochs=5, batch_size=2000)
    test_fold_predictions.append(model.predict(X_test))
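Note that, as written, this keeps fitting the same compiled model across folds, which is exactly the carry-over the question asks about. One way around that is to rebuild the network inside the loop. A minimal sketch with a hypothetical build_model() helper wrapping the architecture above:

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD
from sklearn.model_selection import LeaveOneOut

def build_model(n_features, n_labels):
    # returns a freshly initialised, freshly compiled network each time it is called
    model = Sequential()
    model.add(Dense(5000, activation='relu', input_dim=n_features))
    model.add(Dropout(0.1))
    model.add(Dense(600, activation='relu'))
    model.add(Dropout(0.1))
    model.add(Dense(n_labels, activation='sigmoid'))
    model.compile(loss='binary_crossentropy',
                  optimizer=SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True))
    return model

loo = LeaveOneOut()
test_fold_predictions = []
for train_index, test_index in loo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    model = build_model(X_train.shape[1], y_train.shape[1])  # fresh random weights every fold
    model.fit(X_train, y_train, epochs=5, batch_size=2000, verbose=0)
    test_fold_predictions.append(model.predict(X_test))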