如何使用自动编码器初始化 MLP 的权重 #2nd 部分 - 深度自动编码器 #3rd 部分 - 堆叠式自动编码器
How to initialise weights of a MLP using an autoencoder #2nd part - Deep autoencoder #3rd part - Stacked autoencoder
我构建了一个自动编码器(1 个编码器 8:5,1 个解码器 5:8),它采用 Pima-Indian-Diabetes 数据集(https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv) and reduces its dimension (from 8 to 5). I would now like to use these reduced features to classify the data using an mlp. Now, here, I have some problems with the basic understanding of the architecture. How do I use the weights of the autoencoder and feed them into the mlp? I have checked these threads - https://github.com/keras-team/keras/issues/91 and https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g)。这里的问题是哪个权重矩阵我应该考虑吗?编码器部分还是解码器部分?当我为 mlp 添加层时,如何使用这些保存的权重初始化权重,而不是获得确切的语法。此外,我的 mlp 应该从 5 个神经元开始因为我的降维是 5?对于这个二元分类问题,mlp 的可能维度是多少?如果有人可以详细说明,请问?
deep autoencoder代码如下:
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code begins here...
encoding_dim1 = 5 # size of encoded representations
encoding_dim2 = 3 # size of encoded representations in the bottleneck layer
# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the first encoded representation of the input
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
# "enc" is the second encoded representation of the input
enc = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
# "dec" is the lossy reconstruction of the input
dec = Dense(encoding_dim1, activation='sigmoid', name='decoder1')(enc)
# "decoded" is the final lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder2')(dec)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)
autoencoder.compile(optimizer='sgd', loss='mse')
# training
autoencoder.fit(x_train, x_train,
epochs=300,
batch_size=10,
shuffle=True,
validation_data=(x_test, x_test)) # need more tuning
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
#The stacked autoencoder code is as follows:
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code goes here...
encoding_dim1 = 5 # size of encoded representations
encoding_dim2 = 3 # size of encoded representations in the bottleneck layer
# this is our input placeholder
input_data1 = Input(shape=(8,))
# the first encoded representation of the input
encoded1 = Dense(encoding_dim1, activation='relu',
name='encoder1')(input_data1)
# the first lossy reconstruction of the input
decoded1 = Dense(8, activation='sigmoid', name='decoder1')(encoded1)
# this model maps an input to its first layer of reconstructions
autoencoder1 = Model(inputs=input_data1, outputs=decoded1)
# this is the first encoder model
enc1 = Model(inputs=input_data1, outputs=encoded1)
autoencoder1.compile(optimizer='sgd', loss='mse')
# training
autoencoder1.fit(x_train, x_train, epochs=300,
batch_size=10, shuffle=True,
validation_data=(x_test, x_test))
FirstAEoutput = autoencoder1.predict(x_train)
input_data2 = Input(shape=(encoding_dim1,))
# the second encoded representations of the input
encoded2 = Dense(encoding_dim2, activation='relu',
name='encoder2')(input_data2)
# the final lossy reconstruction of the input
decoded2 = Dense(encoding_dim1, activation='sigmoid',
name='decoder2')(encoded2)
# this model maps an input to its second layer of reconstructions
autoencoder2 = Model(inputs=input_data2, outputs=decoded2)
# this is the second encoder
enc2 = Model(inputs=input_data2, outputs=encoded2)
autoencoder2.compile(optimizer='sgd', loss='mse')
# training
autoencoder2.fit(FirstAEoutput, FirstAEoutput, epochs=300,
batch_size=10, shuffle=True)
# this is the overall autoencoder mapping an input to its final reconstructions
autoencoder = Model(inputs=input_data1, outputs=encoded2)
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
这么多问题。你试过什么了?代码片段?
如果您的解码器正在尝试重建输入,那么将您的分类器附加到它的输出对我来说真的没有意义。我的意思是,为什么不在第一时间将它附加到输入中呢?因此,如果您打算使用 auto-encoder,我会说很明显您应该将分类器附加到编码器管道的输出。
我不太清楚你说 "use the weights of the autoencoder and feed them into the mlp" 是什么意思。您不会为一个层提供另一层的权重,而是使用它的 输出信号 。这在 Keras 上很容易做到。假设您定义了 auto-encoder 并对其进行了训练:
from keras Input, Model
from keras import backend as K
from keras.layers import Dense
x = Input(shape=[8])
y = Dense(5, activation='sigmoid' name='encoder')(x)
y = Dense(8, name='decoder')(y)
ae = Model(inputs=x, outputs=y)
ae.compile(loss='mse', ...)
ae.fit(x_train, x_train, ...)
K.models.save_model(ae, './autoencoder.h5')
然后您可以在编码器上附加一个分类层,并使用以下代码创建分类器模型:
# load the model from the disk if you
# are in a different execution.
ae = K.models.load_model('./autoencoder.h5')
y = ae.get_layer('encoder').output
y = Dense(1, activation='sigmoid', name='predictions')(y)
classifier = Model(inputs=ae.inputs, outputs=y)
classifier.compile(loss='binary_crossentropy', ...)
classifier.fit(x_train, y_train, ...)
就是这样,真的。 classifier
模型现在将 ae
模型的第一个嵌入层 encoder 作为其第一层,然后是 sigmoid
决策层 预测.
如果您真正想做的是使用 auto-encoder 学习的权重来初始化分类器的权重(我不是肯定的,我推荐这种方法):
你可以用layer#get_weights
取权重矩阵,修剪它(因为编码器有5个单元而分类器只有1个)最后设置分类器权重。以下几行内容:
w, b = ae.get_layer('encoder').get_weights()
# remove all units except by one.
neuron_to_keep = 2
w = w[:, neuron_to_keep:neuron_to_keep + 1]
b = b[neuron_to_keep:neuron_to_keep + 1]
classifier.get_layer('predictions').set_weights(w, b)
Idavid,供您参考 - MLP using Autoencoder reduced features。我需要了解哪个数字是正确的?抱歉,我不得不上传图片作为答案,因为没有通过评论上传图片的选项。我想你是说图 B 是正确的。这是相同的代码片段。如果我走对了,请告诉我。
# This is a mlp classification code with features reduced by an Autoencoder
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code goes here...
encoding_dim = 5 # size of our encoded representations
# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu', name='encoder')(input_data)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder')(encoded)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)
autoencoder.compile(optimizer='sgd', loss='mse')
# training
autoencoder.fit(x_train, x_train,
epochs=300,
batch_size=10,
shuffle=True,
validation_data=(x_test, x_test)) # need more tuning
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
# MLP code goes here...
# create model
x = autoencoder.get_layer('encoder').output
# h = Dense(3, activation='relu', name='hidden')(x)
y = Dense(1, activation='sigmoid', name='predictions')(x)
classifier = Model(inputs=autoencoder.inputs, outputs=y)
# Compile model
classifier.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
# Fit the model
classifier.fit(x_train, y_train, epochs=250, batch_size=10)
print('Now making predictions')
predictions = classifier.predict(x_test)
# round predictions
rounded_predicted_classes = [round(x[0]) for x in predictions]
temp = sum(y_test == rounded_predicted_classes)
acc = temp/len(y_test)
print(acc)
我构建了一个自动编码器(1 个编码器 8:5,1 个解码器 5:8),它采用 Pima-Indian-Diabetes 数据集(https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv) and reduces its dimension (from 8 to 5). I would now like to use these reduced features to classify the data using an mlp. Now, here, I have some problems with the basic understanding of the architecture. How do I use the weights of the autoencoder and feed them into the mlp? I have checked these threads - https://github.com/keras-team/keras/issues/91 and https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g)。这里的问题是哪个权重矩阵我应该考虑吗?编码器部分还是解码器部分?当我为 mlp 添加层时,如何使用这些保存的权重初始化权重,而不是获得确切的语法。此外,我的 mlp 应该从 5 个神经元开始因为我的降维是 5?对于这个二元分类问题,mlp 的可能维度是多少?如果有人可以详细说明,请问?
deep autoencoder代码如下:
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code begins here...
encoding_dim1 = 5 # size of encoded representations
encoding_dim2 = 3 # size of encoded representations in the bottleneck layer
# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the first encoded representation of the input
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
# "enc" is the second encoded representation of the input
enc = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
# "dec" is the lossy reconstruction of the input
dec = Dense(encoding_dim1, activation='sigmoid', name='decoder1')(enc)
# "decoded" is the final lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder2')(dec)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)
autoencoder.compile(optimizer='sgd', loss='mse')
# training
autoencoder.fit(x_train, x_train,
epochs=300,
batch_size=10,
shuffle=True,
validation_data=(x_test, x_test)) # need more tuning
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
#The stacked autoencoder code is as follows:
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code goes here...
encoding_dim1 = 5 # size of encoded representations
encoding_dim2 = 3 # size of encoded representations in the bottleneck layer
# this is our input placeholder
input_data1 = Input(shape=(8,))
# the first encoded representation of the input
encoded1 = Dense(encoding_dim1, activation='relu',
name='encoder1')(input_data1)
# the first lossy reconstruction of the input
decoded1 = Dense(8, activation='sigmoid', name='decoder1')(encoded1)
# this model maps an input to its first layer of reconstructions
autoencoder1 = Model(inputs=input_data1, outputs=decoded1)
# this is the first encoder model
enc1 = Model(inputs=input_data1, outputs=encoded1)
autoencoder1.compile(optimizer='sgd', loss='mse')
# training
autoencoder1.fit(x_train, x_train, epochs=300,
batch_size=10, shuffle=True,
validation_data=(x_test, x_test))
FirstAEoutput = autoencoder1.predict(x_train)
input_data2 = Input(shape=(encoding_dim1,))
# the second encoded representations of the input
encoded2 = Dense(encoding_dim2, activation='relu',
name='encoder2')(input_data2)
# the final lossy reconstruction of the input
decoded2 = Dense(encoding_dim1, activation='sigmoid',
name='decoder2')(encoded2)
# this model maps an input to its second layer of reconstructions
autoencoder2 = Model(inputs=input_data2, outputs=decoded2)
# this is the second encoder
enc2 = Model(inputs=input_data2, outputs=encoded2)
autoencoder2.compile(optimizer='sgd', loss='mse')
# training
autoencoder2.fit(FirstAEoutput, FirstAEoutput, epochs=300,
batch_size=10, shuffle=True)
# this is the overall autoencoder mapping an input to its final reconstructions
autoencoder = Model(inputs=input_data1, outputs=encoded2)
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
这么多问题。你试过什么了?代码片段?
如果您的解码器正在尝试重建输入,那么将您的分类器附加到它的输出对我来说真的没有意义。我的意思是,为什么不在第一时间将它附加到输入中呢?因此,如果您打算使用 auto-encoder,我会说很明显您应该将分类器附加到编码器管道的输出。
我不太清楚你说 "use the weights of the autoencoder and feed them into the mlp" 是什么意思。您不会为一个层提供另一层的权重,而是使用它的 输出信号 。这在 Keras 上很容易做到。假设您定义了 auto-encoder 并对其进行了训练:
from keras Input, Model
from keras import backend as K
from keras.layers import Dense
x = Input(shape=[8])
y = Dense(5, activation='sigmoid' name='encoder')(x)
y = Dense(8, name='decoder')(y)
ae = Model(inputs=x, outputs=y)
ae.compile(loss='mse', ...)
ae.fit(x_train, x_train, ...)
K.models.save_model(ae, './autoencoder.h5')
然后您可以在编码器上附加一个分类层,并使用以下代码创建分类器模型:
# load the model from the disk if you
# are in a different execution.
ae = K.models.load_model('./autoencoder.h5')
y = ae.get_layer('encoder').output
y = Dense(1, activation='sigmoid', name='predictions')(y)
classifier = Model(inputs=ae.inputs, outputs=y)
classifier.compile(loss='binary_crossentropy', ...)
classifier.fit(x_train, y_train, ...)
就是这样,真的。 classifier
模型现在将 ae
模型的第一个嵌入层 encoder 作为其第一层,然后是 sigmoid
决策层 预测.
如果您真正想做的是使用 auto-encoder 学习的权重来初始化分类器的权重(我不是肯定的,我推荐这种方法):
你可以用layer#get_weights
取权重矩阵,修剪它(因为编码器有5个单元而分类器只有1个)最后设置分类器权重。以下几行内容:
w, b = ae.get_layer('encoder').get_weights()
# remove all units except by one.
neuron_to_keep = 2
w = w[:, neuron_to_keep:neuron_to_keep + 1]
b = b[neuron_to_keep:neuron_to_keep + 1]
classifier.get_layer('predictions').set_weights(w, b)
Idavid,供您参考 - MLP using Autoencoder reduced features。我需要了解哪个数字是正确的?抱歉,我不得不上传图片作为答案,因为没有通过评论上传图片的选项。我想你是说图 B 是正确的。这是相同的代码片段。如果我走对了,请告诉我。
# This is a mlp classification code with features reduced by an Autoencoder
# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy
# Data pre-processing...
# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]
# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
X, Y, test_size=0.2, random_state=42)
# scale the data within [0-1] range
scalar = MinMaxScaler()
x_train = scalar.fit_transform(x_train)
x_test = scalar.fit_transform(x_test)
# Autoencoder code goes here...
encoding_dim = 5 # size of our encoded representations
# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu', name='encoder')(input_data)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder')(encoded)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)
autoencoder.compile(optimizer='sgd', loss='mse')
# training
autoencoder.fit(x_train, x_train,
epochs=300,
batch_size=10,
shuffle=True,
validation_data=(x_test, x_test)) # need more tuning
# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
# MLP code goes here...
# create model
x = autoencoder.get_layer('encoder').output
# h = Dense(3, activation='relu', name='hidden')(x)
y = Dense(1, activation='sigmoid', name='predictions')(x)
classifier = Model(inputs=autoencoder.inputs, outputs=y)
# Compile model
classifier.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
# Fit the model
classifier.fit(x_train, y_train, epochs=250, batch_size=10)
print('Now making predictions')
predictions = classifier.predict(x_test)
# round predictions
rounded_predicted_classes = [round(x[0]) for x in predictions]
temp = sum(y_test == rounded_predicted_classes)
acc = temp/len(y_test)
print(acc)