batch_size must be divisible by strategy.num_towers (17 vs 8) error when using Google Colab TPU in keras cross validation k-fold
I want to run my Keras code on a TPU to classify text. When my model tries to evaluate val_acc, this error is shown:
batch_size must be divisible by strategy.num_towers (17 vs 8)
This is my code:
from __future__ import unicode_literals
import os  # needed for os.environ['COLAB_TPU_ADDR'] in the training loop
import pandas as pd
import openpyxl
from hazm import *
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction.text import TfidfVectorizer
from keras.preprocessing.text import one_hot
from keras.preprocessing.sequence import pad_sequences
from tensorflow.keras import layers,models
from keras.preprocessing.text import Tokenizer
from keras.preprocessing import sequence
from keras import optimizers
from keras.callbacks import *
from keras.utils import to_categorical
from keras.callbacks import EarlyStopping
from sklearn.model_selection import StratifiedKFold
%matplotlib inline
My data:
X: texts
Y: labels
K-fold:
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
Training step:
epochs = 10
batch_size = 64
scores=[]
for train, test in kfold.split(X, Y):
    model = RNN2()
    tpu_model = tf.contrib.tpu.keras_to_tpu_model(
        model,
        strategy=tf.contrib.tpu.TPUDistributionStrategy(
            tf.contrib.cluster_resolver.TPUClusterResolver(
                tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
        )
    )
    tpu_model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    tpu_model.fit(X_sequences_matrix[train], Y[train],
                  batch_size=batch_size,
                  epochs=epochs, validation_split=0.15)
This is the model's log during the training step:
Epoch 1/10
INFO:tensorflow:New input shapes; (re-)compiling: mode=train, [TensorSpec(shape=(4, 920), dtype=tf.float32, name='input_110'), TensorSpec(shape=(4, 1), dtype=tf.float32, name='activation_34_target0')]
INFO:tensorflow:Overriding default placeholder.
INFO:tensorflow:Remapping placeholder for input
INFO:tensorflow:Cloning Adam {'lr': 0.0010000000474974513, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
INFO:tensorflow:Get updates: Tensor("loss_3/mul:0", shape=(), dtype=float32)
INFO:tensorflow:Started compiling
INFO:tensorflow:Finished compiling. Time elapsed: 11.963297128677368 secs
INFO:tensorflow:Setting weights on TPU model.
3648/3697 [============================>.] - ETA: 1s - loss: 0.3725 - acc: 0.8728
AssertionError: batch_size must be divisible by strategy.num_towers (17 vs 8)
The problem is that you have 8 TPU cores and a batch size of 17. As shown here, your batch_size must be divisible by the number of TPU cores.
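One possible way to satisfy that constraint is to make sure no partial batch is ever produced, by trimming the training and validation sets to a multiple of batch_size before calling fit. The following is only a minimal sketch under assumptions: the helper names trim_to_multiple and split_for_tpu are hypothetical (not part of Keras or TensorFlow), the core count of 8 is taken from the error message, and whether this removes the error depends on how the TPU model batches its evaluation data.

```python
import numpy as np

NUM_TPU_CORES = 8  # assumption: taken from the error message ("17 vs 8")


def trim_to_multiple(n_samples, multiple):
    """Return the largest count <= n_samples that is a multiple of `multiple`."""
    return (n_samples // multiple) * multiple


def split_for_tpu(x, y, batch_size, val_fraction=0.15):
    """Hypothetical helper: split x/y into train and validation parts whose
    sizes are both multiples of batch_size, so no leftover batch appears."""
    if batch_size % NUM_TPU_CORES != 0:
        raise ValueError("batch_size must be divisible by the number of TPU cores")
    n_total = len(x)
    # Like validation_split, take the validation samples from the end.
    n_val = trim_to_multiple(int(n_total * val_fraction), batch_size)
    n_train = trim_to_multiple(n_total - n_val, batch_size)
    x_train, y_train = x[:n_train], y[:n_train]
    x_val, y_val = x[n_total - n_val:], y[n_total - n_val:]
    return (x_train, y_train), (x_val, y_val)


# Stand-in data with the same sequence length as in the log (920).
x_demo = np.zeros((1000, 920), dtype=np.float32)
y_demo = np.zeros((1000, 1), dtype=np.float32)
(x_tr, y_tr), (x_va, y_va) = split_for_tpu(x_demo, y_demo, batch_size=64)
```

The resulting arrays could then be passed to fit with validation_data=(x_va, y_va) in place of validation_split=0.15. A few samples per fold are dropped by the trimming, which is the trade-off of this approach.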