RuntimeError: The size of tensor a (4000) must match the size of tensor b (512) at non-singleton dimension 1
I'm trying to build a document classification model using BERT and PyTorch. I load the BERT model with the code below.
bert = AutoModel.from_pretrained('bert-base-uncased')
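(The tokenization step isn't shown in the question. Judging from the tensor sizes printed further down, the documents were presumably encoded along these lines; this call is an assumption, not the asker's actual code, with docs standing in for the list of raw document strings:)

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
    # padding/truncating every document to 4000 tokens yields the
    # (32, 4000) input_ids and attention_mask tensors printed below
    encoded = tokenizer(docs, padding='max_length', truncation=True,
                        max_length=4000, return_tensors='pt')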
Here is the training loop.
for epoch in range(epochs):
    print('\n Epoch {:} / {:}'.format(epoch + 1, epochs))
    # train model
    train_loss, _ = modhelper.train(proc.train_dataloader)
    # evaluate model
    valid_loss, _ = modhelper.evaluate()
    # save the best model
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
        torch.save(modhelper.model.state_dict(), 'saved_weights.pt')
    # append training and validation loss
    train_losses.append(train_loss)
    valid_losses.append(valid_loss)
    print(f'\nTraining Loss: {train_loss:.3f}')
    print(f'Validation Loss: {valid_loss:.3f}')
Here is my train method, accessed through the modhelper object.
def train(self, train_dataloader):
    self.model.train()
    total_loss, total_accuracy = 0, 0
    # empty list to save model predictions
    total_preds = []
    # iterate over batches
    for step, batch in enumerate(train_dataloader):
        # progress update after every 50 batches
        if step % 50 == 0 and not step == 0:
            print(' Batch {:>5,} of {:>5,}.'.format(step, len(train_dataloader)))
        # push the batch to gpu
        #batch = [r.to(device) for r in batch]
        sent_id, mask, labels = batch
        # clear previously calculated gradients
        self.model.zero_grad()
        print(sent_id.size(), mask.size())
        # get model predictions for the current batch
        preds = self.model(sent_id, mask)  # this line throws the error
        # compute the loss between actual and predicted values
        self.loss = self.cross_entropy(preds, labels)
        # add on to the total loss
        total_loss = total_loss + self.loss.item()
        # backward pass to calculate the gradients
        self.loss.backward()
        # clip the gradients to 1.0; helps prevent the exploding-gradient problem
        torch.nn.utils.clip_grad_norm_(self.model.parameters(), 1.0)
        # update parameters
        self.optimizer.step()
        # model predictions are stored on the GPU, so push them to the CPU
        #preds = preds.detach().cpu().numpy()
        # append the model predictions
        total_preds.append(preds)
    # compute the average training loss of the epoch
    avg_loss = total_loss / len(train_dataloader)
    # predictions are in the form (no. of batches, batch size, no. of classes);
    # reshape them to (number of samples, no. of classes)
    total_preds = np.concatenate(total_preds, axis=0)
    # return the loss and predictions
    return avg_loss, total_preds
preds = self.model(sent_id, mask)
This line throws the following error (full traceback included).
Epoch 1 / 1
torch.Size([32, 4000]) torch.Size([32, 4000])
Traceback (most recent call last):
File "<ipython-input-39-17211d5a107c>", line 8, in <module>
train_loss, _ = modhelper.train(proc.train_dataloader)
File "E:\BertTorch\model.py", line 71, in train
preds = self.model(sent_id, mask)
File "E:\BertTorch\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\BertTorch\model.py", line 181, in forward
#pass the inputs to the model
File "E:\BertTorch\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\BertTorch\venv\lib\site-packages\transformers\modeling_bert.py", line 837, in forward
embedding_output = self.embeddings(
File "E:\BertTorch\venv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\BertTorch\venv\lib\site-packages\transformers\modeling_bert.py", line 201, in forward
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
RuntimeError: The size of tensor a (4000) must match the size of tensor b (512) at non-singleton dimension 1
As you may have noticed, I print the torch sizes in the code:
print(sent_id.size(), mask.size())
The output of that line is torch.Size([32, 4000]) torch.Size([32, 4000]).
As you can see, the sizes are the same, yet the error is thrown. Please share your thoughts. Thanks a lot.
If you need more information, please comment and I'll add whatever is needed as soon as possible.
The problem is BERT's sequence-length limit. I was passing 4000 tokens, while the supported maximum is 512 (two of those are reserved for the special tokens '[CLS]' at the start and '[SEP]' at the end of the string, so effectively only 510 remain for the text). Either reduce the token count, or use a model built for longer inputs, something like Longformer, as @cronoik suggested in the comments above.
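For example, a minimal sketch of the truncation fix, assuming the documents are encoded with the Hugging Face tokenizer (the original tokenization code isn't shown in the question, so documents and the other variable names are placeholders):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
    # truncation=True caps every document at max_length tokens;
    # the 512 count already includes the [CLS] and [SEP] special tokens
    encoded = tokenizer(documents, padding='max_length', truncation=True,
                        max_length=512, return_tensors='pt')
    sent_id = encoded['input_ids']      # shape: (num_docs, 512)
    mask = encoded['attention_mask']    # shape: (num_docs, 512)

Alternatively, a long-document model such as Longformer accepts up to 4096 tokens, e.g. AutoModel.from_pretrained('allenai/longformer-base-4096').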
Thanks.