With the HuggingFace transformers library, how can I return multiple samples when generating text?
I'm working off of https://github.com/cortexlabs/cortex/blob/master/examples/pytorch/text-generator/predictor.py, but if I pass num_samples=5, I get:
generated = torch.cat((generated, next_token.unsqueeze(0)), dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 5 and 1 in dimension 0
The code is:
def sample_sequence(
    model,
    length,
    context,
    num_samples=1,
    temperature=1,
    top_k=0,
    top_p=0.9,
    repetition_penalty=1.0,
    device="cpu",
):
    context = torch.tensor(context, dtype=torch.long, device=device)
    context = context.unsqueeze(0).repeat(num_samples, 1)
    print('context_size', context.shape)
    generated = context
    print('context', context)
    with torch.no_grad():
        for _ in trange(length):
            inputs = {"input_ids": generated}
            outputs = model(
                **inputs
            )  # Note: we could also use 'past' with GPT-2/Transfo-XL/XLNet/CTRL (cached hidden-states)
            next_token_logits = outputs[0][0, -1, :] / (temperature if temperature > 0 else 1.0)
            # repetition penalty from CTRL (https://arxiv.org/abs/1909.05858)
            for _ in set(generated.view(-1).tolist()):
                next_token_logits[_] /= repetition_penalty
            filtered_logits = top_k_top_p_filtering(next_token_logits, top_k=top_k, top_p=top_p)
            if temperature == 0:  # greedy sampling:
                next_token = torch.argmax(filtered_logits).unsqueeze(0)
            else:
                next_token = torch.multinomial(F.softmax(filtered_logits, dim=-1), num_samples=1)
            generated = torch.cat((generated, next_token.unsqueeze(0)), dim=1)
    return generated
As far as I can tell, this code does not return multiple samples as written, but you can adapt it with a few tweaks.
This line already uses torch.multinomial, but only returns 1 sample:
next_token = torch.multinomial(F.softmax(filtered_logits, dim=-1), num_samples=1)
Change it to:
next_token = torch.multinomial(F.softmax(filtered_logits, dim=-1), num_samples=num_samples)
Now you also need to change how the result is constructed. This line concatenates the next_token with the sentence. Since you now get num_samples next_tokens per step, they need to be unsqueezed along a different dimension:
generated = torch.cat((generated, next_token.unsqueeze(0)), dim=1)
Change it to:
generated = torch.cat((generated, next_token.unsqueeze(1)), dim=1)
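To see why the unsqueeze dimension changes, here is a minimal, self-contained shape check (the sizes are made up for illustration):

import torch

num_samples, seq_len, vocab_size = 5, 8, 100
generated = torch.zeros(num_samples, seq_len, dtype=torch.long)  # (5, 8)
probs = torch.rand(vocab_size).softmax(dim=-1)                   # 1-D distribution over the vocab
next_token = torch.multinomial(probs, num_samples=num_samples)   # shape (5,)
# next_token.unsqueeze(0) would be (1, 5), which cannot be concatenated to (5, 8) along dim=1.
# next_token.unsqueeze(1) is (5, 1): one new column, i.e. one token per sample.
generated = torch.cat((generated, next_token.unsqueeze(1)), dim=1)
print(generated.shape)  # torch.Size([5, 9])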
The whole function should now look like this:
def sample_sequence(
    model,
    length,
    context,
    num_samples=1,
    temperature=1,
    top_k=0,
    top_p=0.9,
    repetition_penalty=1.0,
    device="cpu",
):
    context = torch.tensor(context, dtype=torch.long, device=device)
    context = context.unsqueeze(0).repeat(num_samples, 1)
    generated = context
    with torch.no_grad():
        for _ in trange(length):
            inputs = {"input_ids": generated}
            outputs = model(
                **inputs
            )  # Note: we could also use 'past' with GPT-2/Transfo-XL/XLNet/CTRL (cached hidden-states)
            next_token_logits = outputs[0][0, -1, :] / (temperature if temperature > 0 else 1.0)
            # repetition penalty from CTRL (https://arxiv.org/abs/1909.05858)
            for _ in set(generated.view(-1).tolist()):
                next_token_logits[_] /= repetition_penalty
            filtered_logits = top_k_top_p_filtering(next_token_logits, top_k=top_k, top_p=top_p)
            if temperature == 0:  # greedy sampling:
                next_token = torch.argmax(filtered_logits).unsqueeze(0)
            else:
                next_token = torch.multinomial(F.softmax(filtered_logits, dim=-1), num_samples=num_samples)
            generated = torch.cat((generated, next_token.unsqueeze(1)), dim=1)
    return generated
Last but not least, you have to change the tokenizer.decode call to tokenizer.batch_decode, since the return value now contains multiple samples:
tokenizer.batch_decode(output.tolist(), clean_up_tokenization_spaces=True, skip_special_tokens=True)
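Putting it together, a call site could look roughly like this (a sketch, assuming a GPT-2 checkpoint and that top_k_top_p_filtering is defined as in the linked predictor.py; the prompt and length are arbitrary):

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = tokenizer.encode("The meaning of life is")
output = sample_sequence(model, length=20, context=context, num_samples=5)
texts = tokenizer.batch_decode(
    output.tolist(), clean_up_tokenization_spaces=True, skip_special_tokens=True
)
for text in texts:
    print(text)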
Something you have to think about yourself is what you want to do when there is no valid next_token. Currently you will receive an error message like:
RuntimeError: invalid multinomial distribution (with replacement=False, not enough non-negative category to sample)
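One way to guard against that (my own sketch, not part of the linked code) is to count how many tokens survive the top-k/top-p filtering and fall back to sampling with replacement when there are fewer than num_samples:

probs = F.softmax(filtered_logits, dim=-1)
num_valid = int((probs > 0).sum().item())
if num_valid >= num_samples:
    next_token = torch.multinomial(probs, num_samples=num_samples)
else:
    # Fewer distinct candidates than samples requested: sample with
    # replacement so torch.multinomial cannot raise.
    next_token = torch.multinomial(probs, num_samples=num_samples, replacement=True)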
Another thing you have to consider is whether their code is correct in the first place. In the few tests I ran, it felt like the quality of the generated sentences decreased as num_samples grew (i.e., the quality might be better when you call sample_sequence multiple times in a simple loop instead, as sketched below?). I haven't worked with GPT-2 myself, so I can't help you further there.
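If you want to try that loop-based alternative, a minimal sketch (the variable names are my own) could be:

samples = []
for _ in range(5):
    out = sample_sequence(model, length=20, context=context, num_samples=1)
    # out has shape (1, seq_len); decode the single row.
    samples.append(
        tokenizer.decode(out[0].tolist(), clean_up_tokenization_spaces=True, skip_special_tokens=True)
    )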