

Model Decoding

Core idea: the Transformer model architecture diagram

greedy decoding
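Greedy decoding picks the single highest-scoring token (the argmax of the logits) at every step, so it is fully deterministic. A minimal sketch with a toy stand-in for the model (`toy_logits` and the function names are hypothetical, for illustration only):

```python
import numpy as np

def greedy_decode(logits_fn, start_ids, eos_id, max_new_tokens=10):
    """At each step append the argmax token; stop at eos_id."""
    ids = list(start_ids)
    for _ in range(max_new_tokens):
        logits = logits_fn(ids)           # scores over the vocabulary
        next_id = int(np.argmax(logits))  # greedy: no sampling involved
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids

# Toy "model": always favors token (last_id + 1) mod 5
def toy_logits(ids):
    v = np.zeros(5)
    v[(ids[-1] + 1) % 5] = 1.0
    return v

print(greedy_decode(toy_logits, [0], eos_id=4))  # [0, 1, 2, 3, 4]
```

Because there is no randomness, running greedy decoding twice on the same prompt always yields the same output, which is why the scripts below switch to sampling for more varied text.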



MindNLP/LLaMa3/run_llama3.py

Starting from the baseline sampling script below, we then rewrite it to use a linear congruential generator (LCG) as the decoding strategy.
import mindspore
from mindspore.communication import init
from mindnlp.transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "LLM-Research/Meta-Llama-3-8B-Instruct"

init()

tokenizer = AutoTokenizer.from_pretrained(model_id, mirror='modelscope')
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    ms_dtype=mindspore.float16,
    mirror='modelscope',
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="ms"
)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=100,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))

MindNLP/LLaMa3/run_llama3_LCG.py
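The `do_sample=True`, `temperature=0.6`, `top_p=0.9` arguments above enable temperature-scaled nucleus (top-p) sampling. A rough NumPy sketch of the idea (this is an illustrative approximation, not MindNLP's internal implementation; `sample_top_p` is a hypothetical name):

```python
import numpy as np

def sample_top_p(logits, temperature=0.6, top_p=0.9, rng=None):
    """Temperature-scaled nucleus (top-p) sampling over one logit vector."""
    rng = rng or np.random.default_rng(0)
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(scaled - scaled.max())     # stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]           # most probable first
    cum = np.cumsum(probs[order])
    keep = cum <= top_p                       # smallest set of mass <= top_p
    keep[0] = True                            # always keep the top token
    kept = order[keep]
    p = probs[kept] / probs[kept].sum()       # renormalize the nucleus
    return int(rng.choice(kept, p=p))

token = sample_top_p([2.0, 1.0, 0.1, -1.0])
```

Lower `temperature` sharpens the distribution toward the argmax, while lower `top_p` shrinks the candidate set; with both very small, sampling collapses back to greedy decoding.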
The following example builds on the code above and introduces a linear congruential generator (LCG) decoding strategy. Since the MindSpore framework has no built-in support for LCG decoding, the relevant logic has to be implemented by hand:
import mindspore
from mindspore.communication import init
from mindnlp.transformers import AutoTokenizer, AutoModelForCausalLM
import numpy as np

# LCG parameters
a = 1664525
c = 1013904223
m = 2**32

# Initial seed
seed = 1

def lcg_generator(a, c, m, seed):
    # Yields the sequence x_{n+1} = (a * x_n + c) mod m
    while True:
        seed = (a * seed + c) % m
        yield seed

# Create the LCG generator
lcg_gen = lcg_generator(a, c, m, seed)

model_id = "LLM-Research/Meta-Llama-3-8B-Instruct"

init()

tokenizer = AutoTokenizer.from_pretrained(model_id, mirror='modelscope')
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    ms_dtype=mindspore.float16,
    mirror='modelscope',
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="ms"
)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

# Length of the initial input
input_len = input_ids.shape[-1]

# Initialize the output with the prompt tokens
outputs = input_ids

# Generate text using the LCG decoding strategy
for _ in range(100):  # generate up to 100 new tokens
    # Logits for the last position of the current sequence
    logits = model(outputs).logits[:, -1, :]
    # Use the LCG output as the pseudo-random seed for this sampling step
    random_seed = next(lcg_gen)
    np.random.seed(random_seed)
    # Sample the next token from the softmax distribution
    probs = mindspore.ops.Softmax()(logits)
    probs = probs.asnumpy()[0].astype(np.float64)
    probs /= probs.sum()  # renormalize: float16 probabilities may not sum to exactly 1
    next_token_id = int(np.random.choice(len(probs), p=probs))
    # Append the sampled token to the output
    next_token = mindspore.Tensor([next_token_id], dtype=mindspore.int64).reshape(1, 1)
    outputs = mindspore.ops.Concat(axis=-1)((outputs, next_token))
    # Stop when a terminator token is produced
    if next_token_id in terminators:
        break

response = outputs[0][input_len:]
print(tokenizer.decode(response, skip_special_tokens=True))

In the code above, the LCG generator function lcg_generator is applied at every generation step: each pseudo-random number it yields seeds the sampler that picks the next token, giving a LLaMa text-generation pipeline driven by the LCG decoding strategy. Note that because sampling introduces randomness, the generated text may differ between runs.
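One practical property of the LCG approach is worth noting: because the recurrence is deterministic, the same seed reproduces the exact same stream of sampling seeds, which makes generations repeatable. A standalone sketch using the same constants as the script above:

```python
def lcg_generator(a, c, m, seed):
    # x_{n+1} = (a * x_n + c) mod m -- the same recurrence as in the script
    while True:
        seed = (a * seed + c) % m
        yield seed

a, c, m = 1664525, 1013904223, 2**32  # constants from the script above

g1 = lcg_generator(a, c, m, seed=1)
g2 = lcg_generator(a, c, m, seed=1)
first = [next(g1) for _ in range(3)]
assert first == [next(g2) for _ in range(3)]  # same seed -> identical stream
print(first[0])  # (a*1 + c) % m = 1015568748
```

So if reproducibility is the goal, fixing the initial `seed` pins down every sampling step; changing the seed gives a different but equally repeatable generation.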
Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization; reproduction without permission is prohibited.
For infringement concerns, contact cloudcommunity@tencent.com for removal.