
Building an ensemble of five transformer models

Stack Overflow user
Asked on 2022-02-01 09:58:51
Answers: 1 · Views: 711 · Followers: 0 · Votes: 0

I am trying to build an ensemble model that uses five transformer architectures for my text-classification task. But since I am not well versed in Python concepts, I have to ask for help. My model so far is as follows:

class BERTClassA(torch.nn.Module):
  def __init__(self):
    super(BERTClassA, self).__init__()
    self.l1 = BertModel.from_pretrained('bert-base-uncased')
    self.pre_classifier = torch.nn.Linear(768, 768)
    self.dropout = torch.nn.Dropout(.3)
  
  def forward(self, input_ids, attention_mask):
    output_1 = self.l1(input_ids=input_ids, attention_mask=attention_mask)
    hidden_state = output_1[0]
    pooler = hidden_state[:, 0]
    pooler = self.pre_classifier(pooler)
    pooler = torch.nn.ReLU()(pooler)
    output = self.dropout(pooler)
    return output

class BERTClassB(torch.nn.Module):
  def __init__(self):
    super(BERTClassB, self).__init__()
    self.l2 = TFRobertaModel.from_pretrained('roberta-base')
    self.pre_classifier = torch.nn.Linear(768, 768)
    self.dropout = torch.nn.Dropout(.3)

  def forward(self, input_ids, attention_mask):
    output_2 = self.l2(input_ids=input_ids, attention_mask=attention_mask)
    hidden_state = output_2[0]
    pooler = hidden_state[:, 0]
    pooler = self.pre_classifier(pooler)
    pooler = torch.nn.ReLU()(pooler)
    output = self.dropout(pooler)
    return output

class BERTClassC(torch.nn.Module):
  def __init__(self):
    super(BERTClassC, self).__init__()
    self.l3 = XLNetForSequenceClassification.from_pretrained('xlnet-base-cased', num_labels = 2)
    self.pre_classifier = torch.nn.Linear(768, 768)
    self.dropout = torch.nn.Dropout(.3)

  def forward(self, input_ids, attention_mask):
    output_3 = self.l3(input_ids=input_ids, attention_mask=attention_mask)
    hidden_state = output_3[0]
    pooler = hidden_state[:, 0]
    pooler = self.pre_classifier(pooler)
    pooler = torch.nn.ReLU()(pooler)
    output = self.dropout(pooler)
    return output


class BERTClassD(torch.nn.Module):
  def __init__(self):
    super(BERTClassD, self).__init__()
    self.l4 = DistilBertModel.from_pretrained('distilbert-base-uncased', output_hidden_states=True)
    self.pre_classifier = torch.nn.Linear(768, 768)
    self.dropout = torch.nn.Dropout(.3)

  def forward(self, input_ids, attention_mask):
    output_4 = self.l4(input_ids=input_ids, attention_mask=attention_mask)
    hidden_state = output_4[0]
    pooler = hidden_state[:, 0]
    pooler = self.pre_classifier(pooler)
    pooler = torch.nn.ReLU()(pooler)
    output = self.dropout(pooler)
    return output




class BERTClassE(torch.nn.Module):
  def __init__(self):
    super(BERTClassE, self).__init__()
    self.l5 = ElectraForSequenceClassification.from_pretrained('google/electra-base-discriminator', num_labels=2, return_dict=True)
    self.pre_classifier = torch.nn.Linear(768, 768)
    self.dropout = torch.nn.Dropout(.3)

  def forward(self, input_ids, attention_mask):
    output_5 = self.l5(input_ids=input_ids, attention_mask=attention_mask)
    hidden_state = output_5[0]
    pooler = hidden_state[:, 0]
    pooler = self.pre_classifier(pooler)
    pooler = torch.nn.ReLU()(pooler)
    output = self.dropout(pooler)
    return output

I would like to combine all of the classes above with the class below:

class MyEnsemble(torch.nn.Module):
  def __init__(self, modelA, modelB, modelC, modelD, modelE):
    super(MyEnsemble, self).__init__()
    self.modelA = modelA
    self.modelB = modelB
    self.modelC = modelC
    self.modelD = modelD
    self.modelE = modelE
    self.classifier = torch.nn.Linear(768, 2)

  def forward(self, x1, x2, x3, x4, x5):
    x1 = self.modelA(x1)
    x2 = self.modelB(x2)
    x3 = self.modelC(x3)
    x4 = self.modelD(x4)
    x5 = self.modelE(x5)
    x = torch.cat((x1, x2, x3, x4, x5), dim=1)
    x = self.classifier(F.relu(x))
    return x

The problem is that when I run the training epoch, the following error occurs:

forward() missing 3 required positional arguments: 'x3', 'x4' and 'x5'

My training-epoch function is as follows:

def train_epoch(
  model, 
  data_loader, 
  loss_fn, 
  optimizer, 
  device, 
  scheduler, 
  n_examples
):
  model = model.train()

  losses = []
  correct_predictions = 0


  for d in data_loader:
    input_ids = d["input_ids"].to(device)
    attention_mask = d["attention_mask"].to(device)
    targets = d["targets"].to(device)

    outputs = model(
      input_ids,
      attention_mask
    )

    _, preds = torch.max(outputs, dim=1)
    loss = loss_fn(outputs, targets)

    correct_predictions += torch.sum(preds == targets)
    losses.append(loss.item())

    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()

  return correct_predictions.double() / n_examples, np.mean(losses)

Before calling the MyEnsemble() class, I run the following:

modelA = BERTClassA()
modelB = BERTClassB()
modelC = BERTClassC()
modelD = BERTClassD()
modelE = BERTClassE()

Finally, the model is set up as follows:

model = MyEnsemble(modelA, modelB, modelC, modelD, modelE)
model.to(device)

Do you see anything wrong with my code?


1 Answer

Stack Overflow user

Answered on 2022-02-02 13:52:07

The problem is that you call MyEnsemble with only two arguments (quoting your code):

outputs = model(
  input_ids,
  attention_mask
)

You have to rewrite the ensemble's forward() function so that it arranges/splits the required inputs among the models and feeds each of them correctly.
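One way to do this is sketched below, under the assumption that all five sub-models can consume the same tokenised batch (in practice each checkpoint has its own tokenizer, so you would more likely pass one `(input_ids, attention_mask)` pair per model). The ensemble's forward() then accepts the same two tensors your training loop already provides and fans them out to every sub-model. Note that concatenating five 768-dimensional outputs gives 5 * 768 features, so the classifier's input width must match:

```python
import torch
import torch.nn.functional as F

class MyEnsemble(torch.nn.Module):
  def __init__(self, modelA, modelB, modelC, modelD, modelE):
    super(MyEnsemble, self).__init__()
    # ModuleList registers every sub-model's parameters with the ensemble
    self.models = torch.nn.ModuleList([modelA, modelB, modelC, modelD, modelE])
    # concatenating five 768-dim vectors yields 5 * 768 input features
    self.classifier = torch.nn.Linear(5 * 768, 2)

  def forward(self, input_ids, attention_mask):
    # fan the same batch out to every sub-model (assumes shared tokenisation)
    outs = [m(input_ids, attention_mask) for m in self.models]
    x = torch.cat(outs, dim=1)  # shape: (batch_size, 5 * 768)
    return self.classifier(F.relu(x))
```

With this signature, the existing `outputs = model(input_ids, attention_mask)` call in train_epoch works unchanged; if you instead keep separate, per-model tokenised inputs, the data loader and training loop must load and pass all five pairs.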

Votes: 1
Original page content provided by Stack Overflow; translation supported by Tencent Cloud's IT-domain engine.
Original link:

https://stackoverflow.com/questions/70938579
