
Converting a BERT model to TFLite

Stack Overflow user
Asked on 2020-04-01 17:37:40
3 answers · 2.3K views · 0 followers · 3 votes

I have the code below for a semantic search engine built on a pre-trained BERT model. I want to convert this model to TFLite so I can deploy it to Google's ML Kit, but I don't know where to start. Is such a conversion even possible? It seems like it should be, since the conversion workflow is described on the official TensorFlow site: https://www.tensorflow.org/lite/convert.

Code:

import scipy.spatial  # needed for the cosine-distance computation below
from sentence_transformers import SentenceTransformer

# Load the BERT model. Various models trained on Natural Language Inference (NLI) https://github.com/UKPLab/sentence-transformers/blob/master/docs/pretrained-models/nli-models.md and 
# Semantic Textual Similarity are available https://github.com/UKPLab/sentence-transformers/blob/master/docs/pretrained-models/sts-models.md

model = SentenceTransformer('bert-base-nli-mean-tokens')

# A corpus is a list with documents split by sentences.

sentences = ['Absence of sanity', 
             'Lack of saneness',
             'A man is eating food.',
             'A man is eating a piece of bread.',
             'The girl is carrying a baby.',
             'A man is riding a horse.',
             'A woman is playing violin.',
             'Two men pushed carts through the woods.',
             'A man is riding a white horse on an enclosed ground.',
             'A monkey is playing drums.',
             'A cheetah is running behind its prey.']

# Each sentence is encoded as a 1-D vector with 768 dimensions (bert-base)
sentence_embeddings = model.encode(sentences)

print('Sample BERT embedding vector - length', len(sentence_embeddings[0]))

print('Sample BERT embedding vector - note includes negative values', sentence_embeddings[0])

#@title Semantic Search Form

# code adapted from https://github.com/UKPLab/sentence-transformers/blob/master/examples/application_semantic_search.py

query = 'Nobody has sane thoughts' #@param {type: 'string'}

queries = [query]
query_embeddings = model.encode(queries)

# Find the closest 3 sentences of the corpus for each query sentence based on cosine similarity
number_top_matches = 3 #@param {type: "number"}

print("Semantic Search Results")

for query, query_embedding in zip(queries, query_embeddings):
    distances = scipy.spatial.distance.cdist([query_embedding], sentence_embeddings, "cosine")[0]

    results = zip(range(len(distances)), distances)
    results = sorted(results, key=lambda x: x[1])

    print("\n\n======================\n\n")
    print("Query:", query)
    print("\nTop", number_top_matches, "most similar sentences in corpus:")

    for idx, distance in results[0:number_top_matches]:
        print(sentences[idx].strip(), "(Cosine Score: %.4f)" % (1-distance))
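The ranking step in the loop above is just cosine distance between embedding vectors. Here is a minimal self-contained sketch of that step, using small made-up vectors in place of the real 768-dimensional BERT embeddings (the numbers are purely illustrative):

```python
import numpy as np

# Toy stand-ins for BERT sentence embeddings; real vectors are 768-dim.
sentences = ['Absence of sanity',
             'A man is eating food.',
             'A monkey is playing drums.']
sentence_embeddings = np.array([[1.0, 0.0, 0.2, 0.1],
                                [0.0, 1.0, 0.1, 0.3],
                                [0.1, 0.2, 1.0, 0.0]])

def cosine_distances(query, corpus):
    # Equivalent to scipy.spatial.distance.cdist([query], corpus, "cosine")[0]
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return 1.0 - c @ q

# A query pointing in nearly the same direction as the first toy vector.
query_embedding = np.array([0.9, 0.1, 0.2, 0.1])
distances = cosine_distances(query_embedding, sentence_embeddings)

# Sort sentence indices by ascending distance, as the loop above does.
ranked = sorted(zip(range(len(distances)), distances), key=lambda x: x[1])
top_match = sentences[ranked[0][0]]  # -> 'Absence of sanity'
```

This is exactly what would need to run on-device after the model produces an embedding, which is relevant to the conversion question below.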

3 Answers

Stack Overflow user

Answered on 2020-04-02 16:11:48

First, you need your model written in TensorFlow; the package you are using is implemented in PyTorch. Hugging Face's Transformers library has TensorFlow versions of these models that you can start from. They also provide TFLite-ready models for Android.

In general, you first need a TensorFlow model. Save it in the SavedModel format:

import tensorflow as tf

tf.saved_model.save(pretrained_model, "/tmp/pretrained-bert/1/")

You can then run the TFLite converter on that SavedModel.
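Concretely, the save-then-convert path could look like the sketch below. A tiny `tf.Module` stands in for the real pretrained model here, since a TensorFlow BERT model would go through the identical conversion steps once saved:

```python
import tensorflow as tf

# Tiny stand-in for a real pretrained TensorFlow model.
class TinyModel(tf.Module):
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.ones([8, 2]))

    @tf.function(input_signature=[tf.TensorSpec([None, 8], tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)

pretrained_model = TinyModel()

# 1. Export to the SavedModel format, as in the answer above.
tf.saved_model.save(pretrained_model, "/tmp/pretrained-bert/1/")

# 2. Run the TFLite converter on the SavedModel directory.
converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/pretrained-bert/1/")
tflite_model = converter.convert()

# 3. Write the flatbuffer out for on-device deployment.
with open("/tmp/model.tflite", "wb") as f:
    f.write(tflite_model)
```

The result of `converter.convert()` is the serialized `.tflite` flatbuffer as bytes; real BERT graphs may additionally need converter options (e.g. op sets or quantization) depending on which ops they use.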

Votes: 0

Stack Overflow user

Answered on 2020-04-13 23:18:48

Have you tried running the conversion tool (tflite_convert)? What problems did you run into?

By the way, you may want to look at the TFLite team's question-answering example that uses a BERT model: https://github.com/tensorflow/examples/tree/master/lite/examples/bert_qa/android
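For reference, a basic `tflite_convert` invocation looks like the following; the paths are placeholders and assume the model was already exported to a SavedModel directory as described in the other answer:

```shell
# Convert a SavedModel directory into a .tflite flatbuffer.
tflite_convert \
  --saved_model_dir=/tmp/pretrained-bert/1/ \
  --output_file=/tmp/model.tflite
```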

Votes: 0

Stack Overflow user

Answered on 2020-08-06 23:12:51

I could not find anything about using a BERT model on a mobile device to compute document embeddings and run a k-nearest-documents search, as in your example. It is probably not a good idea anyway: BERT models are expensive to execute and have a huge number of parameters, so the model file is large (400 MB+).

However, you can now use BERT and MobileBERT for text classification and question answering on mobile. You could start from their demo app, which interfaces with a MobileBERT TFLite model, as 迅开 mentioned. I believe there will be better support for your use case in the near future.

Votes: 0
Original content provided by Stack Overflow; translation support by Tencent Cloud.
Original link: https://stackoverflow.com/questions/60967842