
Q: TensorFlow Recommenders: passing an embedding as the query to ScaNN

Stack Overflow user

Asked 2022-11-21 01:26:21

1 answer, 13 views, 0 followers, 0 votes

I want to pass a query embedding to ScaNN instead of a model. What data type should I use for this?

My query would look something like [1, 0.3, 0.4], and my candidate embeddings would look something like: [0.2, 1, .4, 0.3, 0.1, 0.56]

All the examples I have seen pass a query model, not the embedding itself.

I tried passing a numpy array, but it did not work.

tensorflow2.0

scann

1 Answer

Stack Overflow user

Answered 2022-11-21 01:55:27

Embeddings are just the vectors that your model produces. In this example, a tf.keras.layers.Embedding layer is used:

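import tensorflow as tf

# Fragment from inside a model's __init__; str_features, vocabularies and
# self.embedding_dimension are assumed to be defined by the surrounding code.
self._embeddings = {}
# Compute embeddings for string features
for feature_name in str_features:
    vocabulary = vocabularies[feature_name]
    self._embeddings[feature_name] = tf.keras.Sequential([
        # Map each string to an integer index (index 0 is used for unknown strings)
        tf.keras.layers.StringLookup(
            vocabulary=vocabulary, mask_token=None),
        # One trainable vector per index; +1 accounts for the unknown-string slot
        tf.keras.layers.Embedding(len(vocabulary) + 1,
                                  self.embedding_dimension)
    ])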
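To make the lookup concrete, here is a minimal NumPy sketch of what a string lookup followed by an embedding table does. It is only an illustration, not the Keras implementation: the vocabulary, the dimension and the random table values are made up, and in a real model the table is learned during training.

```python
import numpy as np

# Hypothetical vocabulary and embedding size, for illustration only.
vocabulary = ["action", "comedy", "drama"]
embedding_dimension = 4

# Index 0 is reserved for out-of-vocabulary strings, so known tokens start at 1.
# This is why the table needs len(vocabulary) + 1 rows.
token_to_index = {token: i + 1 for i, token in enumerate(vocabulary)}

# Random values stand in for trained weights.
rng = np.random.default_rng(0)
table = rng.normal(size=(len(vocabulary) + 1, embedding_dimension)).astype(np.float32)

def embed(tokens):
    """Return one embedding row per token; unknown tokens map to row 0."""
    indices = np.array([token_to_index.get(token, 0) for token in tokens])
    return table[indices]

vectors = embed(["drama", "unknown"])
print(vectors.shape)  # one row per input token
```

The result is an ordinary 2-D float array, which is the same kind of object you would later hand to a nearest-neighbour index.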

You can also use another model (such as Sentence Transformers) to create the embeddings.

from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings)

You don't need to pass a model to ScaNN. You can pass the embeddings to it directly, as described in the ScaNN documentation.
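As a rough sketch of the shapes involved, the following uses plain NumPy to do the exact dot-product search that ScaNN approximates. It is not ScaNN's API. It assumes the candidate list from the question is two 3-dimensional vectors (the question shows it as a flat list), and it uses float32 on the assumption that your embeddings are stored that way. The helper name exact_top_k is made up for this example.

```python
import numpy as np

# Candidates: one row per candidate, shape (num_candidates, dim).
# Assumption: the six numbers in the question form two 3-dimensional vectors.
candidates = np.array([[0.2, 1.0, 0.4],
                       [0.3, 0.1, 0.56]], dtype=np.float32)

# A single query embedding: shape (dim,), same dim and dtype as the candidates.
query = np.array([1.0, 0.3, 0.4], dtype=np.float32)

def exact_top_k(candidates, query, k):
    """Brute-force maximum inner-product search (hypothetical helper)."""
    scores = candidates @ query          # one dot product per candidate
    order = np.argsort(-scores)[:k]      # indices of the k highest scores
    return order, scores[order]

indices, scores = exact_top_k(candidates, query, k=2)
print(indices, scores)
```

A common reason an array is rejected or gives odd results is a mismatch in dimension or dtype between the query and the indexed candidates, so checking query.shape and query.dtype against the candidates is a cheap first step.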

Here is a sample code snippet showing how to pass embeddings directly to scann:

import numpy as np
import pandas as pd
import scann
from sklearn import preprocessing

df = pd.read_csv("./data/mydata.csv")

# Normalize each row (skipping the first column) and convert to float32.
# norm="l2" is an assumption; the snippet does not say which norm was intended.
df_np = preprocessing.normalize(df.iloc[:, 1:], norm="l2").astype(np.float32)

num_neighbors = 100

# Create the searcher, using roughly sqrt(n) partitions
k = int(np.sqrt(df_np.shape[0]))
searcher = scann.scann_ops_pybind.builder(df_np, num_neighbors, "dot_product").tree(
    num_leaves=k,
    num_leaves_to_search=int(k / 20),
    training_sample_size=2500).score_brute_force(2).reorder(7).build()
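The two preparation steps in that snippet can be checked on their own with NumPy. This sketch uses random data in place of the CSV (an assumption made only so the example runs anywhere) and shows that after L2 normalization the dot product of two rows equals their cosine similarity, and how the sqrt(n) rule turns into concrete partition counts.

```python
import numpy as np

def l2_normalize(rows):
    """Scale each row to unit length, guarding against zero rows."""
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    return rows / np.maximum(norms, 1e-12)

# Random stand-in for the embeddings loaded from the CSV.
rng = np.random.default_rng(0)
raw = rng.normal(size=(10_000, 8)).astype(np.float32)
unit = l2_normalize(raw)

# For unit-length vectors, dot product and cosine similarity coincide.
a, b = raw[0], raw[1]
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
dot_of_units = float(unit[0] @ unit[1])

# Partition counts following the sqrt(n) heuristic from the snippet.
num_leaves = int(np.sqrt(unit.shape[0]))
num_leaves_to_search = max(1, num_leaves // 20)
print(num_leaves, num_leaves_to_search)
```

The max(1, ...) guard is an addition for small datasets, where dividing by 20 would otherwise round down to zero partitions searched.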

Here is a blog post about using ScaNN:

ScaNN Optimization and Configuration

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/74513321
