I am trying to analyze documents in Elasticsearch with the Smart Chinese (smartcn) analyzer, but instead of the analyzed Chinese tokens, Elasticsearch returns the Unicode code points of those characters. For example:
PUT /test_chinese
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"default": {
"type": "smartcn"
}
}
}
}
}
}
GET /test_chinese/_analyze?text='我说世界好!'

I expected to get each Chinese character back, but instead I got:
{
"tokens": [
{
"token": "25105",
"start_offset": 3,
"end_offset": 8,
"type": "word",
"position": 4
},
{
"token": "35828",
"start_offset": 11,
"end_offset": 16,
"type": "word",
"position": 8
},
{
"token": "19990",
"start_offset": 19,
"end_offset": 24,
"type": "word",
"position": 12
},
{
"token": "30028",
"start_offset": 27,
"end_offset": 32,
"type": "word",
"position": 16
},
{
"token": "22909",
"start_offset": 35,
"end_offset": 40,
"type": "word",
"position": 20
}
]
}

Do you know what is going on?
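Note that the token values in the response are exactly the decimal Unicode code points of the five characters in the input string, which suggests the characters were escaped somewhere along the way rather than mis-analyzed. A quick check in Python:

```python
# Map each character of the input to its decimal Unicode code point.
# These match the "token" values returned above.
for ch in "我说世界好":
    print(ch, ord(ch))
# 我 25105
# 说 35828
# 世 19990
# 界 30028
# 好 22909
```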
Thanks!
Posted on 2015-12-15 22:12:49
I found the answer to my question. It turns out there is a bug in Sense. Here you can find the discussion with Zachary Tong, an Elasticsearch developer: https://discuss.elastic.co/t/smart-chinese-analysis-returns-unicodes-instead-of-chinese-tokens/37133 And here is the ticket for the bug that was found: https://github.com/elastic/sense/issues/88
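Since the bug concerns how Sense encodes the text passed in the URL, a possible workaround (a sketch; the exact body-based _analyze syntax depends on your Elasticsearch version) is to send the text in a JSON request body instead of a query-string parameter:

```
GET /test_chinese/_analyze
{
  "text": "我说世界好!"
}
```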
https://stackoverflow.com/questions/34266236