Compare the following search queries:
SELECT
to_tsvector('yellow-green') @@ to_tsquery('yellow & green') as word_word,
to_tsvector('Apollo-11') @@ to_tsquery('apollo & 11') as word_number;
word_word | word_number
-----------+-------------
t | f
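(1 row)

They are conceptually similar, but only the first one produces a match. It is clear why this happens. With the hyphenated word pair, the parser produces three lexemes: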
SELECT * FROM ts_debug('yellow-green');
      alias      |           description           |    token     |  dictionaries  |  dictionary  |    lexemes
-----------------+---------------------------------+--------------+----------------+--------------+----------------
 asciihword      | Hyphenated word, all ASCII      | yellow-green | {english_stem} | english_stem | {yellow-green}
 hword_asciipart | Hyphenated word part, all ASCII | yellow       | {english_stem} | english_stem | {yellow}
 blank           | Space symbols                   | -            | {}             |              |
 hword_asciipart | Hyphenated word part, all ASCII | green        | {english_stem} | english_stem | {green}
(4 rows)

With the word-number pair, the parser produces two lexemes, one of which is a signed integer:
SELECT * FROM ts_debug('apollo-11');
alias | description | token | dictionaries | dictionary | lexemes
-----------+-----------------+--------+----------------+--------------+----------
asciiword | Word, all ASCII | apollo | {english_stem} | english_stem | {apollo}
int | Signed integer | -11 | {simple} | simple | {-11}
(2 rows)

Is it possible to configure to_tsvector to parse numbers the way it parses strings, so that the match is not affected by the hyphen?
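The mismatch can also be seen by looking at the two sides of the @@ operator separately. The query below is only a sketch: it passes the 'english' configuration explicitly, on the assumption that this is what default_text_search_config resolves to in the examples above.

```sql
-- Assumption: the unqualified calls above use the 'english' configuration.
-- Passing it explicitly makes the comparison independent of session settings.
SELECT
    to_tsvector('english', 'Apollo-11')  AS document_lexemes,  -- number is parsed as a signed integer
    to_tsquery('english', 'apollo & 11') AS query_lexemes;     -- number is parsed without a sign
```

Going by the ts_debug output above, the document side carries the lexeme -11 while the query side asks for 11, so the second comparison cannot match.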
Posted on 2022-12-16 08:31:46
I ended up using the dict_int extension, as mentioned in this answer on StackOverflow. It is as simple as:
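CREATE EXTENSION IF NOT EXISTS dict_int;

Set the absval option so that the dictionary converts numbers to their absolute values.

ALTER TEXT SEARCH DICTIONARY intdict (absval = true);

Note that the dictionary handles integers only up to a maximum length, which is controlled by the maxlen parameter (default 6).

Create a custom configuration and map the integer token types to intdict.

CREATE TEXT SEARCH CONFIGURATION en_custom (COPY = pg_catalog.english);
ALTER TEXT SEARCH CONFIGURATION en_custom
ALTER MAPPING FOR int, uint WITH intdict;

Use the configuration in explicit calls to to_tsvector/to_tsquery.

SELECT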
to_tsvector('en_custom', 'yellow-green') @@ to_tsquery('en_custom', 'yellow & green') as word_word,
to_tsvector('en_custom', 'Apollo-11') @@ to_tsquery('en_custom', 'apollo & 11') as word_number;
word_word | word_number
-----------+-------------
t | t
(1 row)

https://dba.stackexchange.com/questions/320848
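To check that the setup in the answer behaves as described, the dictionary and the new configuration can be inspected directly. This is a minimal sketch, assuming the intdict dictionary and the en_custom configuration were created exactly as shown above. The final SET statement is an optional extra that is not part of the original answer.

```sql
-- Ask intdict directly what it returns for a signed integer token.
-- With absval = true, the sign is expected to be dropped.
SELECT ts_lexize('intdict', '-11');

-- Confirm that the int token is now routed to intdict in en_custom.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('en_custom', 'Apollo-11');

-- Optional (assumption, not from the answer): make en_custom the session
-- default so the configuration argument can be omitted in later calls.
SET default_text_search_config = 'en_custom';
```

If the configuration is used for a stored tsvector column or an index, the same configuration name needs to be used on both the indexing side and the query side, otherwise the signed and unsigned lexemes will diverge again.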