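My problem is converting a string like this:
"a OR (b AND c)"
into
a OR bc
and, if the expression is like
"a AND (b OR c)"
then it should give
ab OR ac
I have not been able to design a correct set of loops using regex matching. The crux of the problem is that the code must be completely generic: I cannot assume the length of the string pattern, nor the exact positions of OR and AND in the pattern.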
If I enter input in this form, it should also solve this kind of expression.
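Posted 2019-01-17 08:06:11
IMO, you need a parser here, e.g. PLY. You define all the building blocks (tokens), and then you can build a syntax tree that you can use for whatever you want.
An example could be: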
import ply.lex as lex

# List of token names. This is always required
tokens = (
    'VARIABLE',
    'WHITESPACE',
    'OR',
    'AND',
    'NOT',
    'PAR_OPEN',
    'PAR_CLOSE',
)

# Regular expression rules for simple tokens
t_VARIABLE = r'\b[a-z]+\b'
t_WHITESPACE = r'\s+'
t_OR = r'\bOR\b'
t_AND = r'\bAND\b'
t_NOT = r'\bNOT\b'
t_PAR_OPEN = r'\('
t_PAR_CLOSE = r'\)'

# Error handling rule: report the offending character and skip it
def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
lexer = lex.lex()

# Feed it the sample expression and print every token
lexer.input("a OR (b AND c)")
while True:
    token = lexer.token()
    if not token:
        break
    else:
        print(token)

This yields:
LexToken(VARIABLE,'a',1,0)
LexToken(WHITESPACE,' ',1,1)
LexToken(OR,'OR',1,2)
LexToken(WHITESPACE,' ',1,4)
LexToken(PAR_OPEN,'(',1,5)
LexToken(VARIABLE,'b',1,6)
LexToken(WHITESPACE,' ',1,7)
LexToken(AND,'AND',1,8)
LexToken(WHITESPACE,' ',1,11)
LexToken(VARIABLE,'c',1,12)
LexToken(PAR_CLOSE,')',1,13)

It even works with nested parentheses, and you can then analyze the smaller parts (e.g. from PAR_OPEN to PAR_CLOSE, and so on).
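To get from tokens to the rewritten form the question asks for, one option is to parse straight into a sum of products. The following is only a minimal sketch under stated assumptions, and it is not PLY-based: it swaps in a plain `re` tokenizer and a hand-written recursive-descent parser so that it runs without extra packages. The names `tokenize`, `to_sum_of_products`, and `format_sop` are illustrative, NOT is not handled, and AND is assumed to bind tighter than OR.

```python
import re

# Tokens: parentheses, the keywords AND / OR, and lowercase variable names.
TOKEN_RE = re.compile(r"\(|\)|AND\b|OR\b|[a-z]+")

def tokenize(text):
    """Split the input into a list of token strings, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise ValueError(f"Illegal character {text[pos]!r} at position {pos}")
        tokens.append(match.group(0))
        pos = match.end()
    return tokens

# Every parse function returns a list of products; each product is a tuple
# of variable names that are ANDed together.

def parse_or(tokens):
    """or_expr := and_expr ('OR' and_expr)*"""
    terms = parse_and(tokens)
    while tokens and tokens[0] == "OR":
        tokens.pop(0)
        terms = terms + parse_and(tokens)
    return terms

def parse_and(tokens):
    """and_expr := atom ('AND' atom)*"""
    terms = parse_atom(tokens)
    while tokens and tokens[0] == "AND":
        tokens.pop(0)
        right = parse_atom(tokens)
        # Distribute AND over OR: every left product pairs with every right one.
        terms = [left + r for left in terms for r in right]
    return terms

def parse_atom(tokens):
    """atom := VARIABLE | '(' or_expr ')'"""
    if not tokens:
        raise ValueError("Unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        terms = parse_or(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("Missing closing parenthesis")
        return terms
    if token in (")", "AND", "OR"):
        raise ValueError(f"Unexpected token {token!r}")
    return [(token,)]

def to_sum_of_products(text):
    """Parse text and return its expanded form as a list of products."""
    tokens = tokenize(text)
    terms = parse_or(tokens)
    if tokens:
        raise ValueError(f"Unexpected token {tokens[0]!r}")
    return terms

def format_sop(terms):
    """Render products the way the question writes them, e.g. 'ab OR ac'."""
    return " OR ".join("".join(product) for product in terms)

print(format_sop(to_sum_of_products("a OR (b AND c)")))  # a OR bc
print(format_sop(to_sum_of_products("a AND (b OR c)")))  # ab OR ac
```

Concatenating names without a separator follows the question's notation, but it becomes ambiguous for multi-letter variables; joining each product with " AND " would avoid that.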
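Posted 2019-01-17 13:43:18
Pyparsing makes it easy to define an expression parser for this kind of notation. Think of your AND, OR, etc. keywords as operators, and your terms such as "security" as operands in an infix notation grammar; you can then use pyparsing's infixNotation grammar generator to define the parser: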
sample = "security OR ((internet OR online OR paperless) AND (bank*)) AND (mobile OR cell OR phone OR access) OR easy OR online WITHIN bank OR transaction OR mumbai OR delhi NEAR/10 agar OR (online OR internet) AND (bank) OR not OR (apple) EXCLUDE (mongo)"

import pyparsing as pp

# enable packrat parsing, since this infix notation gets more complex than usual
pp.ParserElement.enablePackrat()

# operands: words made of letters, optionally with a '*' wildcard
term = pp.Word(pp.alphas + '*')

# operators
SLASH = pp.Suppress('/')
AND = pp.Keyword("AND")
OR = pp.Keyword("OR")
WITHIN = pp.Keyword("WITHIN")
EXCLUDE = pp.Keyword("EXCLUDE")
NEAR_op = pp.Group(pp.Keyword("NEAR") + SLASH + pp.pyparsing_common.integer)

# operators are listed from highest to lowest precedence
expr = pp.infixNotation(term,
    [
        (NEAR_op, 2, pp.opAssoc.LEFT),
        (WITHIN, 2, pp.opAssoc.LEFT),
        (AND, 2, pp.opAssoc.LEFT),
        (OR, 2, pp.opAssoc.RIGHT),
        (EXCLUDE, 2, pp.opAssoc.LEFT),
    ])

expr.parseString(sample).pprint()

This prints:
[[['security',
   'OR',
   [[[['internet', 'OR', ['online', 'OR', 'paperless']], 'AND', 'bank*'],
     'AND',
     ['mobile', 'OR', ['cell', 'OR', ['phone', 'OR', 'access']]]],
    'OR',
    ['easy',
     'OR',
     [['online', 'WITHIN', 'bank'],
      'OR',
      ['transaction',
       'OR',
       ['mumbai',
        'OR',
        [['delhi', ['NEAR', 10], 'agar'],
         'OR',
         [[['online', 'OR', 'internet'], 'AND', 'bank'],
          'OR',
          ['not', 'OR', 'apple']]]]]]]]],
  'EXCLUDE',
  'mongo']]

(Disclaimer: I am the author of pyparsing.)
GitHub page: https://github.com/pyparsing/pyparsing
Docs: https://pyparsing-docs.readthedocs.io/en/latest/
Install: pip install pyparsing
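A nested list like the one pyparsing produces (for instance via `asList()` on the parse results) can be walked recursively to do the expansion the question asks about. This is a rough sketch, not part of pyparsing: `expand` is a hypothetical helper that works on plain Python lists, supports only AND and OR, and assumes each list level holds operators of a single precedence level, which is how infixNotation groups its results.

```python
def expand(tree):
    """Return a list of products (tuples of terms) equivalent to the nested tree."""
    if isinstance(tree, str):
        return [(tree,)]
    result = expand(tree[0])
    # Operators sit at odd indices, operands at even indices.
    for op, operand in zip(tree[1::2], tree[2::2]):
        right = expand(operand)
        if op == "OR":
            # OR simply collects the alternatives from both sides.
            result = result + right
        elif op == "AND":
            # AND distributes: pair every left product with every right product.
            result = [left + r for left in result for r in right]
        else:
            raise ValueError(f"Unsupported operator: {op!r}")
    return result

tree = ["a", "AND", ["b", "OR", "c"]]
print(" OR ".join(" AND ".join(product) for product in expand(tree)))
# a AND b OR a AND c
```

Operators such as WITHIN, NEAR, or EXCLUDE have no obvious distributive rule, so the sketch raises an error for them rather than guessing.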
https://stackoverflow.com/questions/54230139