Q: Regular expression to match words in any order, where the words can be optional

Stack Overflow user

Asked 2022-02-22 00:13:24

Answers: 1, Views: 130, Followers: 0, Votes: 1

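I have searched for a long time and could not find a regular expression that meets my needs.

I have multiple lines of text like the following: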

male positive average
average negative female
good negative female
female bad
male average
male
female
...
...

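In the example above there are three sets of words: (male, female), (good, average, bad) and (positive, negative).

I want to capture each set of words in a named group: gender, quality and feedback.

The closest I have come is: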

(?=.*(?P<gender>\b(fe)?male\b))(?=.*(?P<quality>(green|amber|red)))(?=.*(?P<feedback>(positive|negative))).*

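This matches the groups gender, quality and feedback in any order.

However, it does not match, or create named groups for, lines of the following kinds: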

female green
positive male
positive female
female bad
male average
male
female

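Note: the gender word (male, female) is common to every line. Also, for simplicity only three groups are mentioned here; depending on requirements there may be more.

Any help would be greatly appreciated.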

regex-group

python

regex

1 Answer

Stack Overflow user

Accepted answer

Posted 2022-02-22 03:06:14

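You need to anchor the regular expression at the beginning of the line (^) and make each positive lookahead that contains a named capture group optional.

Also, you have some numbered capture groups that could be non-capturing groups, which would be less confusing given that you are only interested in the named capture groups. Finally, you are missing some word boundaries.

I suggest you change your expression to the following.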

^(?=.*(?P<gender>\b(?:fe)?male\b))?(?=.*(?P<quality>\b(?:green|amber|red)\b))?(?=.*(?P<feedback>\b(?:positive|negative)\b))?.*
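
As a minimal sketch of how the suggested expression might be applied in Python (the question is tagged python), the snippet below compiles it with the standard `re` module and reads the named groups through `groupdict()`. The names `PATTERN` and `parse_line` are illustrative and not part of the original answer. A group whose word does not occur in the line comes back as `None`.

```python
import re

# The expression suggested above, split across string literals for readability.
PATTERN = re.compile(
    r"^(?=.*(?P<gender>\b(?:fe)?male\b))?"
    r"(?=.*(?P<quality>\b(?:green|amber|red)\b))?"
    r"(?=.*(?P<feedback>\b(?:positive|negative)\b))?.*"
)

def parse_line(line):
    """Return a dict of the named groups; groups with no word present map to None."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None

# A few of the lines from the question that the original expression rejected.
for line in ["female green", "positive male", "male"]:
    print(line, "->", parse_line(line))
```

Because every lookahead is optional, a line that contains none of the words still matches; all three groups are then `None`.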


The regular expression can be broken down as follows.

^                         # match beginning of line
(?=                       # begin positive lookahead
  .*                      # match zero or more characters
  (?P<gender>             # begin named capture group 'gender'
    \b                    # match a word boundary
    (?:fe)?male           # match 'male', optionally preceded by 'fe'
    \b                    # match a word boundary
  )                       # end capture group 'gender'
)?                        # end positive lookahead and make it optional

(?=                       # begin positive lookahead
  .*                      # match zero or more characters
  (?P<quality>            # begin named capture group 'quality'
    \b                    # match a word boundary
    (?:green|amber|red)   # match one of the three words
    \b                    # match a word boundary
  )                       # end named capture group 'quality'
)?                        # end positive lookahead and make it optional

(?=                       # begin positive lookahead
  .*                      # match zero or more characters
  (?P<feedback>           # begin named capture group 'feedback'    
    \b                    # match a word boundary
    (?:positive|negative) # match one of the two words
    \b                    # match a word boundary
  )                       # end named capture group 'feedback'
)?                        # end positive lookahead and make it optional
.*                        # match zero or more characters (the line)
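
The question notes that more word groups may be added later. One way to keep the expression manageable, sketched here under the assumption that each group is just a list of literal alternatives, is to generate one optional lookahead per group from a mapping. `WORD_GROUPS` and `build_pattern` are hypothetical names introduced for illustration; they do not appear in the original answer.

```python
import re

# Hypothetical mapping of group name -> allowed words; extend as needed.
WORD_GROUPS = {
    "gender": ["female", "male"],
    "quality": ["green", "amber", "red"],
    "feedback": ["positive", "negative"],
}

def build_pattern(groups):
    """Build '^', then one optional lookahead per group, then '.*'."""
    lookaheads = []
    for name, words in groups.items():
        # Escape each word so it is always treated as a literal.
        alternatives = "|".join(re.escape(word) for word in words)
        lookaheads.append(rf"(?=.*(?P<{name}>\b(?:{alternatives})\b))?")
    return re.compile("^" + "".join(lookaheads) + ".*")

pattern = build_pattern(WORD_GROUPS)
print(pattern.match("amber negative female").groupdict())
```

Adding a fourth set of words then only requires a new entry in the mapping; the group name must be a valid Python identifier because it is used as the capture group name.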

Votes: 2

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/71214482
