Combining regular expressions in Python

Stack Overflow user
Asked 2016-06-15 08:45:24
Answers 3 · Views 89 · Followers 0 · Votes 0

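I am new to regular expressions.

I am trying to get the list of services that are up or down from the output of the svstat command.

Sample svstat output: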

/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up

So at the moment I need two regexes to pick out the services that are up and the services that are down.

Sample regex-1, for UP:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s\(pid\s(?P<pid>\d+)\)\s(?P<seconds>\d+)

Output of regex-1:

Match 1
status -> up
service_name -> worker-test-1
pid -> 1234
seconds -> 97381

Match 2
status -> up
service_name -> worker-test-2
pid -> 4567
seconds -> 92233

Match 3
status -> up
service_name -> worker-test-3
pid -> 8910
seconds -> 97381

Sample regex-2, for DOWN:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s(?P<seconds>\d+)

Output of regex-2:

Match 1
status -> down
service_name -> worker-test-4
seconds -> 9

Match 2
status -> down
service_name -> worker-test-5
seconds -> 9

Match 3
status -> down
service_name -> worker-test-6
seconds -> 9

The question is: how can I match both the up and the down lines with a single regex?

By the way, I am using http://pythex.org/ to create and test these regexes.

3 Answers

Stack Overflow user

Accepted answer

Answered 2016-06-15 09:01:48

You can wrap the pid part in an optional non-capturing group:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)(?:\s\(pid\s(?P<pid>\d+)\))?\s(?P<seconds>\d+)

If the service is down, the pid group simply does not participate in the match, so pid comes back as None. See the Regex101 demo.
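As a minimal sketch of how the combined pattern behaves in Python's `re` module, the snippet below runs it over two of the sample lines from the question. The names `PATTERN`, `SAMPLE`, and `parse_svstat` are illustrative and not part of the original answer; the values come back as strings (or None), since no type conversion is done here.

```python
import re

# Combined pattern from the answer above; the "(pid N)" part is optional.
PATTERN = re.compile(
    r"/etc/service/(?P<service_name>.+):\s(?P<status>up|down)"
    r"(?:\s\(pid\s(?P<pid>\d+)\))?\s(?P<seconds>\d+)"
)

# Two lines taken from the sample svstat output in the question.
SAMPLE = """\
/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
"""


def parse_svstat(text):
    """Return one dict per matching line; 'pid' is None when the group is absent."""
    return [match.groupdict() for match in PATTERN.finditer(text)]


for row in parse_svstat(SAMPLE):
    print(row)
```

For the down line, `groupdict()` reports `pid` as None because the optional group never matched, so callers can test `row["pid"] is None` rather than keeping a second pattern around.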

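Votes 1

Stack Overflow user

Answered 2016-06-15 09:36:58

As promised here, my lunch-break alternative (I do not want to argue the case for fixed-token split parsing, but it might come in handy for other use cases that only the OP knows about ;-)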

#! /usr/bin/env python
from __future__ import print_function

d = """
/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up
"""


def service_state_parser_gen(text_lines):
    """Parse the lines from service monitor by splitting
    on well known binary condition (either up or down)
    and parse the rest of the fields based on fixed
    position split on sanitized data (in the up case).
    yield tuple of key and dictionary as result or of
    None, None when neither up nor down detected."""

    token_up = ': up '
    token_down = ': down '
    path_sep = '/'

    for line in text_lines.split('\n'):  # use the argument, not the global d
        if token_up in line:
            chunks = line.split(token_up)
            status = token_up.strip(': ')
            service = chunks[0].split(path_sep)[-1]
            _, pid, seconds, _ = chunks[1].replace(
                '(', '').replace(')', '').split()
            yield service, {'name': service,
                            'status': status,
                            'pid': int(pid),
                            'seconds': int(seconds)}
        elif token_down in line:
            chunks = line.split(token_down)
            status = token_down.strip(': ')
            service = chunks[0].split(path_sep)[-1]
            pid = None
            seconds, _, _, _ = chunks[1].split()
            yield service, {'name': service,
                            'status': status,
                            'pid': None,
                            'seconds': int(seconds)}
        else:
            yield None, None


def main():
    """Sample driver for parser generator function."""

    services = {}
    for key, status_map in service_state_parser_gen(d):
        if key is None:
            print("Non-Status line ignored.")
        else:
            services[key] = status_map

    print(services)

if __name__ == '__main__':
    main()

When run, it produces the following result for the given sample input:

Non-Status line ignored.
Non-Status line ignored.
{'worker-test-1': {'status': 'up', 'seconds': 97381, 'pid': 1234, 'name': 'worker-test-1'}, 'worker-test-3': {'status': 'up', 'seconds': 97381, 'pid': 8910, 'name': 'worker-test-3'}, 'worker-test-2': {'status': 'up', 'seconds': 92233, 'pid': 4567, 'name': 'worker-test-2'}, 'worker-test-5': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-5'}, 'worker-test-4': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-4'}, 'worker-test-6': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-6'}}

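So the information that would otherwise be held in named-group matches is stored, already type-converted, as values under the matching keys of a dict. If a service is down there is of course no process id, so pid is mapped to None, which makes it easy to code against robustly. (If you instead stored all the down services in a separate structure, it would be implicit that the pid field should not be accessed.)

Hope this helps. PS: Yes, text_lines is not the best parameter name for what the demo function actually receives, but you should get the parsing idea.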
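To illustrate why mapping pid to None is convenient, here is a small sketch that consumes a dict shaped like the parser's output. The helper name `split_by_state` and the two-entry `services` sample are assumptions made for this illustration, not part of the original answer.

```python
# A cut-down sample in the same shape as the parser output shown above.
services = {
    'worker-test-1': {'name': 'worker-test-1', 'status': 'up',
                      'pid': 1234, 'seconds': 97381},
    'worker-test-4': {'name': 'worker-test-4', 'status': 'down',
                      'pid': None, 'seconds': 9},
}


def split_by_state(services):
    """Return (up_names, down_names), both sorted, using pid None as the down marker."""
    up = sorted(name for name, info in services.items()
                if info['pid'] is not None)
    down = sorted(name for name, info in services.items()
                  if info['pid'] is None)
    return up, down


up, down = split_by_state(services)
print("up:", up)
print("down:", down)
```

Because every entry has a pid key, the check is a plain `is None` test and no lookup can raise KeyError for a down service.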

Votes 1

Stack Overflow user

Answered 2016-06-15 09:21:12

I don't know whether you are required to use a regex, but if you are not, you could do this:

if "down" in linetext:
    print( "is down" )
else:
    print( "is up" )

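Easier to read, and faster too.

Votes -1

Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: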
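One caveat with a bare substring test is that it also fires when "down" appears anywhere else on the line, for example inside a service name. A slightly stricter sketch, still without regex, looks only at the word right after the ": " separator. The function name `service_status` and the `download-worker` line are hypothetical, made up for this illustration.

```python
def service_status(line):
    """Return 'up', 'down', or None based on the first word after ': '."""
    _, sep, rest = line.partition(": ")
    if not sep:
        return None  # no "path: status" separator on this line
    words = rest.split()
    word = words[0] if words else ""
    return word if word in ("up", "down") else None


# Two sample lines from the question, plus a hypothetical service whose
# name contains "down" to show where the substring test would misfire.
lines = [
    "/etc/service/worker-test-1: up (pid 1234) 97381 seconds",
    "/etc/service/worker-test-4: down 9 seconds, normally up",
    "/etc/service/download-worker: up (pid 42) 5 seconds",
]
for line in lines:
    print(service_status(line), "|", "down" in line)
```

The second printed column is the result of the plain `"down" in line` test, so the two approaches can be compared line by line.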

https://stackoverflow.com/questions/37830404
