Regex to extract SMART data

Stack Overflow user
Asked 2016-04-21 17:14:00
3 answers, 67 views, 0 followers, score 0

I've been racking my brain trying to come up with a regex that extracts the data I want from this SMART output:

Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  139) seconds.
Offline data collection
capabilities:            (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 100) minutes.
Conveyance self-test routine
recommended polling time:    (   3) minutes.
SCT capabilities:          (0x1081) SCT Status supported.

The regex I've come up with so far is:

/([^A-Za-z]?:)([\w\s\/().\-]+\.)/gm

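The goal of my regex is to get the "value" of each "General SMART Value" from the output of smartctl -a. The problem is that the output is formatted in a peculiar way, which makes it hard to pull the values I want into an array.

I was able to extract just the SMART value keys (such as Offline data collection status and Self-test execution status), so now I'm trying to pull out the value of each parameter, for example (139) seconds or (0x00) Offline data collection activity was never started.

What separates a key from its value is a colon followed by some spaces. However, one of the values contains text that itself includes a colon, which makes capturing very difficult. I need to capture all of the following without accidentally running into the next parameter.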

Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  139) seconds.

So, from the above, I need to capture the following:

(0x00)  Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.

without running on and capturing Self-test execution status: as part of it, since that is the next parameter's key.

Any ideas that help with this would be appreciated.

3 Answers

Stack Overflow user

Accepted answer

Answered 2016-04-21 18:09:57

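I think you can take advantage of the fact that keys start at the beginning of a line, while value lines always have at least one horizontal whitespace character in front of them.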

(?m)((?:^(?!\s)[^:\n]*\n?)+):(\h+.*?(?:\n|\z)(?:^\h+.*?(?:\n|\z))*)?

No modifiers are needed; the (?m) is included inline in the pattern.

while ( $smartdata =~ /(?m)((?:^(?!\s)[^:\n]*\n?)+):(\h+.*?(?:\n|\z)(?:^\h+.*?(?:\n|\z))*)?/g )
{
    push @key, $1;
    push @value, $2;
}

Expanded

 (?m)
 (                             # (1 start), Key
      (?:
           ^ 
           (?! \s )
           [^:\n]* 
           \n? 
      )+
 )                             # (1 end)
 : 
 (                             # (2 start), Value
      \h+ .*?  
      (?: \n | \z )
      (?:
           ^ \h+ .*?  
           (?: \n | \z )
      )*
 )?                            # (2 end)
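As a side note, the expanded layout can be used directly if the pattern is compiled with the /x flag. The following is only a sketch under a few assumptions: the names `$pair_re`, `$sample` and `@pairs` are made up for illustration, the sample is a two-entry excerpt of the output in the question, and the whitespace clean-up is an extra step that the answer itself does not perform. `\h` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Same pattern as above, compiled with /x so layout and comments are allowed.
my $pair_re = qr/
    (                       # (1) key: one or more lines that start in column 0
        (?: ^ (?! \s ) [^:\n]* \n? )+
    )
    :
    (                       # (2) value: rest of the line plus indented continuation lines
        \h+ .*? (?: \n | \z )
        (?: ^ \h+ .*? (?: \n | \z ) )*
    )?
/mx;

# Hypothetical excerpt of the smartctl output from the question.
my $sample = "Total time to complete Offline \n"
           . "data collection:        (  139) seconds.\n"
           . "SCT capabilities:          (0x1081) SCT Status supported.\n";

my @pairs;
while ( $sample =~ /$pair_re/g ) {
    my ( $key, $value ) = ( $1, defined $2 ? $2 : '' );
    for ( $key, $value ) {
        s/\s+/ /g;          # collapse runs of whitespace, including newlines
        s/^\s+|\s+$//g;     # trim both ends
    }
    push @pairs, [ $key, $value ];
}

print "$_->[0] => $_->[1]\n" for @pairs;
```

Collapsing whitespace also squeezes the padding inside the parentheses, so `(  139)` comes out as `( 139)`; skip that step if the exact spacing matters.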

Perl sample

use strict;
use warnings;

$/ = undef;

my $smartdata = <DATA>;

my @key = ();
my @val = ();

while ( $smartdata =~ /(?m)((?:^(?!\s)[^:\n]*\n?)+):(\h+.*?(?:\n|\z)(?:^\h+.*?(?:\n|\z))*)?/g )
{
    push @key, $1;
    if (defined $2 ) {
        push @val, $2;
    }
    else {
        push @val, '';
    }
}

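# Note: the upper bound $#key-1 leaves out the last captured pair (SCT capabilities);
# use $#key instead to print every pair.
for ( 0 .. ($#key-1) )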
{
     print "key $_ = $key[$_]\n";
     print "value = $val[$_]\n-------------------\n";
}

__DATA__

Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  139) seconds.
Offline data collection
capabilities:            (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.



Extended self-test routine
recommended polling time:    ( 100) minutes.
Conveyance self-test routine
recommended polling time:    (   3) minutes.
SCT capabilities:          (0x1081) SCT Status supported.

Output

key 0 = Offline data collection status
value =   (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.

-------------------
key 1 = Self-test execution status
value =       (   0) The previous self-test routine completed
                    without error or no self-test has ever
                    been run.

-------------------
key 2 = Total time to complete Offline
data collection
value =         (  139) seconds.

-------------------
key 3 = Offline data collection
capabilities
value =             (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.

-------------------
key 4 = SMART capabilities
value =             (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.

-------------------
key 5 = Error logging capability
value =         (0x01) Error logging supported.
                    General Purpose Logging supported.

-------------------
key 6 = Short self-test routine
recommended polling time
value =     (   2) minutes.

-------------------
key 7 = Extended self-test routine
recommended polling time
value =     ( 100) minutes.

-------------------
key 8 = Conveyance self-test routine
recommended polling time
value =     (   3) minutes.

-------------------
Score: 2

Stack Overflow user

Answered 2016-04-21 18:31:56

Both the keys and the data are split across lines, so we have to handle both cases:

use strict;
use warnings;

my %data;

my $lastkey;

my $prefixkey = "";

while (my $smartdata = <DATA>) {
    chomp $smartdata;

    if ($smartdata =~ m/^\S/) {
        if ($smartdata =~ m/^([^:]+):\s+(.*)$/) { # is a complete or end of a key and data

            $lastkey = $prefixkey ? "$prefixkey $1" : $1;

            $data{$lastkey} = $2;

            $prefixkey = "";
        }
        else { # this is the start of a key
            $smartdata =~ s/(^\s+|\s+$)//; # strip whitespace
            $prefixkey = $smartdata;
        }
    }   
    else { # this is a data continuation
        $smartdata =~ s/(^\s+|\s+$)//; # without /g only the leading whitespace is stripped here
        $data{$lastkey} .= " $smartdata";
    }
}

for my $key (keys(%data)) {
    print("$key:\t$data{$key}\n");
}

__DATA__
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  139) seconds.
Offline data collection
capabilities:            (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 100) minutes.
Conveyance self-test routine
recommended polling time:    (   3) minutes.
SCT capabilities:          (0x1081) SCT Status supported.

Produces:

Error logging capability:   (0x01) Error logging supported. General Purpose Logging supported.
Total time to complete Offline data collection: (  139) seconds.
SCT capabilities:   (0x1081) SCT Status supported.
Offline data collection capabilities:   (0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Conveyance self-test routine recommended polling time:  (   3) minutes.
Self-test execution status: (   0) The previous self-test routine completed without error or no self-test has ever  been run.
Extended self-test routine recommended polling time:    ( 100) minutes.
Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
Short self-test routine recommended polling time:   (   2) minutes.
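The pairs above come out in hash order, not in the order smartctl printed them. If the original order matters, one option is to record each key in an array as it is first seen. This is only a sketch of that idea, not part of the answer's code: `@lines` is a hypothetical excerpt of the input and `@order` is an illustrative name. It also trims both ends of every line with /g, so the value of Self-test execution status has no doubled space.

```perl
use strict;
use warnings;

# Hypothetical sample: three entries taken from the smartctl output above.
my @lines = (
    'Self-test execution status:      (   0) The previous self-test routine completed',
    '                    without error or no self-test has ever ',
    '                    been run.',
    'Total time to complete Offline ',
    'data collection:        (  139) seconds.',
    'SCT capabilities:          (0x1081) SCT Status supported.',
);

my ( %data, @order, $lastkey );
my $prefixkey = '';

for my $line (@lines) {
    ( my $text = $line ) =~ s/^\s+|\s+$//g;    # /g trims both ends

    if ( $line =~ /^\s/ ) {                    # indented: value continuation
        $data{$lastkey} .= " $text" if defined $lastkey;
    }
    elsif ( $text =~ /^([^:]+):\s+(.*)$/ ) {   # key (or its last line) plus value
        $lastkey = $prefixkey ne '' ? "$prefixkey $1" : $1;
        $data{$lastkey} = $2;
        push @order, $lastkey;                 # remember first-seen order
        $prefixkey = '';
    }
    else {                                     # first line of a wrapped key
        $prefixkey = $text;
    }
}

print "$_:\t$data{$_}\n" for @order;
```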
Score: 1

Stack Overflow user

Answered 2016-04-21 18:44:01

This data isn't in the best format, but at least it's predictable. We can parse it based on how each line begins.

use strict;
use warnings;
use Data::Dumper;

my %data;
my $key;
my $record;

while (<DATA>) {
    chomp;

    if (s/^\s+/ /g) {
        $record .= $_;
    } elsif (s/^([^:]+):\s\s+//) {
        if (length($record)) {
            $data{$key} = $record;
            $key = '';
        }

        $key .= $1;
        $record = $_;
    } else {
        $data{$key} = $record;
        $key = $_ . ' ';
        $record = '';
    }
}

$data{$key} = $record;
print Dumper(\%data);

__DATA__
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:        (  139) seconds.
Offline data collection
capabilities:            (0x73) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 100) minutes.
Conveyance self-test routine
recommended polling time:    (   3) minutes.
SCT capabilities:          (0x1081) SCT Status supported.

Output:

$VAR1 = {
          'Error logging capability' => '(0x01) Error logging supported. General Purpose Logging supported.',
          'Total time to complete Offline data collection' => '(  139) seconds.',
          'SCT capabilities' => '(0x1081) SCT Status supported.',
          'Offline data collection capabilities' => '(0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.',
          'SMART capabilities' => '(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.',
          'Conveyance self-test routine recommended polling time' => '(   3) minutes.',
          'Self-test execution status' => '(   0) The previous self-test routine completed without error or no self-test has ever been run.',
          'Extended self-test routine recommended polling time' => '( 100) minutes.',
          'Offline data collection status' => '(0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.',
          'Short self-test routine recommended polling time' => '(   2) minutes.'
        };
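Once the values are single strings like the ones in this dump, the parenthesised part can be separated from the description. The sketch below assumes every value starts with either a hex literal or a decimal number in parentheses, as all ten entries here do; the `%parsed` structure and its field names (`raw`, `value`, `text`) are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical subset of the %data hash shown in the dump above.
my %data = (
    'Offline data collection status' =>
        '(0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.',
    'Total time to complete Offline data collection' => '(  139) seconds.',
    'SCT capabilities' => '(0x1081) SCT Status supported.',
);

my %parsed;
for my $key ( sort keys %data ) {
    # The leading "(...)" holds the raw value; the rest is the description.
    next unless $data{$key} =~ /^\(\s*(0x[0-9A-Fa-f]+|\d+)\)\s*(.*)$/s;
    my ( $raw, $text ) = ( $1, $2 );
    $parsed{$key} = {
        raw   => $raw,
        value => ( $raw =~ /^0x/i ? hex($raw) : $raw + 0 ),   # numeric form
        text  => $text,
    };
}

for my $key ( sort keys %parsed ) {
    printf "%s: raw=%s value=%d text=%s\n",
        $key, $parsed{$key}{raw}, $parsed{$key}{value}, $parsed{$key}{text};
}
```

Entries whose value does not start with such a parenthesised number are skipped rather than guessed at.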
Score: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/36776365
