Q: Replacing VCG1 or VCG2 with VCG* in a Perl script

Stack Overflow user

Asked 2014-09-23 03:25:25

Answers: 1, Views: 53, Followers: 0, Votes: 0

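In my previous question (https://stackoverflow.com/a/25735444/3767980), with jaypal's help, I was able to set up my constraints for the ambiguous and unambiguous cases. Let's consider the ambiguous case here, since it is the harder one.

I have input lines that look like this: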

G6N-D5C-?: (116.663, 177.052, 29.149) K87CD/E85CB/E94CB/H32CB/Q21CB
L12N-T11C-?: (128.977, 175.109, 174.412) K158C/H60C/A152C/N127C/Y159C(notH60C)
K14N-E13C-?: (117.377, 176.474, 29.823) I187CG1/V78CG2
A75N-Q74C-?: (123.129, 177.253, 23.513) V131CG1/V135CG1/V78CG1

These are turned into constraints by the following Perl script:

#!/usr/bin/perl 

use strict;
use warnings;
use autodie;
# 

open my $fh, '<', $ARGV[0];

while (<$fh>) {
    my @values = map { /.(\d+)(\w+)/; $1, $2 } split '/', (split)[-1];
    my ( $resid, $name ) = /^[^-]+-.(\d+)(\w+)-/;
    print "assign (resid $resid and name $name ) (";
    print join ( " or ", 
        map  { "resid $values[$_] and name $values[$_ + 1]" } 
        grep { not $_ % 2 } 0 .. $#values 
    );
    print " ) 3.5 2.5 8.5 ! $_";
}

Output:

assign (resid 5 and name C ) (resid 87 and name CD or resid 85 and name CB or resid 94 and name CB or resid 32 and name CB or resid 21 and name CB ) 3.5 2.5 8.5 ! G6N-D5C-?: (116.663, 177.052, 29.149) K87CD/E85CB/E94CB/H32CB/Q21CB
assign (resid 11 and name C ) (resid 158 and name C or resid 60 and name C or resid 152 and name C or resid 127 and name C or resid 159 and name C ) 3.5 2.5 8.5 ! L12N-T11C-?: (128.977, 175.109, 174.412) K158C/H60C/A152C/N127C/Y159C(notH60C)
assign (resid 13 and name C ) (resid 187 and name CG1 or resid 78 and name CG2 ) 3.5 2.5 8.5 ! K14N-E13C-?: (117.377, 176.474, 29.823) I187CG1/V78CG2
assign (resid 74 and name C ) (resid 131 and name CG1 or resid 135 and name CG1 or resid 78 and name CG1 ) 3.5 2.5 8.5 ! A75N-Q74C-?: (123.129, 177.253, 23.513) V131CG1/V135CG1/V78CG1

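What I need help with are the lines containing entries (in the part after the !) that start with V, followed by 2 or 3 digits and then CG1 or CG2. For example, V78CG2 or V135CG1.
I need to use a wildcard for the corresponding entries in the constraint. That is, I need constraints like these: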

assign (resid 5 and name C ) (resid 87 and name CD or resid 85 and name CB or resid 94 and name CB or resid 32 and name CB or resid 21 and name CB ) 3.5 2.5 8.5 ! G6N-D5C-?: (116.663, 177.052, 29.149) K87CD/E85CB/E94CB/H32CB/Q21CB
assign (resid 11 and name C ) (resid 158 and name C or resid 60 and name C or resid 152 and name C or resid 127 and name C or resid 159 and name C ) 3.5 2.5 8.5 ! L12N-T11C-?: (128.977, 175.109, 174.412) K158C/H60C/A152C/N127C/Y159C(notH60C)
assign (resid 13 and name C ) (resid 187 and name CG1 or resid 78 and name CG* ) 3.5 2.5 8.5 ! K14N-E13C-?: (117.377, 176.474, 29.823) I187CG1/V78CG2
assign (resid 74 and name C ) (resid 131 and name CG* or resid 135 and name CG* or resid 78 and name CG* ) 3.5 2.5 8.5 ! A75N-Q74C-?: (123.129, 177.253, 23.513) V131CG1/V135CG1/V78CG1

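I need advice on how to select the matching lines and then apply the transformation to the constraint part (before the !). I can find the matching lines with a basic regex such as V.*CG[1-2].

I would like a solution within the Perl script above.

Please comment if anything is unclear. I am still quite new to this. Thanks in advance for your suggestions.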

perl

bash

1 Answer

Stack Overflow user

Accepted answer

Answered 2014-09-23 10:03:01

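Here is a modified version of the script, along with an explanation of what is going on. The line my @values = map { ... } split '/', (split)[-1]; is a little tricky to understand, so I'll explain it separately:

map takes a list, applies whatever is in the curly braces to each element, and returns a new list. The two splits are used to break up the line. When called without arguments, split takes $_ as its input and splits it on whitespace. So the first split takes $_, which is the current line, and splits it on whitespace: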

input:
'G6N-D5C-?: (116.663, 177.052, 29.149) K87CD/E85CB/E94CB/H32CB/Q21CB'

the array created by calling split:
'G6N-D5C-?:', '(116.663,', '177.052,', '29.149)', 'K87CD/E85CB/E94CB/H32CB/Q21CB'

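The second split splits its input on /; as input, it uses the last item of the array created by the first split. That is, (split) is shorthand for "the array created by splitting $_ on whitespace", and (split)[-1] is the last element of that array.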

input: 
K87CD/E85CB/E94CB/H32CB/Q21CB

array created by calling `split "/"`
'K87CD', 'E85CB', 'E94CB', 'H32CB', 'Q21CB'
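
To make the two splits easier to follow, here is a minimal sketch that performs them one at a time on one of the sample lines. The variable names ($line, @fields, @entries) are only illustrative and do not appear in the script; split ' ' with an explicit string is used in place of the bare split on $_.

```perl
use strict;
use warnings;

# One sample input line from the question.
my $line = 'K14N-E13C-?: (117.377, 176.474, 29.823) I187CG1/V78CG2';

# First split: break the line on whitespace (what a bare "split" does to $_).
my @fields = split ' ', $line;

# Second split: take the last whitespace-separated field and break it on "/".
my @entries = split '/', $fields[-1];

print "$_\n" for @entries;
```

This should print I187CG1 and V78CG2 on separate lines, which is the list that map then works through.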

The map command then applies a regex to each member of that array:

/.(\d+)(\w+)/; # match any character (.) followed by one or more digits (\d)  
               # followed by one or more alphanumeric (\w) characters.

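The parentheses capture the results into the read-only variables $1 and $2. The second statement in the map block adds them to the array that map creates. By default, Perl puts the result of the last statement into the array, so you can do things like this: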

my @arr = (1, 2, 3, 4);
my @two_times = map { $_ * 2 } @arr;
# @two_times is (2, 4, 6, 8)

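(In list context, the "result" of the pattern match is actually the list ($1, $2), so the statement that adds them to the @values array isn't strictly necessary.)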

So @values = map { /.(\d+)(\w+)/; $1, $2 } @array captures the matches from each element of @array and puts them in @values.
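
As a quick check of the point about the match result, the following sketch compares the explicit form ($1, $2 as the last statement) with the implicit form (the match alone). The sample entries come from the question; the array names @explicit and @implicit are made up for this illustration.

```perl
use strict;
use warnings;

my @entries = ('K87CD', 'E85CB', 'Q21CB');

# Explicit form, as in the script: run the match, then return $1 and $2.
my @explicit = map { /.(\d+)(\w+)/; $1, $2 } @entries;

# Implicit form: in list context a successful match returns its captures.
my @implicit = map { /.(\d+)(\w+)/ } @entries;

print "@explicit\n";   # 87 CD 85 CB 21 CB
print "@implicit\n";   # same list
```

The two forms only differ when a match fails: the implicit form then contributes nothing to the list, whereas the explicit form would reuse whatever $1 and $2 held from an earlier match.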

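I hope the rest of the script is understandable; if not, I suggest splitting up each command and inspecting the results with Data::Dumper so you can see what is going on.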
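
One possible way to do that inspection is sketched here: each step of the one-liner is stored in its own variable and dumped. The intermediate variable names are assumptions chosen for this example, not part of the script.

```perl
use strict;
use warnings;
use Data::Dumper;

my $line = 'A75N-Q74C-?: (123.129, 177.253, 23.513) V131CG1/V135CG1/V78CG1';

# Each step of the one-liner, stored separately so it can be inspected.
my @fields  = split ' ', $line;
my @entries = split '/', $fields[-1];
my @values  = map { /.(\d+)(\w+)/; $1, $2 } @entries;

# Dumper prints each structure as $VAR1, $VAR2, $VAR3.
print Dumper(\@fields, \@entries, \@values);
```

The last structure dumped should be the flat list of residue numbers and atom names (131, CG1, 135, CG1, 78, CG1) that the rest of the script walks through in pairs.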

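To make the script treat VnnCG1 / VnnCG2 entries differently, I added a line to the map block that looks for any residue matching that pattern and replaces it with VnnCG*. I then altered the matching regex so that it picks up the appropriate part of the residue name but does not pick up inappropriate data (such as (notB28DG)). Here is the new script, with comments: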

#!/usr/bin/perl 
use strict;
use warnings;
use feature ':5.10';
use autodie;

open my $fh, '<', $ARGV[0];

while (<$fh>) {
    chomp;    # drop the line's own newline; "say" adds one back at the end

    # a brief guide to regexps:
    # \d     = digits
    # \w     = digits or letters or _
    # [ ]    = match any of the characters within these brackets
    # ( )    = capture the value in these brackets, save it to $1, $2, $3, etc.
    #        (brackets are also used for alternation, but not in this case)
    # *      = match 0 or more times
    # +      = match 1 or more times
    # \*     = match the character *
    # s/ / / = search and replace
    # /x     = ignore whitespace

    my @values = map {
        # find the pattern
        s/V     # V
        (\d+)   # one or more digits; the brackets mean we capture the value
                # and it gets saved in $1
        CG      # CG
        [12]    # either 1 or 2
        /V$1CG*/x; #replace with V $1 CG *

        # find the pattern
        /.       # any character
        (\d+)    # one or more digits; capture the value in $1
        ([A-Z][\w\*]*) # a letter followed by zero or more alphanum or * 
        /x;            # the value is captured in $2

        # put $1 and $2 into the array we're building
        $1, $2
        } split '/', (split)[-1];

    my ( $resid, $name ) = /^[^-]+-.(\d+)(\w+)-/;
    # compose the new string
    my $str = "assign (resid $resid and name $name ) ("
    . join ( " or ",
        map  { "resid $values[$_] and name $values[$_ + 1]" }
        grep { not $_ % 2 } 0 .. $#values
    )
    . " ) 3.5 2.5 8.5 ! $_";
    # "say" prints the string to STDOUT and automatically appends a newline
    say $str;
}
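
The change to the capture pattern can be checked in isolation. The sketch below wraps the two regexes from the script in a hypothetical helper, parse_entry (not part of the answer's script), and runs it on a valine entry, an isoleucine entry, and an entry with a trailing annotation. It also shows what the original \w+ pattern would capture once the * has been substituted in.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): turn one entry into
# (resid, atom name), widening valine CG1/CG2 to CG*.
sub parse_entry {
    my ($entry) = @_;
    $entry =~ s/V(\d+)CG[12]/V$1CG*/;
    return $entry =~ /.(\d+)([A-Z][\w*]*)/;
}

for my $entry ('V78CG2', 'I187CG1', 'Y159C(notH60C)') {
    my ($resid, $name) = parse_entry($entry);
    print "$entry -> resid $resid and name $name\n";
}

# The original pattern drops the wildcard, because \w does not match "*".
my ($old_resid, $old_name) = 'V78CG*' =~ /.(\d+)(\w+)/;
print "old pattern on V78CG*: $old_name\n";   # CG
```

Only the valine entry is rewritten to CG*; I187CG1 keeps its CG1, and the parenthesised annotation is left out of the atom name because ( is not in the character class.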

A shortened version of the "core" of the script, without comments:

foreach (@data) {
    my @values = map {
        s/V(\d+)CG[12]/V$1CG*/; /.(\d+)([A-Z][\w\*]*)/;
        } split '/', (split)[-1];
    my ( $resid, $name ) = /^[^-]+-.(\d+)(\w+)-/;
    say "assign (resid $resid and name $name ) ("
    . join ( " or ",
        map  { "resid $values[$_] and name $values[$_ + 1]" }
        grep { not $_ % 2 } 0 .. $#values
    )
    . " ) 3.5 2.5 8.5 ! $_";
}

Votes: 1

Original content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/25986362
