Q: Perl: How to extract a string between brackets

Stack Overflow user

Asked 2012-09-05 04:39:58

7 answers, 12.5K views, 0 followers, score 1

I have a file in MoinMoin text format:

* [[  Virtualbox Guest Additions]] (2011/10/17 15:19)
* [[  Abiword Wordprocessor]] (2010/10/27 20:17)
* [[  Sylpheed E-Mail]] (2010/03/30 21:49)
* [[   Kupfer]] (2010/05/16 20:18)

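All the words between "[" and "]" are the short description of the entry. I need to extract the whole entry, not each individual word.

I found an answer to a similar question here: https://stackoverflow.com/a/2700749/819596, but I can't understand it: "my @array = $str =~ /( \{ (?: [^{}]* | (?0) )* \} )/xg;"

Anything that works will be accepted, but an explanation would help greatly, e.g. what (?0) or /xg does.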

perl

matching

7 Answers

Stack Overflow user

Answered 2012-09-05 04:48:05

The code could look like this:

use warnings; 
use strict;

my @subjects; # declaring a lexical variable to store all the subjects
my $pattern = qr/ 
  \[ \[    # matching two `[` signs
  \s*      # ... and, if any, whitespace after them
  ([^]]+) # starting from the first non-whitespace symbol, capture all the non-']' symbols
  ]]
/x;

# main processing loop:
while (<DATA>) { # reading the source file line by line
  if (/$pattern/) {      # if line is matched by our pattern
    push @subjects, $1;  # ... push the captured group of symbols into our array
  }
}
print $_, "\n" for @subjects; # print our array of subject line by line

__DATA__
* [[  Virtualbox Guest Additions]] (2011/10/17 15:19)
* [[  Abiword Wordprocessor]] (2010/10/27 20:17)
* [[  Sylpheed E-Mail]] (2010/03/30 21:49)
* [[   Kupfer]] (2010/05/16 20:18)

As I see it, what you need can be described like this: in each line of the file, try to find this sequence of symbols...

[[, an opening delimiter, 
then 0 or more whitespace symbols,
then all the symbols that make a subject (which should be saved),
then ]], a closing delimiter

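As you can see, this description translates quite naturally into a regex. The only part that is probably not strictly needed is the /x regex modifier, which allowed me to comment the pattern extensively.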
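To illustrate that /x only affects how the pattern is written, here is a minimal sketch comparing the commented pattern with a compact form of the same regex. The variable names ($verbose, $compact, $line) are made up for this illustration and are not part of the answer's code.

```perl
use strict;
use warnings;

# The commented pattern, written with the /x modifier.
my $verbose = qr/
  \[ \[      # two literal opening brackets
  \s*        # optional leading whitespace
  ([^]]+)    # capture everything up to the first closing bracket
  ]]         # two literal closing brackets
/x;

# The same pattern written compactly, without the x modifier.
my $compact = qr/\[\[\s*([^]]+)]]/;

my $line = '* [[  Abiword Wordprocessor]] (2010/10/27 20:17)';

# In list context a successful match returns the captured groups.
my ($from_verbose) = $line =~ $verbose;
my ($from_compact) = $line =~ $compact;

print "$from_verbose\n";   # Abiword Wordprocessor
print "$from_compact\n";   # Abiword Wordprocessor
```

Both forms capture the same text; the modifier only lets whitespace and comments appear inside the pattern.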

Score: 2

Stack Overflow user

Answered 2012-09-05 05:18:55

If the text will never contain ], you can simply use the following, as previously recommended:

/\[\[ ( [^\]]* ) \]\]/x

The following allows ] in the contained text, but I recommend against incorporating it into a larger pattern:

/\[\[ ( .*? ) \]\]/x

The following also allows ] in the contained text, and is the most robust solution:

/\[\[ ( (?:(?!\]\]).)* ) \]\]/x

For example,

if (my ($match) = $line =~ /\[\[ ( (?:(?!\]\]).)* ) \]\]/x) {
   print "$match\n";
}

or

my @matches = $file =~ /\[\[ ( (?:(?!\]\]).)* ) \]\]/xg;
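The difference between the three patterns is easiest to see on input that contains a single ] inside an entry. The sketch below uses made-up sample strings ($text and $log are illustrative, not from the question) and also shows why the non-greedy version can misbehave once more pattern follows the closing brackets.

```perl
use strict;
use warnings;

my $text = '[[a]b]] [[c]]';

# 1. Negated class: cannot cross a single ']', so the first entry is skipped.
my @negated  = $text =~ /\[\[ ( [^\]]* ) \]\]/xg;
# 2. Non-greedy dot: accepts ']' inside the entry.
my @lazy     = $text =~ /\[\[ ( .*? ) \]\]/xg;
# 3. Tempered dot: every consumed character must not begin "]]".
my @tempered = $text =~ /\[\[ ( (?:(?!\]\]).)* ) \]\]/xg;

print "negated:  @negated\n";    # negated:  c
print "lazy:     @lazy\n";       # lazy:     a]b c
print "tempered: @tempered\n";   # tempered: a]b c

# Inside a larger pattern, .*? may run past the first "]]" to make the
# rest of the pattern succeed; the tempered version never does.
my $log = '* [[A]] (2011) * [[B]] (2010)';
my ($lazy_hit)     = $log =~ /\[\[ ( .*? ) \]\] \s* \(2010/x;
my ($tempered_hit) = $log =~ /\[\[ ( (?:(?!\]\]).)* ) \]\] \s* \(2010/x;

print "$lazy_hit\n";       # A]] (2011) * [[B
print "$tempered_hit\n";   # B
```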

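/x: ignore whitespace in the pattern. This allows whitespace to be added to make the pattern readable without changing its meaning. Documented in perlre.
/g: find all matches. Documented in perlop.
(?0) was used to make the pattern recursive, since the linked answer had to handle arbitrarily nested curly braces.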
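As a rough illustration of what (?0) does, the pattern quoted in the question can be run against a string with nested curly braces. The sample string is invented for this sketch; the pattern itself is the one from the linked answer, only spread over several lines with comments. It needs Perl 5.10 or later.

```perl
use strict;
use warnings;

my $str = 'a {b {c} d} e {f}';

# With the g modifier in list context, every match contributes its capture.
my @array = $str =~ /
  (               # capture the whole balanced group
    \{            # an opening curly brace
    (?:
        [^{}]*    # a run of characters that are not braces
      | (?0)      # or a nested group: recurse into the whole pattern
    )*
    \}            # the matching closing curly brace
  )
/xg;

print "$_\n" for @array;
# {b {c} d}
# {f}
```

The MoinMoin entries in the question are not nested, so recursion is not required for them; the simpler patterns above are enough.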

Score: 2

Stack Overflow user

Answered 2012-09-05 04:48:19

\[\[(.*)]]

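\[ is a literal [, ] is a literal ], .* means any sequence of 0 or more characters, and whatever is enclosed in parentheses is a capturing group, so you can access it later in your script with $1 (or $2 .. $9, depending on how many groups you have).

Putting it all together, you match two [ and then everything up to the last occurrence of two consecutive ].

Update: on a second read of your question I am suddenly confused. Do you need the content between the brackets, or the whole line? In the latter case, leave the parentheses out entirely and just test whether the pattern matches; there is no need to capture.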
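The "last occurrence" behaviour matters when a line holds more than one entry. The following sketch, using invented sample lines, contrasts the greedy pattern with a non-greedy variant and shows the match-only test for keeping whole lines.

```perl
use strict;
use warnings;

my $line = '[[first]] and [[second]]';

# Greedy .* runs to the last "]]" on the line.
my ($greedy) = $line =~ /\[\[(.*)]]/;
# Non-greedy .*? stops at the first "]]".
my ($lazy)   = $line =~ /\[\[(.*?)]]/;

print "$greedy\n";   # first]] and [[second
print "$lazy\n";     # first

# If the whole line is wanted, just test for a match; no capture is needed.
my @lines = (
    '* [[  Kupfer]] (2010/05/16 20:18)',
    'a line without an entry',
);
my @entry_lines = grep { /\[\[.*]]/ } @lines;

print "$_\n" for @entry_lines;   # * [[  Kupfer]] (2010/05/16 20:18)
```

For the file in the question, where each line has exactly one entry, the greedy and non-greedy forms give the same result.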

Score: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/12271040
