
RegExp to replace matching brackets in a nested structure

Stack Overflow user
Asked 2016-05-13 16:26:58
Answers: 2 | Views: 125 | Followers: 0 | Votes: 1

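How can I replace a set of matching opening/closing parentheses when the first opening parenthesis immediately follows the keyword array? Can regular expressions help with this kind of problem?

To be more specific, I would like to solve this using … or PHP.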

// input
$data = array(
    'id' => nextId(),
    'profile' => array(
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    )
);

// desired output
$data = [
    'id' => nextId(),
    'profile' => [
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    ]
];

2 Answers

Stack Overflow user

Accepted answer

Posted 2016-05-13 17:50:09

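A .NET counting version has already been given. It has the same elements as the PCRE (PHP) version below.

All the caveats are the same. In particular, non-array parentheses must be balanced, because they use the same closing parenthesis as the array delimiters.

All of the text must be parsed (or at least should be).

The outer groups 1, 2, 3 and 4 let you get at the parts:

Group 1: CONTENT
Group 2: CORE-1, the inside of array( )
Group 3: CORE-2, the inside of any other ( )
Group 4: EXCEPTIONS

Each match yields exactly one of these outer groups; they are mutually exclusive.

The trick is to define a PHP function parse( core ) that parses a CORE. Inside that function is a while (regex.search( core )) { .. } loop. Each time the CORE-1 or CORE-2 group matches, call parse( core ) again, passing it the contents of that group. Inside the loop, just take off the CONTENT and assign it to the hash.

Obviously, the group 1 construct that calls (?&content) should be replaced with constructs that extract your hash-like variable data.

At a detailed scale this can be very tedious. Usually you have to account for every single character to parse the whole thing properly.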
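The recursive parse( core ) idea described above can be sketched in Python. This is only an illustration under stated assumptions: Python's built-in re module has no recursion or (?&name) subroutine calls, so a hand-written balance scan (the hypothetical helper find_close) stands in for the recursive regex, and the function rewrites the text directly instead of filling a hash. The names parse, find_close and ARRAY_OPEN are made up for this sketch, and, as with the regex, parentheses inside strings or comments are not treated specially.

```python
import re

# Opener of an array literal, mirroring \b array \s* \( from the answer.
ARRAY_OPEN = re.compile(r"\barray\s*\(", re.IGNORECASE)


def find_close(text, start):
    """Return the index of the ')' balancing an already-open '(' whose
    contents begin at `start`, or -1 if no such ')' exists."""
    depth = 1
    for i in range(start, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1


def parse(core):
    """Rewrite every balanced array( ... ) in `core` as [ ... ], recursing
    into the contents of both array() and plain () groups."""
    out = []
    pos = 0
    while pos < len(core):
        m = ARRAY_OPEN.match(core, pos)
        if m:
            close = find_close(core, m.end())
            if close == -1:
                # EXCEPTION: unbalanced 'array(' is copied through unchanged.
                out.append(m.group(0))
                pos = m.end()
            else:
                # CORE-1: recurse into the array body and swap the delimiters.
                out.append("[" + parse(core[m.end():close]) + "]")
                pos = close + 1
        elif core[pos] == "(":
            close = find_close(core, pos + 1)
            if close == -1:
                # EXCEPTION: unbalanced '(' is copied through unchanged.
                out.append("(")
                pos += 1
            else:
                # CORE-2: recurse into a plain group, keeping its parentheses.
                out.append("(" + parse(core[pos + 1:close]) + ")")
                pos = close + 1
        else:
            # CONTENT (a stray ')' also ends up here and is kept as-is).
            out.append(core[pos])
            pos += 1
    return "".join(out)


print(parse("$x = array(1, array(f(2)), (3));"))
```

Because each nested group is handed back to parse, a single call converts every nesting level, and the word boundary in ARRAY_OPEN keeps identifiers such as myarray( from being mistaken for the keyword.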

(?is)(?:((?&content))|(?>\barray\s*\()((?=.)(?&core)|)\)|\(((?=.)(?&core)|)\)|(\barray\s*\(|[()]))(?(DEFINE)(?<core>(?>(?&content)|(?>\barray\s*\()(?:(?=.)(?&core)|)\)|\((?:(?=.)(?&core)|)\))+)(?<content>(?>(?!\barray\s*\(|[()]).)+))

Expanded

 # 1:  CONTENT
 # 2:  CORE-1
 # 3:  CORE-2
 # 4:  EXCEPTIONS

 (?is)

 (?:
      (                                  # (1), Take off   CONTENT
           (?&content) 
      )
   |                                   # OR -----------------------------
      (?>                                # Start 'array('
           \b array \s* \(
      )
      (                                  # (2), Take off   'array( CORE-1 )'
           (?= . )
           (?&core) 
        |  
      )
      \)                                 # End ')'
   |                                   # OR -----------------------------
      \(                                 # Start '('
      (                                  # (3), Take off   '( any CORE-2 )'
           (?= . )
           (?&core) 
        |  
      )
      \)                                 # End ')'
   |                                   # OR -----------------------------
      (                                  # (4), Take off   Unbalanced or Exceptions
           \b array \s* \(
        |  [()] 
      )
 )

 # Subroutines
 # ---------------

 (?(DEFINE)

      # core
      (?<core>
           (?>
                (?&content) 
             |  
                (?> \b array \s* \( )
                # recurse core of  array()
                (?:
                     (?= . )
                     (?&core) 
                  |  
                )
                \)
             |  
                \(
                # recurse core of any  ()
                (?:
                     (?= . )
                     (?&core) 
                  |  
                )
                \)
           )+
      )

      # content 
      (?<content>
           (?>
                (?!
                     \b array \s* \(
                  |  [()] 
                )
                . 
           )+
      )
 )

Output

 **  Grp 0           -  ( pos 0 , len 11 ) 
some_var =   
 **  Grp 1           -  ( pos 0 , len 11 ) 
some_var =   
 **  Grp 2           -  NULL 
 **  Grp 3           -  NULL 
 **  Grp 4 [core]    -  NULL 
 **  Grp 5 [content] -  NULL 

-----------------------

 **  Grp 0           -  ( pos 11 , len 153 ) 
array(
    'id' => nextId(),
    'profile' => array(
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    ) 
)  
 **  Grp 1           -  NULL 
 **  Grp 2           -  ( pos 17 , len 146 ) 

    'id' => nextId(),
    'profile' => array(
       'name' => 'Hugo Hurley',
       'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
    ) 

 **  Grp 3           -  NULL 
 **  Grp 4 [core]    -  NULL 
 **  Grp 5 [content] -  NULL 

-------------------------------------

 **  Grp 0           -  ( pos 164 , len 3 ) 
;

 **  Grp 1           -  ( pos 164 , len 3 ) 
;

 **  Grp 2           -  NULL 
 **  Grp 3           -  NULL 
 **  Grp 4 [core]    -  NULL 
 **  Grp 5 [content] -  NULL 

A previous incarnation, written for something else, to give an idea of the usage

 # Perl code:
 # 
 #     use strict;
 #     use warnings;
 #     
 #     use Data::Dumper;
 #     
 #     $/ = undef;
 #     my $content = <DATA>;
 #     
 #     # Set the error mode on/off here ..
 #     my $BailOnError = 1;
 #     my $IsError = 0;
 #     
 #     my $href = {};
 #     
 #     ParseCore( $href, $content );
 #     
 #     #print Dumper($href);
 #     
 #     print "\n\n";
 #     print "\nBase======================\n";
 #     print $href->{content};
 #     print "\nFirst======================\n";
 #     print $href->{first}->{content};
 #     print "\nSecond======================\n";
 #     print $href->{first}->{second}->{content};
 #     print "\nThird======================\n";
 #     print $href->{first}->{second}->{third}->{content};
 #     print "\nFourth======================\n";
 #     print $href->{first}->{second}->{third}->{fourth}->{content};
 #     print "\nFifth======================\n";
 #     print $href->{first}->{second}->{third}->{fourth}->{fifth}->{content};
 #     print "\nSix======================\n";
 #     print $href->{six}->{content};
 #     print "\nSeven======================\n";
 #     print $href->{six}->{seven}->{content};
 #     print "\nEight======================\n";
 #     print $href->{six}->{seven}->{eight}->{content};
 #     
 #     exit;
 #     
 #     
 #     sub ParseCore
 #     {
 #         my ($aref, $core) = @_;
 #         my ($k, $v);
 #         while ( $core =~ /(?is)(?:((?&content))|(?><!--block:(.*?)-->)((?&core)|)<!--endblock-->|(<!--(?:block:.*?|endblock)-->))(?(DEFINE)(?<core>(?>(?&content)|(?><!--block:.*?-->)(?:(?&core)|)<!--endblock-->)+)(?<content>(?>(?!<!--(?:block:.*?|endblock)-->).)+))/g )
 #         {
 #            if (defined $1)
 #            {
 #              # CONTENT
 #                $aref->{content} .= $1;
 #            }
 #            elsif (defined $2)
 #            {
 #              # CORE
 #                $k = $2; $v = $3;
 #                $aref->{$k} = {};
 #      #         $aref->{$k}->{content} = $v;
 #      #         $aref->{$k}->{match} = $&;
 #                
 #                my $curraref = $aref->{$k};
 #                my $ret = ParseCore($aref->{$k}, $v);
 #                if ( $BailOnError && $IsError ) {
 #                    last;
 #                }
 #                if (defined $ret) {
 #                    $curraref->{'#next'} = $ret;
 #                }
 #            }
 #            else
 #            {
 #              # ERRORS
 #                print "Unbalanced '$4' at position = ", $-[0];
 #                $IsError = 1;
 #     
 #                # Decide to continue here ..
 #                # If BailOnError is set, just unwind recursion. 
 #                # -------------------------------------------------
 #                if ( $BailOnError ) {
 #                   last;
 #                }
 #            }
 #         }
 #         return $k;
 #     }
 #     
 #     #================================================
 #     __DATA__
 #     some html content here top base
 #     <!--block:first-->
 #         <table border="1" style="color:red;">
 #         <tr class="lines">
 #             <td align="left" valign="<--valign-->">
 #         <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
 #         <!--hello--> <--again--><!--world-->
 #         some html content here 1 top
 #         <!--block:second-->
 #             some html content here 2 top
 #             <!--block:third-->
 #                 some html content here 3 top
 #                 <!--block:fourth-->
 #                     some html content here 4 top
 #                     <!--block:fifth-->
 #                         some html content here 5a
 #                         some html content here 5b
 #                     <!--endblock-->
 #                 <!--endblock-->
 #                 some html content here 3a
 #                 some html content here 3b
 #             <!--endblock-->
 #             some html content here 2 bottom
 #         <!--endblock-->
 #         some html content here 1 bottom
 #     <!--endblock-->
 #     some html content here1-5 bottom base
 #     
 #     some html content here 6-8 top base
 #     <!--block:six-->
 #         some html content here 6 top
 #         <!--block:seven-->
 #             some html content here 7 top
 #             <!--block:eight-->
 #                 some html content here 8a
 #                 some html content here 8b
 #             <!--endblock-->
 #             some html content here 7 bottom
 #         <!--endblock-->
 #         some html content here 6 bottom
 #     <!--endblock-->
 #     some html content here 6-8 bottom base
 # 
 # Output >>
 # 
 #     Base======================
 #     some html content here top base
 #     
 #     some html content here1-5 bottom base
 #     
 #     some html content here 6-8 top base
 #     
 #     some html content here 6-8 bottom base
 #     
 #     First======================
 #     
 #         <table border="1" style="color:red;">
 #         <tr class="lines">
 #             <td align="left" valign="<--valign-->">
 #         <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
 #         <!--hello--> <--again--><!--world-->
 #         some html content here 1 top
 #         
 #         some html content here 1 bottom
 #     
 #     Second======================
 #     
 #             some html content here 2 top
 #             
 #             some html content here 2 bottom
 #         
 #     Third======================
 #     
 #                 some html content here 3 top
 #                 
 #                 some html content here 3a
 #                 some html content here 3b
 #             
 #     Fourth======================
 #     
 #                     some html content here 4 top
 #                     
 #                 
 #     Fifth======================
 #     
 #                         some html content here 5a
 #                         some html content here 5b
 #                     
 #     Six======================
 #     
 #         some html content here 6 top
 #         
 #         some html content here 6 bottom
 #     
 #     Seven======================
 #     
 #             some html content here 7 top
 #             
 #             some html content here 7 bottom
 #         
 #     Eight======================
 #     
 #                 some html content here 8a
 #                 some html content here 8b
 #         
Votes: 3

Stack Overflow user

Posted 2016-05-13 16:54:57

How about the following (using the .NET regex engine)?

resultString = Regex.Replace(subjectString, 
    @"\barray\(            # Match 'array('
    (                      # Capture in group 1:
     (?>                   # Start an atomic group:
      (?:                  # Either match
       (?!\barray\(|[()])  # only if we're not before another array or parens
       .                   # any character
      )+                   # once or more
     |                     # or
      \( (?<Depth>)        # match '(' (and increase the nesting counter)
     |                     # or
      \) (?<-Depth>)       # match ')' (and decrease the nesting counter).
     )*                    # Repeat as needed.
     (?(Depth)(?!))        # Assert that the nesting counter is at zero.
    )                      # End of capturing group.
    \)                     # Then match ')'.", 
    "[$1]", RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

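This regex matches array(...) where ... may contain anything except another array(...), so it only matches the most deeply nested occurrences. It does allow other nested (and correctly balanced) parentheses within ..., but it does not check whether they are semantic parentheses or whether they sit inside strings or comments.

In other words, something like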

array(
   'name' => 'Hugo ((( Hurley',
   'numbers' => (4 + 8 + 15 + 16 + 23 + 42) / 108
)

would fail to match (correctly).
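One way around the string caveat, sketched here in Python rather than as a .NET regex, is to tokenize quoted strings first so that parentheses inside them are never counted. This is a different technique from the regex above: a single pass with an explicit stack of openers. The names TOKEN and convert_arrays are hypothetical, only simple single- and double-quoted strings with backslash escapes are recognised, and comments or heredocs are not handled.

```python
import re

# One token per alternative; anything else is skipped by finditer.
TOKEN = re.compile(
    r"""
      (?P<string>'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*")  # quoted string, ignored
    | (?P<array>\barray\s*\()                          # 'array(' opener
    | (?P<open>\()                                     # plain '('
    | (?P<close>\))                                    # ')'
    """,
    re.VERBOSE | re.IGNORECASE | re.DOTALL,
)


def convert_arrays(src):
    """Replace each balanced array( ... ) with [ ... ], ignoring parentheses
    that appear inside quoted strings. Unbalanced openers are left alone."""
    stack = []   # pending openers: (kind, start, end)
    edits = []   # (start, end, replacement) for matched array pairs only
    for m in TOKEN.finditer(src):
        kind = m.lastgroup
        if kind in ("array", "open"):
            stack.append((kind, m.start(), m.end()))
        elif kind == "close" and stack:
            opener, start, end = stack.pop()
            if opener == "array":
                edits.append((start, end, "["))
                edits.append((m.start(), m.end(), "]"))

    # Apply the recorded edits from left to right.
    out = []
    pos = 0
    for start, end, text in sorted(edits):
        out.append(src[pos:start])
        out.append(text)
        pos = end
    out.append(src[pos:])
    return "".join(out)


print(convert_arrays("array('Hugo ((( Hurley', (4 + 8) / 108)"))
```

Recording the edits and applying them only after the scan means an array( that never finds its closing parenthesis is left exactly as it was written.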

You need to apply that regex iteratively until it no longer modifies its input. In the case of your example, two iterations would be enough.
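The repeat-until-stable loop can be sketched in Python as follows. Python's re module has no balancing groups, so this sketch swaps in a simplified pattern that tolerates only one level of plain parentheses inside an array body; that restriction, and the names INNERMOST_ARRAY and convert, are assumptions of the example and not part of the .NET answer.

```python
import re

# Innermost array(...): the body may not contain another 'array(' and may
# contain plain parentheses nested at most one level deep (a simplification).
INNERMOST_ARRAY = re.compile(
    r"""
    \barray\(                  # match 'array('
    (                          # group 1: the array body
      (?:
        (?!\barray\()[^()]     # a non-paren char that does not start 'array('
      | \( [^()]* \)           # or one level of plain, balanced parentheses
      )*
    )
    \)                         # the closing ')'
    """,
    re.VERBOSE,
)


def convert(text):
    """Apply the innermost-array replacement until nothing changes.
    Returns the converted text and the number of passes that made changes."""
    passes = 0
    while True:
        new_text, count = INNERMOST_ARRAY.subn(r"[\1]", text)
        if count == 0:
            return text, passes
        text = new_text
        passes += 1


result, passes = convert("$d = array(f(), array('n', (4 + 8) / 108));")
print(result, passes)
```

Each pass converts the deepest remaining arrays, which turns their delimiters into square brackets and so exposes the next level out; for two levels of nesting, as in the question, the loop makes two productive passes before it stops.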

Votes: 2
Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/37215006
