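I'm trying to move parentheses from inside a certain tag to outside it, i.e. when an opening parenthesis immediately follows the opening tag, or a closing parenthesis immediately precedes the closing tag. Examples: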
<italic>(When a parenthetical sentence stands on its own)</italic>
<italic>(When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own)</italic>
These lines should be changed to:
(<italic>When a parenthetical sentence stands on its own</italic>)
(<italic>When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own</italic>)
However, strings like the following three should remain unchanged:
<italic>(When) a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its (own)</italic>
<italic>When a parenthetical sentence stands (on) its own</italic>
But the following strings:
<italic>((When) a parenthetical sentence stands on its own</italic>
<italic>((When) a parenthetical sentence stands on its own)</italic>
<italic>(When) a parenthetical sentence stands on its own)</italic>
<italic>When a parenthetical sentence stands on its (own))</italic>
<italic>(When a parenthetical sentence stands on its (own)</italic>
should become, after replacement:
(<italic>(When) a parenthetical sentence stands on its own</italic>
(<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>When a parenthetical sentence stands on its (own)</italic>)
(<italic>When a parenthetical sentence stands on its (own)</italic>
Tags may be nested inside the <italic>...</italic> tags, and a line may contain multiple <italic>...</italic> strings. Also, if an <italic>...</italic> tag is nested inside <inline-formula>...</inline-formula>, those tags should be ignored.
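Can I do this with regex? If not, how else could I do it?
My approach is as follows (I'm still not sure whether it covers all possible cases):
Step 1: <italic>( ---> (<italic>
Find <italic>( where the tag is not followed by a balanced parenthesis group that is itself not followed by the closing tag. Matching is only allowed within a single line.
Find: (<(italic)>)(?!(\((?>(?:(?![()\r\n]).)++|(?3))*+\))(?!</$2\b))(\()
Replace with: $4$1
Step 2: )</italic> ---> </italic>)
Find )</italic> where the closing parenthesis is not preceded by a balanced parenthesis group that is itself not preceded by the opening tag. Matching is only allowed within a single line.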
(\))(?<!(?<!<(italic)>)(\((?>(?:(?![()\r\n]).)++|(?3))*+\)))(</2\b>)
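The patterns above rely on the recursive subpattern (?3) and on possessive quantifiers (++ and *+) to match a balanced parenthesis group. Those are PCRE-style features; as far as I know, .NET's regex engine supports neither, and the usual substitute there is a balancing group. The following is only a sketch of that substitute, assuming .NET and a single line of input. The class and method names (BalancedParens, LeadingGroupLength) are made up for illustration and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedParens
{
    // \A anchors at the start of the text. The "open" stack gains an entry for
    // every nested '(' and loses one for every ')' that closes it; the
    // conditional (?(open)(?!)) rejects a match that leaves any '(' unclosed.
    private static readonly Regex LeadingGroup = new Regex(
        @"\A\((?>[^()\r\n]+|(?<open>\()|(?<-open>\)))*(?(open)(?!))\)");

    // Returns the length of the balanced "( ... )" group that starts the text,
    // or -1 if the text does not start with one.
    public static int LeadingGroupLength(string text)
    {
        Match m = LeadingGroup.Match(text);
        return m.Success ? m.Length : -1;
    }
}
```

For example, LeadingGroupLength("(When) a") finds the six-character group "(When)", while LeadingGroupLength("((When) a") reports no group, because the first parenthesis is never closed.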
Posted on 2017-12-07 18:19:50
You can do this in a few different ways, starting by defining when a tag counts as replaceable.
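One way to pin down "replaceable" is as a check on the text between the opening and closing tag: the leading "(" moves out if it is never closed or is closed by the very last character, and the trailing ")" moves out if it is never opened or is opened by the very first character. The sketch below is my reading of the examples in the question, not code from the original post. ParenRules, Classify and Rewrite are hypothetical names, nested markup is treated as plain text, and the tuple syntax assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

public static class ParenRules
{
    // Decides, for the text between two tags, whether the leading '(' and the
    // trailing ')' should be moved outside the tag.
    public static (bool MoveLeading, bool MoveTrailing) Classify(string inner)
    {
        if (string.IsNullOrEmpty(inner)) return (false, false);

        int last = inner.Length - 1;
        var open = new Stack<int>();  // indices of '(' still waiting for a ')'
        int matchOfFirst = -1;        // index of the ')' that closes inner[0], if any
        bool lastIsUnmatched = false; // true if the final ')' has no partner

        for (int i = 0; i < inner.Length; i++)
        {
            if (inner[i] == '(')
            {
                open.Push(i);
            }
            else if (inner[i] == ')')
            {
                if (open.Count == 0)
                {
                    if (i == last) lastIsUnmatched = true;
                }
                else if (open.Pop() == 0)
                {
                    matchOfFirst = i;
                }
            }
        }

        bool startsWithParen = inner[0] == '(';
        bool endsWithParen = inner[last] == ')';
        bool wrapsWhole = startsWithParen && matchOfFirst == last;

        bool moveLeading = startsWithParen && (matchOfFirst == -1 || wrapsWhole);
        bool moveTrailing = endsWithParen && (lastIsUnmatched || wrapsWhole);
        return (moveLeading, moveTrailing);
    }

    // Rebuilds the tag with the selected parentheses moved outside it.
    public static string Rewrite(string openTag, string inner, string closeTag)
    {
        var (moveLeading, moveTrailing) = Classify(inner);
        int start = moveLeading ? 1 : 0;
        int end = inner.Length - (moveTrailing ? 1 : 0);
        return (moveLeading ? "(" : "") + openTag
             + inner.Substring(start, end - start)
             + closeTag + (moveTrailing ? ")" : "");
    }
}
```

Matching each parenthesis to its partner, rather than only counting depth, matters for text such as "(When) a sentence (own)": the net depth is zero and the text both starts and ends with a parenthesis, yet neither one wraps the whole text, so under this reading nothing should move.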
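This problem seems to lend itself to a parser approach that keeps track of parenthesis state (whether the tag text starts with a parenthesis, and how the parentheses are nested at the current point). Writing a parser lets us build the replacement constructively instead of using regex to search for and replace substrings, and it is naturally recursive, so it can handle nesting. Doing this with regular expressions seems a bit convoluted. Here is what I came up with.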
using System;
using System.IO;
using System.Text;

namespace ParenParser {
    public class Program
    {
        // Wraps a string in a readable stream so it can be fed to a StreamReader.
        // The writer is deliberately not disposed: disposing it would close the stream.
        public static Stream GenerateStreamFromString(string s)
        {
            MemoryStream stream = new MemoryStream();
            StreamWriter writer = new StreamWriter(stream);
            writer.Write(s);
            writer.Flush();
            stream.Position = 0;
            return stream;
        }

        // Root of the parser: copies plain text through and hands every '<' to ProcessTag.
        public static String Process(StreamReader s) { // root
            StringBuilder output = new StringBuilder();
            while (!s.EndOfStream) {
                var ch = Convert.ToChar(s.Read());
                if (ch == '<') {
                    output.Append(ProcessTag(s, true));
                } else {
                    output.Append(ch);
                }
            }
            return output.ToString();
        }

        // Parses one tag (opening tag, inner text, closing tag) and returns it with
        // a wrapping or unbalanced parenthesis moved outside the tag.
        // skipOpeningBracket: the caller has already consumed the '<'.
        public static String ProcessTag(StreamReader s, bool skipOpeningBracket = true) {
            int currentParenDepth = 0; // '(' minus ')' seen in this tag's own text
            StringBuilder openingTag = new StringBuilder(), allTagText = new StringBuilder(), closingTag = new StringBuilder();
            bool inOpeningTag = false, inClosingTag = false;
            if (skipOpeningBracket) {
                inOpeningTag = true;
                openingTag.Append('<');
                skipOpeningBracket = false;
            }
            while (!s.EndOfStream) {
                var ch = Convert.ToChar(s.Read());
                if (ch == '<') { // start of a tag
                    // Peek returns -1 at end of input, so keep it as an int;
                    // Convert.ToChar(-1) would throw an OverflowException.
                    int nextCh = s.Peek();
                    if (nextCh == '/') { // closing tag!
                        closingTag.Append(ch);
                        inClosingTag = true;
                    } else if (openingTag.Length != 0) { // already seen a tag, recurse
                        // Text returned by the nested tag is appended as-is;
                        // its parentheses do not change currentParenDepth.
                        allTagText.Append(ProcessTag(s, true));
                        continue;
                    } else {
                        openingTag.Append(ch);
                        inOpeningTag = true;
                    }
                }
                else if (inOpeningTag) {
                    openingTag.Append(ch);
                    if (ch == '>') {
                        inOpeningTag = false;
                    }
                }
                else if (inClosingTag) {
                    closingTag.Append(ch);
                    if (ch == '>') {
                        // Done! Decide which parentheses to move outside the tag.
                        var allTagTextString = allTagText.ToString();
                        if (allTagTextString.Length > 0 && allTagTextString[0] == '(' && allTagTextString[allTagTextString.Length - 1] == ')' && currentParenDepth == 0) {
                            // starts with '(' and ends with ')', depth balanced: move both
                            return "(" + openingTag.ToString() + allTagTextString.Substring(1, allTagTextString.Length - 2) + closingTag.ToString() + ")";
                        } else if (allTagTextString.Length > 0 && allTagTextString[0] == '(' && currentParenDepth > 0) { // unclosed
                            return "(" + openingTag.ToString() + allTagTextString.Substring(1, allTagTextString.Length - 1) + closingTag.ToString();
                        } else if (allTagTextString.Length > 0 && allTagTextString[allTagTextString.Length - 1] == ')' && currentParenDepth < 0) { // unopened
                            return openingTag.ToString() + allTagTextString.Substring(0, allTagTextString.Length - 1) + closingTag.ToString() + ")";
                        } else {
                            return openingTag.ToString() + allTagTextString + closingTag.ToString();
                        }
                    }
                }
                else
                {
                    allTagText.Append(ch);
                    if (ch == '(') {
                        currentParenDepth++;
                    }
                    else if (ch == ')') {
                        currentParenDepth--;
                    }
                }
            }
            // Input ended before the closing tag was complete: return what was read, unchanged.
            return openingTag.ToString() + allTagText.ToString() + closingTag.ToString();
        }

        public static void Main()
        {
            var testCases = new String[] {
                // Should change
                "<italic>(When a parenthetical sentence stands on its own)</italic>",
                "<italic>(When a parenthetical sentence stands on its own</italic>",
                "<italic>When a parenthetical sentence stands on its own)</italic>",
                // Should remain unchanged
                "<italic>(When) a parenthetical sentence stands on its own</italic>",
                "<italic>When a parenthetical sentence stands on its (own)</italic>",
                "<italic>When a parenthetical sentence stands (on) its own</italic>",
                // Should be changed
                "<italic>((When) a parenthetical sentence stands on its own</italic>",
                "<italic>((When) a parenthetical sentence stands on its own)</italic>",
                "<italic>(When) a parenthetical sentence stands on its own)</italic>",
                "<italic>When a parenthetical sentence stands on its (own))</italic>",
                "<italic>(When a parenthetical sentence stands on its (own)</italic>",
                // Other cases
                "<italic>(Try This on!)</italic>",
                "<italic><italic>(Try This on!)</italic></italic>",
                "<italic></italic>",
                "",
                "()",
                "<italic>()</italic>",
                "<italic>"
            };
            foreach (var testCase in testCases) {
                using (var testCaseStreamReader = new StreamReader(GenerateStreamFromString(testCase))) {
                    Console.WriteLine(testCase + " --> " + Process(testCaseStreamReader));
                }
            }
        }
    }
}
The test case results look like this:
<italic>(When a parenthetical sentence stands on its own)</italic> --> (<italic>When a parenthetical sentence stands on its own</italic>)
<italic>(When a parenthetical sentence stands on its own</italic> --> (<italic>When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own)</italic> --> <italic>When a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own</italic> --> <italic>(When) a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its (own)</italic> --> <italic>When a parenthetical sentence stands on its (own)</italic>
<italic>When a parenthetical sentence stands (on) its own</italic> --> <italic>When a parenthetical sentence stands (on) its own</italic>
<italic>((When) a parenthetical sentence stands on its own</italic> --> (<italic>(When) a parenthetical sentence stands on its own</italic>
<italic>((When) a parenthetical sentence stands on its own)</italic> --> (<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own)</italic> --> <italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>When a parenthetical sentence stands on its (own))</italic> --> <italic>When a parenthetical sentence stands on its (own)</italic>)
<italic>(When a parenthetical sentence stands on its (own)</italic> --> (<italic>When a parenthetical sentence stands on its (own)</italic>
<italic>(Try This on!)</italic> --> (<italic>Try This on!</italic>)
<italic><italic>(Try This on!)</italic></italic> --> (<italic><italic>Try This on!</italic></italic>)
<italic></italic> --> <italic></italic>
-->
() --> ()
<italic>()</italic> --> (<italic></italic>)
<italic> --> <italic>
https://stackoverflow.com/questions/47699020
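The question also asks that <italic>...</italic> tags nested inside <inline-formula>...</inline-formula> be ignored, which none of the test cases above exercise. One possible way to add that, sketched here under assumptions, is to cut each line into formula and non-formula segments and run the parenthesis logic only on the latter. FormulaSkipper and ApplyOutsideFormulas are hypothetical names, and the sketch assumes formulas are not nested in each other and do not sit inside an <italic> tag (such a tag would be split across segments).

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class FormulaSkipper
{
    // Matches one <inline-formula>...</inline-formula> span, non-greedily.
    private static readonly Regex Formula = new Regex(
        @"<inline-formula\b[^>]*>.*?</inline-formula>", RegexOptions.Singleline);

    // Applies transform to the text outside formulas and copies formulas verbatim.
    public static string ApplyOutsideFormulas(string line, Func<string, string> transform)
    {
        var output = new StringBuilder();
        int pos = 0;
        foreach (Match m in Formula.Matches(line))
        {
            // Text before this formula goes through the transform.
            output.Append(transform(line.Substring(pos, m.Index - pos)));
            // The formula itself is copied unchanged.
            output.Append(m.Value);
            pos = m.Index + m.Length;
        }
        // Remaining text after the last formula.
        output.Append(transform(line.Substring(pos)));
        return output.ToString();
    }
}
```

The transform argument could be a small wrapper that feeds each segment through GenerateStreamFromString and Process from the program above. Passing it in as a delegate keeps this sketch independent of how the parenthesis logic itself is implemented.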