I have an XML file that I am trying to parse. It contains lines like these:
**</data_item>
</data_item>
</data_item>**
<xml version>
</data_item>
<some random text>
</data_item>
<some random text>
**</data_item>
</data_item>
</data_item>**
<xml version>
**</data_item>
</data_item>
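</data_item>**
The lines highlighted in bold are three </data_item> lines back to back (except for the last group of three). I want to delete two of them and keep only one. This happens 7-8 times in the file, and I am trying to use the string "xml version" to reach the two lines above it and delete them. Please help me with a sed one-liner that does this.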
Posted on 2014-05-08 22:28:19
Here is one way to do it with GNU sed:
$ sed '/<\/data_item>/{N;/<\/data_item>$/{N;$!{s/\n//;D}}}' file
</data_item>
<xml version>
</data_item>
<some random text>
</data_item>
<some random text>
</data_item>
<xml version>
</data_item>
</data_item>
</data_item>
Explanation:
sed '
/<\/data_item>/ {            # On a line matching the closing tag
    N                        # Append the next line to the pattern space
    /<\/data_item>$/ {       # If the pattern space now ends with the closing tag
        N                    # Append one more line to the pattern space
        $! {                 # Unless that was the last line of the input
            s/\n//           # Remove the first newline, joining the first two lines
            D                # Delete through the first newline and restart the cycle
        }
    }
}
' file
https://stackoverflow.com/questions/23553456
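The one-liner above is tuned to runs of exactly three closing tags. As a rough alternative sketch (not the answerer's method), the same cleanup can be keyed on the line that follows each run, which is closer to how the question describes it. This version swaps sed for awk. It assumes each closing tag sits alone on its line exactly as </data_item>, and that the marker line contains the text "xml version". The function name collapse_before_xml is made up for illustration.

```shell
# Hypothetical helper: collapse a run of </data_item> lines to a single line
# when the run is immediately followed by a line containing "xml version".
# Assumption: tags appear alone on their lines with no surrounding whitespace.
collapse_before_xml() {
  awk '
    $0 == "</data_item>" { run++; next }   # count the run instead of printing it
    {
      if (run > 0) {
        # Before the marker line keep one tag; otherwise restore the whole run.
        n = ($0 ~ /xml version/) ? 1 : run
        for (i = 0; i < n; i++) print "</data_item>"
        run = 0
      }
      print
    }
    END {
      # A run at the very end of the input is printed back unchanged.
      for (i = 0; i < run; i++) print "</data_item>"
    }
  ' "$@"
}
```

Called as collapse_before_xml file, it prints the cleaned text to standard output. A run at the end of the input, or one followed by any other line, is left as it is.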