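I have a bunch of XML files, and I want to detect and remove the empty tags in them. For example:
<My></My>
<Your/>
<sometags>
    <his>
    </his>
    <hasContent>sdfaf</hasContent>
</sometags>
These are the kinds of empty tags I want to remove (My, Your, his). Does PowerShell support this kind of empty-tag detection, no matter how deeply the tags are nested inside other tags?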
Posted on 2015-05-27 06:40:25
# Pretty-prints an XML document as an indented string
function Format-XML
{
    param (
        [parameter(Mandatory = $true)][xml] $xml,
        [parameter(Mandatory = $false)][int] $indent = 4
    )
    try
    {
        $Error.Clear()
        $StringWriter = New-Object System.IO.StringWriter
        $XmlWriter = New-Object System.Xml.XmlTextWriter $StringWriter
        $XmlWriter.Formatting = "indented"
        $XmlWriter.Indentation = $indent
        $xml.WriteContentTo($XmlWriter)
        $XmlWriter.Flush()
        $StringWriter.Flush()
        return $StringWriter.ToString()
    }
    catch
    {
        # Report the failure and return nothing
        Write-Host "$($MyInvocation.InvocationName): $_"; return $null
    }
}
$xml = [xml] @"
<document>
<My></My>
<Your/>
<sometags>
<his>
</his>
<hasContent>sdfaf</hasContent>
</sometags>
</document>
"@
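# The "magic" part is in this XPath expression: it selects elements that have
# no attributes, no child elements, and no non-empty text.
# @(...) copies the matches into an array before any node is removed.
$nodes = @($xml.SelectNodes("//*[count(@*) = 0 and count(child::*) = 0 and not(string-length(text()) > 0)]"))
$nodes | %{
    # RemoveChild returns the removed node; discard it to keep the output clean
    [void] $_.ParentNode.RemoveChild($_)
}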
Format-Xml $xml
Posted on 2015-05-27 07:18:31
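I'm not very familiar with PowerShell, so this only adds to @DavidBrabant's good answer, specifically the XPath part. The XPath for detecting empty elements can be a bit simpler:
//*[not(@*) and not(*) and not(normalize-space())]
The predicate (everything inside []) checks, in order, that the current element has no attributes, no child elements, and no non-empty text.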
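To show how this shorter XPath might be used from PowerShell, here is a minimal sketch. The function name Remove-EmptyXmlElement is hypothetical and not part of any answer here. It assumes two things the question does not state: that a parent left empty by a removal should be removed as well, and that the root element should always be kept. Elements that contain only comments also count as empty under this predicate.

```powershell
# Hypothetical helper (an illustration, not from the answers): removes empty
# elements in place and repeats until no more are found, so parents that
# become empty after a removal are cleaned up too.
function Remove-EmptyXmlElement {
    param (
        [Parameter(Mandatory = $true)][xml] $Xml
    )
    # No attributes, no child elements, no non-whitespace text
    $xpath = '//*[not(@*) and not(*) and not(normalize-space())]'
    do {
        # @() snapshots the matches before the tree is modified.
        # The filter skips the root element, whose parent is the document itself.
        $empty = @($Xml.SelectNodes($xpath) |
            Where-Object { $_.ParentNode -is [System.Xml.XmlElement] })
        foreach ($node in $empty) {
            [void] $node.ParentNode.RemoveChild($node)
        }
    } while ($empty.Count -gt 0)
}

# Usage: <outer> only becomes empty after <inner/> is removed
$doc = [xml] '<document><My></My><Your/><sometags><his>
</his><hasContent>sdfaf</hasContent></sometags><outer><inner/></outer></document>'
Remove-EmptyXmlElement -Xml $doc
$doc.OuterXml
```

With this sample, only sometags and its hasContent child should remain under document. To process many files, the same function could be called between loading each file into an [xml] variable and calling its Save method.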
Posted on 2015-05-27 07:22:18
You should look for a solution that uses System.Xml.XmlDocument. However, it can also be done with a regex:
$xml = @"
<document>
<My></My>
<Your/>
<sometags>
<his>
</his>
<hasContent>sdfaf</hasContent>
</sometags>
</document>
"@
$xml -replace '(?:<(\w*)>\s*<\/\1>)|<(\w*)\/>', ''
https://stackoverflow.com/questions/30474517