
Import ONIX XML as a dataset, ignoring HTML tags

Stack Overflow user
Asked 2013-07-16 09:31:16

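A while ago I wrote a process to import ONIX files into a retail database system. (ONIX is an XML standard that publishers use to distribute their catalogue information.) The process imports the XML file directly into a DataSet, and it works well enough for most of the files we receive, but occasionally there are exceptions.

In this case, the file I am trying to import contains HTML markup in the product description field, which plays havoc with the standard DataSet.ReadXml() method because it tries to interpret the HTML tags as XML. Some ONIX files wrap such content in CDATA sections, which avoids the problem, but in this case the publisher chose to use a tag attribute to mark the field as HTML-formatted, as shown below: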

    <othertext>
        <d102>03</d102>
        <d104 textformat="05">
            <p>Enter a world where bloody battles, and heroic deeds combine in the historic struggle to unite Britain in the face of a common enemy.</p>
            <p>The third instalment in Bernard Cornwell’s King Alfred series, follows on from the outstanding previous novels The Last Kingdom and The Pale Horseman.</p>
            <p>The year is 878 and the Vikings have been thrown out of Wessex. Uhtred, fresh from fighting for Alfred in the battle to free Wessex, travels north to seek revenge for his father's death, killed in a bloody raid by Uhtred's old enemy, renegade Danish lord, Kjartan.</p>
            <p>While Kjartan lurks in his formidable stronghold of Dunholm, the north is overrun by chaos, rebellion and fear. Together with a small band of warriors, Uhtred plans his attack on his enemy, revenge fuelling his anger, resolute on bloody retribution. But, he finds himself betrayed and ends up on a desperate slave voyage to Iceland. Rescued by a remarkable alliance of old friends and enemies, he and his allies, together with Alfred the Great, are free to fight once more in a battle for power, glory and honour.</p>
            <p>‘The Lords of the North’ is a tale of England's making, a powerful story of betrayal, struggle and romance, set in an England torn apart by turmoil and upheaval.</p>
        </d104>
    </othertext>

The textformat="05" attribute indicates that the content is HTML.
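
To see why the raw markup causes trouble, the following minimal sketch loads a cut-down version of the fragment above with ReadXml() and lists the tables and columns that schema inference produces. The class and method names (InferenceDemo, LoadRaw, PrintSchema) and the shortened sample text are illustrative assumptions, not part of the original import process. Because <p> repeats inside <d104>, inference is expected to treat it as a separate child table rather than as text belonging to the description.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;

public static class InferenceDemo
{
    // Loads the XML into a DataSet, letting ReadXml infer the schema.
    public static DataSet LoadRaw(string xml)
    {
        var ds = new DataSet();
        using (var reader = new StringReader(xml))
        {
            ds.ReadXml(reader);
        }
        return ds;
    }

    // Prints one line per inferred table, followed by its column names.
    public static void PrintSchema(DataSet ds)
    {
        foreach (DataTable table in ds.Tables)
        {
            var columns = table.Columns.Cast<DataColumn>().Select(c => c.ColumnName);
            Console.WriteLine(table.TableName + ": " + string.Join(", ", columns));
        }
    }
}
```

Calling `InferenceDemo.PrintSchema(InferenceDemo.LoadRaw(xml))` on the fragment shows how the description is broken up across tables instead of arriving as a single HTML string.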

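Without writing custom code to interpret the HTML, is it still possible to import this using ReadXml(), or do I need to insert CDATA tags programmatically first to get around the problem?

Note: I don't want to strip out the HTML tags, because the data will be displayed on a website.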


1 Answer

Stack Overflow user

Accepted answer

Answered 2013-07-16 11:13:03

Here is a program written in LINQPad that finds the textformat="05" nodes and wraps their content in a CDATA section. See also this Stack Overflow post.

void Main()
{
    string xml = @"<othertext>
            <d102>03</d102>
            <d104 textformat=""05"">
                <p>Enter a world where bloody battles, and heroic deeds combine in the historic struggle to unite Britain in the face of a common enemy.</p>
                <p>The third instalment in Bernard Cornwell’s King Alfred series, follows on from the outstanding previous novels The Last Kingdom and The Pale Horseman.</p>
                <p>The year is 878 and the Vikings have been thrown out of Wessex. Uhtred, fresh from fighting for Alfred in the battle to free Wessex, travels north to seek revenge for his father's death, killed in a bloody raid by Uhtred's old enemy, renegade Danish lord, Kjartan.</p>
                <p>While Kjartan lurks in his formidable stronghold of Dunholm, the north is overrun by chaos, rebellion and fear. Together with a small band of warriors, Uhtred plans his attack on his enemy, revenge fuelling his anger, resolute on bloody retribution. But, he finds himself betrayed and ends up on a desperate slave voyage to Iceland. Rescued by a remarkable alliance of old friends and enemies, he and his allies, together with Alfred the Great, are free to fight once more in a battle for power, glory and honour.</p>
                <p>‘The Lords of the North’ is a tale of England's making, a powerful story of betrayal, struggle and romance, set in an England torn apart by turmoil and upheaval.</p>
            </d104>
        </othertext>";

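    // XmlDocument lives in System.Xml, which LINQPad imports by default.
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.LoadXml(xml);

    // Select every child of <othertext> that is flagged as HTML (textformat="05").
    var nodes = xmlDoc.SelectNodes("//othertext/*[@textformat='05']");
    foreach (XmlNode node in nodes)
    {
        // Capture the child markup as a string, then replace the children
        // with a single CDATA section holding that string.
        var cdata = xmlDoc.CreateCDataSection(node.InnerXml);
        node.InnerText = string.Empty;
        node.AppendChild(cdata);
        node.InnerXml.Dump(); // Dump() is a LINQPad extension method that prints the value
    }
}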
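
The program above stops at printing the wrapped node. As a rough sketch of how the same idea might feed into DataSet.ReadXml(), the class below wraps the markup and then loads the modified document. The names OnixCDataImport, WrapHtmlInCData and LoadOnix are hypothetical, and the XPath is widened to any element carrying textformat="05", which is an assumption about where that attribute can appear in a real file.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;
using System.Xml;

public static class OnixCDataImport
{
    // Replaces the child markup of every textformat="05" element
    // with a single CDATA section containing that markup as text.
    public static void WrapHtmlInCData(XmlDocument doc)
    {
        // Copy the matches to a list so the document can be edited while looping.
        var nodes = doc.SelectNodes("//*[@textformat='05']").Cast<XmlNode>().ToList();
        foreach (XmlNode node in nodes)
        {
            XmlCDataSection cdata = doc.CreateCDataSection(node.InnerXml);
            while (node.HasChildNodes)
            {
                node.RemoveChild(node.FirstChild);
            }
            node.AppendChild(cdata);
        }
    }

    // Parses the XML, wraps the HTML fields, then lets ReadXml infer the schema.
    public static DataSet LoadOnix(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        WrapHtmlInCData(doc);

        var ds = new DataSet();
        using (var reader = new StringReader(doc.OuterXml))
        {
            ds.ReadXml(reader);
        }
        return ds;
    }
}
```

With the sample fragment, the description should then arrive as one string in the d104 table, in the text column that schema inference normally names d104_Text, with the `<p>` tags intact for display on the website.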
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/17666613
