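I would like HTML Agility Pack to close any open 'option' tags while still preserving their inner text. My goal is this: the C# code I have written relies on the option end tag appearing after the inner text.
Here is the original HTML: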
<select id="Province" >
<option value=""> -- Select province --</option>
<option value="1">Alberta
<option value="2">British Columbia
<option value="3">Manitoba
<option value="4">New Brunswick
<option value="5">Newfoundland
<option value="6">Northwest Territories
<option value="7">Nova Scotia
<option value="8">Nunavut
<option value="9">Ontario
<option value="10">Prince Edward Island
<option value="11">Quebec
<option value="12">Saskatchewan
<option value="13">Yukon
</select>

Here is the HTML as formatted by HTML Agility Pack:
<select id="Province" >
<option value=""> -- Select province --</option>
<option value="1"></option>Alberta
<option value="2"></option>British Columbia
<option value="3"></option>Manitoba
<option value="4"></option>New Brunswick
<option value="5"></option>Newfoundland
<option value="6"></option>Northwest Territories
<option value="7"></option>Nova Scotia
<option value="8"></option>Nunavut
<option value="9"></option>Ontario
<option value="10"></option>Prince Edward Island
<option value="11"></option>Quebec
<option value="12"></option>Saskatchewan
<option value="13"></option>Yukon
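</select>

As you can see, the inner text is not enclosed: each closing tag is placed before the text instead of after it. Is it possible to add the closing tag after the inner text?
For example:
<option value="1">Alberta</option>

Below is the C# code used to parse the HTML: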
static void LoadProvinces()
{
    //Read the HTML File and save it to the string 'rawProvinces'
    System.IO.StreamReader myFile = new System.IO.StreamReader("ProvincesCheckout.htm");
    string rawProvinces = myFile.ReadToEnd();
    myFile.Close();
    //This tells HTML-Agility-Pack to close all open Option Tags
    HtmlNode.ElementsFlags["option"] = HtmlElementFlag.Closed;
    //Load the rawProvinces string into HTML-Agility-Pack
    HtmlDocument htmlDoc = new HtmlDocument();
    htmlDoc.LoadHtml(rawProvinces);
    //Convert the parsed HTML to the string variable 'parsedHtml' and save it to 'hap.htm'
    string parsedHtml = htmlDoc.DocumentNode.OuterHtml;
    System.IO.StreamWriter file = new System.IO.StreamWriter("hap.htm");
    file.WriteLine(parsedHtml);
    file.Close();
}

Posted on 2014-05-21 13:17:15
For some reason it doesn't work, although it should. However, you can also accomplish this with the String class and its methods:
// Get all option elements; SelectNodes returns null when nothing matches
HtmlNodeCollection nodes = htmlDoc.DocumentNode.SelectNodes("//option");
if (nodes != null)
{
    foreach (HtmlNode node in nodes)
    {
        // The text we want to close with </option> is parsed as the option's next sibling
        HtmlNode text = node.NextSibling;
        if (text == null) continue;
        // Position just after that text; TrimEnd (not Trim) so leading whitespace does not shift the offset
        int nextPosition = rawProvinces.IndexOf(text.OuterHtml, StringComparison.Ordinal) + text.OuterHtml.TrimEnd().Length;
        // Check that the text is not already followed by a </option> end tag
        if (string.Compare(rawProvinces, nextPosition, "</option", 0, 8, StringComparison.OrdinalIgnoreCase) != 0)
        {
            // Add the closing tag
            rawProvinces = rawProvinces.Insert(nextPosition, "</option>");
        }
    }
}

https://stackoverflow.com/questions/23783272
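
The string-based idea above can also be applied before the markup is handed to HTML Agility Pack at all. The following is only a minimal sketch under stated assumptions, not the library's own mechanism: OptionTagFixer and CloseOpenOptions are hypothetical names, the regular expression assumes that option text contains no nested tags and that attribute values contain no '>' character, and the sample markup is a shortened copy of the question's HTML.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionTagFixer
{
    // Hypothetical helper (not part of HTML Agility Pack).
    // Matches an <option ...> start tag, its text, and any trailing whitespace,
    // but only when the next tag is something other than </option>.
    private static readonly Regex OpenOption = new Regex(
        @"(<option\b[^>]*>)([^<]*?)(\s*)(?=<(?!/option\b))",
        RegexOptions.IgnoreCase);

    public static string CloseOpenOptions(string html)
    {
        // Re-emit the tag and its text, add the end tag, then restore the whitespace.
        return OpenOption.Replace(html, "$1$2</option>$3");
    }
}

public static class Program
{
    public static void Main()
    {
        // Shortened copy of the markup from the question.
        string raw =
            "<select id=\"Province\" >\n" +
            "<option value=\"\"> -- Select province --</option>\n" +
            "<option value=\"1\">Alberta\n" +
            "<option value=\"2\">British Columbia\n" +
            "</select>";

        Console.WriteLine(OptionTagFixer.CloseOpenOptions(raw));
    }
}
```

With this input, the option that already has an end tag is left unchanged, and each unclosed one takes the form <option value="1">Alberta</option>. The result can then be passed to htmlDoc.LoadHtml in place of the raw string.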