NullReferenceException in HtmlAgilityPack

Stack Overflow user
Asked 2012-04-14 15:50:12
1 answer, 5.6K views, 0 followers, 5 votes

I am trying to use XPath to extract a link from the URL below:

string url = "http://www.album-cover-art.org/search.php?q=Ruin+-+Live+Album+Version+Lamb+of+God";

My code:

HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc = web.Load(url); //Exception generated here Line 23

if (htmlDoc.DocumentNode != null)
{
    HtmlNode linkNode = htmlDoc.DocumentNode.SelectSingleNode(".//*[@id='related_search_row']/img/@src");
    if (linkNode != null)
        Console.WriteLine(linkNode.InnerText);
}

The code above compiles fine, but when I run it, it throws an exception:

Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object.

Full stack trace:

System.NullReferenceException: Object reference not set to an instance of an object.
   at HtmlAgilityPack.HtmlDocument.ReadDocumentEncoding(HtmlNode node) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1916
   at HtmlAgilityPack.HtmlDocument.PushNodeEnd(Int32 index, Boolean close) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1805
   at HtmlAgilityPack.HtmlDocument.Parse() in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1468
   at HtmlAgilityPack.HtmlDocument.Load(TextReader reader) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 769
   at HtmlAgilityPack.HtmlWeb.Get(Uri uri, String method, String path, HtmlDocument doc, IWebProxy proxy, ICredentials creds) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1515
   at HtmlAgilityPack.HtmlWeb.LoadUrl(Uri uri, String method, WebProxy proxy, NetworkCredential creds) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1563
   at HtmlAgilityPack.HtmlWeb.Load(String url, String method) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1149
   at HtmlAgilityPack.HtmlWeb.Load(String url) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1107
   at ScreenScrapping.Program.Main(String[] args) in c:\Users\ranveer\csharp\ScreenScrapping\ScreenScrapping\Program.cs:line 23

So my question is: why am I getting this exception?


1 Answer

Stack Overflow user

Accepted answer

Answered 2012-04-14 16:29:46

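This is a bug in HtmlAgilityPack. The document you are trying to parse contains <meta http-equiv="Content-Type" content="text/html; charset=iso-utf-8">, and the charset value (iso-utf-8) cannot be resolved by the Agility Pack to a valid encoding name. As Simon Mourier said, this bug was introduced in 1.4.0.0.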
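
To illustrate why that charset label is a problem, here is a minimal sketch using only the .NET base class library. It does not reproduce HtmlAgilityPack's internal code path; it only shows that .NET itself does not recognize "iso-utf-8" as an encoding name. `CharsetProbe` and `ResolveOrFallback` are hypothetical names introduced for this example.

```csharp
using System;
using System.Text;

static class CharsetProbe
{
    // Hypothetical helper (not part of HtmlAgilityPack): resolve a charset
    // label to an Encoding, or return the supplied fallback when .NET
    // rejects the name.
    public static Encoding ResolveOrFallback(string charset, Encoding fallback)
    {
        try
        {
            // Throws ArgumentException for names .NET does not know,
            // such as "iso-utf-8".
            return Encoding.GetEncoding(charset);
        }
        catch (ArgumentException)
        {
            return fallback;
        }
    }
}
```

Calling `CharsetProbe.ResolveOrFallback("iso-utf-8", Encoding.UTF8)` falls back to UTF-8, which is the same choice the workaround makes by passing `Encoding.UTF8` explicitly.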

To work around it, load the document from the response stream yourself and set the encoding manually, like this:

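// Requires: using System.Net; using System.Text; using HtmlAgilityPack;
var htmlDoc = new HtmlDocument();
// Do not read the encoding declared in the document's meta tag.
htmlDoc.OptionReadEncoding = false;
var request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
using (var response = (HttpWebResponse)request.GetResponse())
{
    using (var stream = response.GetResponseStream())
    {
        // Decode the response as UTF-8 explicitly.
        htmlDoc.Load(stream, Encoding.UTF8);
    }
}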
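
The same workaround can be tried offline. The sketch below is an illustration under stated assumptions, not the answerer's code: `SampleHtml` is made-up markup that only mimics the bad charset declaration and the `related_search_row` element from the question, and `OfflineWorkaroundDemo` is a hypothetical class name. It reads the attribute with `GetAttributeValue` on the `img` element instead of the question's `@src` XPath step, which is an editorial choice.

```csharp
using System;
using System.IO;
using System.Text;
using HtmlAgilityPack;

static class OfflineWorkaroundDemo
{
    // Hypothetical sample markup, not the real page content.
    const string SampleHtml =
        "<html><head>" +
        "<meta http-equiv='Content-Type' content='text/html; charset=iso-utf-8'>" +
        "</head><body>" +
        "<div id='related_search_row'><img src='cover.jpg'></div>" +
        "</body></html>";

    public static string ReadImageSrc()
    {
        var htmlDoc = new HtmlDocument();
        // Skip the declared (invalid) charset, as in the workaround above.
        htmlDoc.OptionReadEncoding = false;

        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(SampleHtml)))
        {
            htmlDoc.Load(stream, Encoding.UTF8);
        }

        // Select the img element, then read its src attribute.
        HtmlNode img = htmlDoc.DocumentNode.SelectSingleNode("//*[@id='related_search_row']/img");
        return img == null ? string.Empty : img.GetAttributeValue("src", string.Empty);
    }
}
```

`OfflineWorkaroundDemo.ReadImageSrc()` returns the `src` of the sample image, or an empty string when the element is not found.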
Votes: 6
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/10151993
