HtmlAgilityPack - SelectNodes

Stack Overflow user
Asked 2018-01-11 00:27:51
Answers: 1 · Views: 752 · Followers: 0 · Votes: 0

I'm trying to retrieve a <p class> element.

<div class="thread-plate__details">
    <h3 class="thread-plate__title">(S) HexHunter BOW</h3>
    <p class="thread-plate__summary">created by Aazoth</p>  <!-- (THIS ONE) -->
</div>

But I've had no luck so far.

The code I'm using is below:

' the example url to scrape
Dim url As String = "http://services.runescape.com/m=forum/forums.ws?39,40,goto," & Label6.Text
Dim source As String = GetSource(url)

If source IsNot Nothing Then
    ' create a new html document and load the pages source
    Dim htmlDocument As New HtmlDocument
    htmlDocument.LoadHtml(source)

    ' Create a new collection of all href tags
    Dim nodes As HtmlNodeCollection = htmlDocument.DocumentNode.SelectNodes("//p[@class]")

    ' Using LINQ get all href values that start with http://
    ' of course there are others such as www.
    Dim links =
        (
            From node
            In nodes
            Let attribute = node.Attributes("class")
            Where attribute.Value.StartsWith("created by ")
            Select attribute.Value
        )

    Me.ListBox1a.Items.AddRange(links.ToArray)
    Dim o, j As Long
    For o = 0 To ListBox1a.Items.Count - 1
        For j = ListBox1a.Items.Count - 1 To (o + 1) Step -1
            If ListBox1a.Items(o) = ListBox1a.Items(j) Then
                ListBox1a.Items.Remove(ListBox1a.Items((j)))
            End If
        Next
    Next
    For i As Integer = 0 To Me.ListBox1a.Items.Count - 1
        Me.ListBox1a.Items(i) = Me.ListBox1a.Items(i).ToString.Replace("created by ", "")
    Next

    For Each s As String In ListBox1a.Items
        Dim lvi As New NetSeal.NSListView
        lvi.Text = s
        NsListView1.Items.Add(lvi.Text)
    Next
End If

It runs, but I can't get the 'created by XXX' text. I've tried many approaches without success, and I'd appreciate any help.

Thanks in advance, everyone.


1 Answer

Stack Overflow user

Accepted answer

Posted 2018-01-11 00:43:46

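It looks like you are checking attribute.Value for the wrong string. Here attribute is the class attribute, so its value is the class name, not the paragraph text. Change attribute.Value.StartsWith("created by ") to attribute.Value.StartsWith("thread-plate__summary").

To get the node's inner content, select it instead of the attribute value: Select node.InnerText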

' the example url to scrape
Dim url As String = "http://services.runescape.com/m=forum/forums.ws?39,40,goto," & Label6.Text
Dim source As String = GetSource(url)

If source IsNot Nothing Then
    ' create a new html document and load the page's source
    Dim htmlDocument As New HtmlDocument
    htmlDocument.LoadHtml(source)

    ' Select every <p> element that has a class attribute
    Dim nodes As HtmlNodeCollection = htmlDocument.DocumentNode.SelectNodes("//p[@class]")

    ' Using LINQ, keep the <p> nodes whose class starts with "thread-plate__summary"
    ' and take their inner text (for example "created by Aazoth")
    Dim links =
        (
            From node
            In nodes
            Let attribute = node.Attributes("class")
            Where attribute.Value.StartsWith("thread-plate__summary")
            Select node.InnerText
        )

    Me.ListBox1a.Items.AddRange(links.ToArray)

    ' Remove duplicates; RemoveAt drops the later copy at index j,
    ' whereas Remove(item) would drop the first equal item it finds
    Dim o, j As Long
    For o = 0 To ListBox1a.Items.Count - 1
        For j = ListBox1a.Items.Count - 1 To (o + 1) Step -1
            If ListBox1a.Items(o) = ListBox1a.Items(j) Then
                ListBox1a.Items.RemoveAt(CInt(j))
            End If
        Next
    Next

    ' Strip the "created by " prefix so only the user name remains
    For i As Integer = 0 To Me.ListBox1a.Items.Count - 1
        Me.ListBox1a.Items(i) = Me.ListBox1a.Items(i).ToString.Replace("created by ", "")
    Next

    For Each s As String In ListBox1a.Items
        Dim lvi As New NetSeal.NSListView
        lvi.Text = s
        NsListView1.Items.Add(lvi.Text)
    Next
End If
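
As a side note, the same idea can be sketched more compactly by matching the class directly in the XPath and letting LINQ handle the de-duplication. This is only a minimal sketch, not the code from the answer: the module name ThreadAuthors, the function ExtractAuthors, and the inline sample HTML are made up for illustration, and it assumes the HtmlAgilityPack package is referenced and that the class attribute is exactly "thread-plate__summary".

```vbnet
Imports System
Imports System.Linq
Imports HtmlAgilityPack

Module ThreadAuthors
    ' Hypothetical helper: returns the distinct author names found in
    ' <p class="thread-plate__summary"> elements of the given HTML.
    Function ExtractAuthors(html As String) As String()
        Const Prefix As String = "created by "

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Match the class in the XPath instead of filtering every <p class> node afterwards.
        Dim nodes As HtmlNodeCollection =
            doc.DocumentNode.SelectNodes("//p[@class='thread-plate__summary']")

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If nodes Is Nothing Then Return New String() {}

        Dim authors = From node In nodes
                      Let summary = node.InnerText.Trim()
                      Where summary.StartsWith(Prefix, StringComparison.Ordinal)
                      Select summary.Substring(Prefix.Length)

        ' Distinct replaces the nested removal loops.
        Return authors.Distinct().ToArray()
    End Function

    Sub Main()
        ' Sample markup taken from the question.
        Dim sample As String =
            "<div class=""thread-plate__details"">" &
            "<h3 class=""thread-plate__title"">(S) HexHunter BOW</h3>" &
            "<p class=""thread-plate__summary"">created by Aazoth</p>" &
            "</div>"

        For Each author In ExtractAuthors(sample)
            Console.WriteLine(author)
        Next
    End Sub
End Module
```

With the sample markup from the question, this should print Aazoth. The Nothing check matters on pages where no matching <p> exists, because iterating a Nothing collection throws a NullReferenceException.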

I hope this works for you.

Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/48191940
