I am trying to parse the data returned by a request so that I can add the result links to a ListBox. This is the HTML I am trying to split:
<div class="rc" data-hveid="411"><h3 class="r"><a href="http://google.com/" onmousedown="return rwt
<div class="rc" data-hveid="48"><h3 class="r"><a href="http://google2.com/" onmousedown="return rwt
This is just an example; there are many more like these.
This is my code. It runs, but the result is not correct.
Dim request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("https://www.google.ro/search?q=Google")
Dim response As System.Net.HttpWebResponse = request.GetResponse
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
Dim rssourcecode As String = sr.ReadToEnd
Dim pp As String = rssourcecode
Dim strRegex As String = "><a href="".*"""
Dim myRegex As New Regex(strRegex, RegexOptions.None)
For Each myMatch As Match In myRegex.Matches(pp)
    If myMatch.Success Then
        ListBox1.Items.Add(myMatch.Value.Split("""").GetValue(1))
    End If
Next
This is the output: http://prntscr.com/9u000g/direct
Please help! I just want to get the first 5-6 site links that Google shows on the first page.
Example: https://www.google.com/search?q=Google
Output: 1. https://www.google.com/
Answered on 2016-01-24 16:39:54
As I understand it, you want to get every link in the variable rssourcecode, that is, anything between (href=") and (").
Try the following code:
Dim request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("https://www.google.ro/search?q=Google")
Dim response As System.Net.HttpWebResponse = request.GetResponse
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
Dim rssourcecode As String = sr.ReadToEnd
Dim MC As MatchCollection = Regex.Matches(rssourcecode, "href=""(.*?)""")
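' Groups(1) holds only the text captured by (.*?), i.e. the URL without the surrounding href="..."
For i As Integer = 0 To MC.Count - 1
    MsgBox(MC(i).Groups(1).Value)
Next
Edit: you can use this pattern to get anything between (/url?q=) and (&):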
/url\?q=(.*?)&
There is a \ in front of the ? because "?" is a special regex symbol; you can escape a special symbol by putting \ in front of it.
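To show how that pattern could be applied to the original goal (only the first 5-6 links), here is a minimal sketch. The names LinkExtractor and ExtractResultLinks are hypothetical, and the sample markup in Main is made up for illustration; it assumes the page source wraps each result as /url?q=<target>&..., which may not match what Google actually returns.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LinkExtractor
    ' Hypothetical helper: returns at most maxCount targets found
    ' between "/url?q=" and the next "&" in the given HTML.
    Function ExtractResultLinks(html As String, maxCount As Integer) As List(Of String)
        Dim links As New List(Of String)
        ' \? matches a literal question mark; (.*?) is lazy, so it stops at the first "&".
        For Each m As Match In Regex.Matches(html, "/url\?q=(.*?)&")
            If links.Count >= maxCount Then Exit For
            ' The captured target may be percent-encoded, so decode it.
            links.Add(Uri.UnescapeDataString(m.Groups(1).Value))
        Next
        Return links
    End Function

    Sub Main()
        ' Made-up sample markup standing in for rssourcecode.
        Dim sample As String =
            "<a href=""/url?q=http://google.com/&amp;sa=U"">A</a>" &
            "<a href=""/url?q=http://google2.com/&amp;sa=U"">B</a>"
        For Each link As String In ExtractResultLinks(sample, 5)
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

In the form from the question, you would pass rssourcecode instead of the sample string and call ListBox1.Items.Add(link) inside the loop instead of Console.WriteLine.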
https://stackoverflow.com/questions/34970662