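I'm trying to extract query string values from an HTML document. It contains many anchor links with a query string parameter named id. I want to get all of the ids in a single comma-separated string. How can I do this? So I want to get: Result = {1,2,3,4,5}
VB.NET code: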
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    Dim str As String() = GetParagraphs(System.IO.File.ReadAllText(Server.MapPath("TextFile1.html")))
    Response.Write(str)
End Sub
Private Shared Function GetParagraphs(ByVal data As String) As String()
    Dim result As New List(Of String)
    Dim m As Match = Regex.Match(data, "http://mywebsite.com/mydetails.aspx?id")
    While (m.Success)
        result.Add(m.Value)
        m = m.NextMatch()
    End While
    Return result.ToArray()
End Function
TextFile.html
<a href="http://mywebsite.com/mydetails.aspx?id=1"
target="_blank"></a>
<a href="http://mywebsite.com/mydetails.aspx?id=2"
target="_blank"></a>
<a href="http://mywebsite.com/mydetails.aspx?id=3"
target="_blank"></a>
<a href="http://mywebsite.com/mydetails.aspx?id=4"
target="_blank"></a>
<a href="http://mywebsite.com/mydetails.aspx?id=5"
target="_blank"></a>
Answered on 2011-12-05 10:20:16
You can use this modified version of your GetParagraphs method:
Private Shared Function GetParagraphs(ByVal data As String) As String()
    Dim result As New List(Of String)
    ' Define what we are looking for
    Const MY_MATCH As String = "http://mywebsite.com/mydetails.aspx?id="
    ' Escape regex metacharacters (the "." and "?") so the URL is matched literally
    Dim m As Match = Regex.Match(data, Regex.Escape(MY_MATCH))
    While (m.Success)
        Dim wStartIndex As Integer
        Dim wEndIndex As Integer
        ' Jump to the end of the found string
        wStartIndex = m.Index + MY_MATCH.Length
        ' Now find the closing quote that ends the href value
        wEndIndex = data.IndexOf("""", wStartIndex)
        ' If we found something
        If wEndIndex <> -1 Then
            ' Extract the value from the string
            result.Add(data.Substring(wStartIndex, wEndIndex - wStartIndex))
        End If
        m = m.NextMatch()
    End While
    Return result.ToArray()
End Function
https://stackoverflow.com/questions/8380236
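The method above returns the ids as an array, while the question asks for one comma-separated string. A minimal sketch of one way to get there, assuming the ids are always numeric and the links always use this exact URL prefix: capture the id with a regex group and join the results with String.Join. The module name IdExtractor and the helper ExtractIds are hypothetical names used only for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module IdExtractor
    ' Hypothetical helper: returns every id found in the matching hrefs,
    ' joined with commas. Assumes the ids consist of digits only.
    Function ExtractIds(ByVal html As String) As String
        ' Regex.Escape makes the "." and "?" in the URL literal;
        ' (\d+) captures the digits that follow "id=".
        Dim pattern As String = Regex.Escape("http://mywebsite.com/mydetails.aspx?id=") & "(\d+)"
        Dim ids As New List(Of String)
        For Each m As Match In Regex.Matches(html, pattern)
            ' Groups(1) is the captured id; Groups(0) would be the whole match
            ids.Add(m.Groups(1).Value)
        Next
        Return String.Join(",", ids.ToArray())
    End Function

    Sub Main()
        Dim sample As String = "<a href=""http://mywebsite.com/mydetails.aspx?id=1""></a>" & _
                               "<a href=""http://mywebsite.com/mydetails.aspx?id=2""></a>"
        Console.WriteLine(ExtractIds(sample)) ' prints 1,2
    End Sub
End Module
```

In the page from the question, the same call could be applied to the file contents and the resulting string passed to Response.Write, rather than writing the array itself.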