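I am looking for a way to extract specific URLs from HTML like the sample below. The only unique identifier here is the value of the data-spec-code attribute, for example PROROC and KROROC.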
<section data-spec-code="PROROC" only-child="">
<div class="test-class">
<div only-child="" class=" col-sm-4 col-md-3 col-lg-3 show-top-border hidden-xs tech-spec-title-container stack-0">
<div class="test-class-title">
<h5 class="top-offset-10 bottom-offset-0 force-bold-font"><span>Data </span></h5>
</div>
</div>
<div only-child="" class=" col-sm-4 col-md-3 col-lg-3 show-top-border hidden-xs tech-spec-title-container stack-1">
<div class="test-class-title">
<!----><em> <span class="hidden">Data </span></em>
</div>
</div>
<div only-child="" class=" col-sm-4 col-md-3 col-lg-3 show-top-border hidden-xs tech-spec-title-container stack-2">
<div class="test-class-title">
<!----><em> <span class="hidden">Data </span></em>
<!----><small class="help-me-choose-link helpmechoosestyle"><a href="//www.url-i-want-to-extract.com" target="_blank">URL 1</a></small></div>
</div>
</div>
</section>
<section data-spec-code="KROROC" only-child="">
<div class="test-class">
<div only-child="" class=" col-sm-4 col-md-3 col-lg-3 show-top-border hidden-xs tech-spec-title-container stack-0">
<div class="test-class-title">
<h5 class="top-offset-10 bottom-offset-0 force-bold-font"><span>Data 2</span></h5>
</div>
</div>
<div only-child="" class=" col-sm-4 col-md-3 col-lg-3 show-top-border hidden-xs tech-spec-title-container stack-1">
<div class="test-class-title">
<!----><em> <span class="hidden">Data 2</span></em>
</div>
</div>
<div only-child="" class=" col-sm-4 col-md-3 col-lg-3 show-top-border hidden-xs tech-spec-title-container stack-2">
<div class="test-class-title">
<!----><em> <span class="hidden">Data 2</span></em>
<!----><small class="help-me-choose-link helpmechoosestyle"><a href="//www.2nd-url-i-want-to-extract.com" target="_blank">URL 2</a></small></div>
</div>
</div>
</section>
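I wrote some code based on research on Stack Overflow and Google, but I could only extract every link on the page or use the getElementsBy* methods.
I can't use those options because the hyperlink is nested inside other tags and the page has too many hyperlinks. I also tried querySelector, but it failed.
I am hoping to get some suggestions or guidance on how to achieve this.
Here is my expected result: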
PROROC www.url-i-want-to-extract.com
KROROC www.2nd-url-i-want-to-extract.com
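Posted on 2019-09-19 02:16:02
In addition to your description, it would help to see your actual code.
You can start with attribute selectors to target the elements that have those attribute=value pairs, then grab the a tags nested inside them: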
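Dim i As Long
' ie is assumed to be an InternetExplorer object that has finished loading the page
With ie.document.querySelectorAll("[data-spec-code=PROROC] a, [data-spec-code=KROROC] a")
    ' Walk the matched <a> elements by index and print each link target
    For i = 0 To .Length - 1
        Debug.Print .Item(i).href
    Next
End With

https://stackoverflow.com/questions/57996526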
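The expected result in the question pairs each data-spec-code value with its link, while the snippet above prints only the links. A minimal sketch of one way to get that pairing follows. It assumes `ie` is an InternetExplorer object that has already finished loading the page, and that the codes of interest are known in advance. `PrintSpecCodeLinks` and `StripScheme` are hypothetical helper names used only for illustration, not part of the original answer.

```vba
Option Explicit

' Prints each spec code followed by the link found inside its <section>.
' Assumption: ie is an InternetExplorer object with the page fully loaded.
Public Sub PrintSpecCodeLinks(ByVal ie As Object)
    Dim codes As Variant
    Dim code As Variant
    Dim link As Object

    ' Assumption: the codes to look up are known ahead of time.
    codes = Array("PROROC", "KROROC")

    For Each code In codes
        ' First <a> nested anywhere inside the element with this code.
        Set link = ie.document.querySelector("[data-spec-code='" & code & "'] a")
        If Not link Is Nothing Then
            Debug.Print code & " " & StripScheme(link.href)
        End If
    Next code
End Sub

' Hypothetical helper: drops everything up to and including the first "//".
Private Function StripScheme(ByVal url As String) As String
    Dim pos As Long
    pos = InStr(url, "//")
    If pos > 0 Then
        StripScheme = Mid$(url, pos + 2)
    Else
        StripScheme = url
    End If
End Function
```

Looking up one code at a time keeps the code and its link together, at the cost of one querySelector call per code. The exact text returned by href depends on how the browser resolves the protocol-relative URL, so the printed value may need further trimming to match the expected result exactly.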