From the HTML at https://www.topazlabs.com/downloads, I want to extract the ReMask version number as a string: v5.0.1
Something like this:
->finder->query("//div[contains(@class, 'wpb_wrapper')]/.//a[text()[contains(.,'Topaz ReMask')]]/../../../div");
OR
...->finder->query("//div[contains(@class, 'wpb_wrapper')]//a[text()[contains(.,'Topaz ReMask')]]/../../../div");
It works, but can it be simplified?
The relevant part of the page's HTML is as follows:
...
<div class="wpb_wrapper">
<div class="vc_empty_space" style="height: 20px">
<span class="vc_empty_space_inner">
</span>
</div>
<div id="mpc_textblock-975b2251c2a82c7" class="mpc-textblock mpc-init mpc-typography--preset_2 ">
<p>
<a href="/remask" target="blank">Topaz ReMask</a>
</p>
</div>
<div class="mpc-tooltip-wrap" data-id="mpc_textblock-615b2251c2a8c4a">
<div id="mpc_textblock-615b2251c2a8c4a" class="mpc-textblock mpc-init mpc-typography--preset_0 ">
<p>
<em>v5.0.3 (Mac) / v5.0.1 (Win)
</em>
</p>
</div>
<div id="mpc_tooltip-925b2251c2a8d2f" class="mpc-tooltip mpc-init mpc-typography--preset_4 mpc-position--left mpc-can-hover mpc-trigger--hover ">Mac Updated November 4, 2016
<br>Windows Updated November 21, 2016
<div class="mpc-arrow">
</div>
</div>
</div>
<div id="mpc_textblock-475b2251c2a9601" class="mpc-textblock mpc-init ">
<p>The quickest and easiest way to mask your photo.
</p>
</div>
</div>
...

Posted on 2018-08-02 08:55:55
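You can base the lookup on the text content. With DOMXpath::evaluate() you can fetch a string directly. The example below reads the Windows update date: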
// $html holds the page source (see the excerpt in the question)
$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXpath($document);
// substring-after() returns a string, so evaluate() yields it directly
$expression = "substring-after(
//div[contains(.//p, 'Topaz ReMask')]//text()[starts-with(., 'Windows Updated ')],
'Windows Updated '
)";
var_dump($xpath->evaluate($expression));

Output:
string(24) "November 21, 2016
"

The XPath expression
Fetch any div whose p descendant contains 'Topaz ReMask':
//div[contains(.//p, 'Topaz ReMask')]

Fetch the descendant text nodes that start with 'Windows Updated ':
//div[contains(.//p, 'Topaz ReMask')]//text()[starts-with(., 'Windows Updated ')]

Extract the text after 'Windows Updated ':
substring-after(
//div[contains(.//p, 'Topaz ReMask')]//text()[starts-with(., 'Windows Updated ')],
'Windows Updated '
)

https://stackoverflow.com/questions/50856755
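The same text-based approach can be pointed at the version string the question asks for. The following is a minimal sketch, not a tested solution for the live page: it assumes the markup matches the excerpt in the question (reproduced here in trimmed form as `$html`), that the version sits in the only `em` of the ReMask block, and that the Windows version always follows "/ " and precedes " (Win)".

```php
<?php
// Trimmed copy of the markup from the question (assumption: the live
// page may be structured differently).
$html = <<<'HTML'
<div class="wpb_wrapper">
  <div class="mpc-textblock"><p><a href="/remask" target="blank">Topaz ReMask</a></p></div>
  <div class="mpc-tooltip-wrap">
    <div class="mpc-textblock"><p><em>v5.0.3 (Mac) / v5.0.1 (Win)
</em></p></div>
    <div class="mpc-tooltip">Mac Updated November 4, 2016
<br>Windows Updated November 21, 2016
<div class="mpc-arrow"></div></div>
  </div>
</div>
HTML;

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Take the <em> text of the ReMask block, keep what follows "/ " and
// precedes " (Win)"; normalize-space() trims stray whitespace.
$expression = "normalize-space(
  substring-before(
    substring-after(
      //div[contains(.//p, 'Topaz ReMask')]//em,
      '/ '
    ),
    ' (Win)'
  )
)";

$version = $xpath->evaluate($expression);
var_dump($version);
```

With the trimmed markup above this prints `string(6) "v5.0.1"`. Wrapping the date expression from the answer in `normalize-space()` in the same way would also drop the trailing newline seen in its output.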