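I'm trying to extract the data-src and data-srcset attributes from several image strings in PHP. Both attributes are optional, which means a tag may have neither, only data-src, only data-srcset, or both. My regex is: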
<img(.*?)data-src=['\"](.*?)['\"].*?|(data-srcset=['\"](.*?)['\"])?\/>
The string I'm testing against is:
<li class="blocks-gallery-item">
<figure>
<img data-src="http://localhost:3000/wp-content/uploads/2018/11/detektivhut.gif" alt="" data-id="1037" data-link="http://localhost:3000/detektivhut/" class="wp-image-1037"/>
</figure>
</li>
<li class="blocks-gallery-item">
<figure>
<img data-src="http://localhost:3000/wp-content/uploads/2018/11/DSC04828.png" alt="" data-id="948" data-link="http://localhost:3000/dsc04828-2/" class="wp-image-948" data-srcset="//localhost:3000/wp-content/uploads/2018/11/DSC04828.png 1067w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-200x300.png 200w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-768x1152.png 768w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-683x1024.png 683w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-1000x1500.png 1000w" sizes="(max-width: 1067px) 100vw, 1067px" />
</figure>
</li>
<li class="blocks-gallery-item">
<figure>
<img data-src="http://localhost:3000/wp-content/uploads/2018/11/DSC04831.png" alt="" data-id="883" data-link="http://localhost:3000/2018/11/13/single-page-style-1/dsc04831-2/" class="wp-image-883" data-srcset="//localhost:3000/wp-content/uploads/2018/11/DSC04831.png 1067w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-200x300.png 200w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-768x1152.png 768w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-683x1024.png 683w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-1000x1500.png 1000w" sizes="(max-width: 1067px) 100vw, 1067px" />
</figure>
</li>
But it is too greedy. See here:
https://regex101.com/r/vDQE3C/1
Any help (ideally with the reasoning behind it) is much appreciated.
Answered 2018-12-04 15:32:27
Don't use regex to parse HTML. It's better to use a DOM parser, like this:
$html = <<< EOF
<li class="blocks-gallery-item">
<figure>
<img data-src="http://localhost:3000/wp-content/uploads/2018/11/detektivhut.gif" alt="" data-id="1037" data-link="http://localhost:3000/detektivhut/" class="wp-image-1037"/>
</figure>
</li>
<li class="blocks-gallery-item">
<figure>
<img data-src="http://localhost:3000/wp-content/uploads/2018/11/DSC04828.png" alt="" data-id="948" data-link="http://localhost:3000/dsc04828-2/" class="wp-image-948" data-srcset="//localhost:3000/wp-content/uploads/2018/11/DSC04828.png 1067w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-200x300.png 200w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-768x1152.png 768w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-683x1024.png 683w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-1000x1500.png 1000w" sizes="(max-width: 1067px) 100vw, 1067px" />
</figure>
</li>
<li class="blocks-gallery-item">
<figure>
<img data-src="http://localhost:3000/wp-content/uploads/2018/11/DSC04831.png" alt="" data-id="883" data-link="http://localhost:3000/2018/11/13/single-page-style-1/dsc04831-2/" class="wp-image-883" data-srcset="//localhost:3000/wp-content/uploads/2018/11/DSC04831.png 1067w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-200x300.png 200w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-768x1152.png 768w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-683x1024.png 683w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-1000x1500.png 1000w" sizes="(max-width: 1067px) 100vw, 1067px" />
</figure>
</li>
EOF;
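// loadHTML() is an instance method; calling it statically fails on PHP 8.
$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$images = $xpath->evaluate("//img");
foreach ($images as $img) {
    // Each attribute is optional, so check for it before printing.
    if (($el = $img->attributes->getNamedItem('data-src')) !== null) {
        echo 'data-src=' . $el->nodeValue . "\n";
    }
    if (($el = $img->attributes->getNamedItem('data-srcset')) !== null) {
        echo 'data-srcset=' . $el->nodeValue . "\n";
    }
}
Output: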
data-src=http://localhost:3000/wp-content/uploads/2018/11/detektivhut.gif
data-src=http://localhost:3000/wp-content/uploads/2018/11/DSC04828.png
data-srcset=//localhost:3000/wp-content/uploads/2018/11/DSC04828.png 1067w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-200x300.png 200w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-768x1152.png 768w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-683x1024.png 683w, //localhost:3000/wp-content/uploads/2018/11/DSC04828-1000x1500.png 1000w
data-src=http://localhost:3000/wp-content/uploads/2018/11/DSC04831.png
data-srcset=//localhost:3000/wp-content/uploads/2018/11/DSC04831.png 1067w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-200x300.png 200w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-768x1152.png 768w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-683x1024.png 683w, //localhost:3000/wp-content/uploads/2018/11/DSC04831-1000x1500.png 1000w
Answered 2018-12-04 15:34:18
You just need to account for whatever sits between the data-* attributes and the closing /> of the image tag. You need another (.*?).
<img(.*?)data-src=['\"](.*?)['\"].*?data-srcset=['\"](.*?)['\"](.*?)\/>
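To see how the pattern above behaves from PHP, here is a minimal sketch using preg_match_all. The shortened sample markup and the variable names are assumptions made for illustration; they are not taken from the original post.

```php
<?php
// Hypothetical, shortened input modelled on the question's markup.
$html = <<<'HTML'
<img data-src="a.gif" alt="" class="wp-image-1"/>
<img data-src="b.png" alt="" class="wp-image-2" data-srcset="b.png 1067w, b-200.png 200w" sizes="100vw" />
HTML;

// The pattern from this answer, wrapped in ~ delimiters so "/" needs no escaping.
$pattern = '~<img(.*?)data-src=[\'"](.*?)[\'"].*?data-srcset=[\'"](.*?)[\'"](.*?)/>~';

// PREG_SET_ORDER yields one array per matched tag: [full match, group 1, group 2, ...].
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo 'data-src=' . $m[2] . "\n";    // group 2: data-src value
    echo 'data-srcset=' . $m[3] . "\n"; // group 3: data-srcset value
}
```

With this input only the second tag matches. The pattern requires both attributes, in this order, on a single line (there is no s modifier, so . does not cross newlines), so the first image, which has no data-srcset, is skipped.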
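If you only want to capture the data-* attributes, consider using non-capturing groups as shown below. That way the $1 and $2 variables contain only the data you need, rather than the whole image tag.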
<img(?:.*?)data-src=['\"](.*?)['\"].*?data-srcset=['\"](.*?)['\"](?:.*?)\/>
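The patterns in this answer expect both attributes to be present, whereas the question says each one is optional. If a regex-only approach is still preferred, one possible workaround is to split the job into two steps: isolate each img tag, then look for each attribute separately. The sketch below is an illustration, not part of the original answer. The function name extractLazyAttributes is hypothetical, and it assumes that no attribute value contains a > character.

```php
<?php
// Hypothetical helper: collect the optional data-src / data-srcset
// values of every <img> tag found in $html.
function extractLazyAttributes(string $html): array
{
    $results = [];
    // Step 1: isolate each <img ...> tag (assumes no '>' inside attribute values).
    preg_match_all('~<img\b[^>]*>~i', $html, $tags);
    foreach ($tags[0] as $tag) {
        $entry = ['data-src' => null, 'data-srcset' => null];
        foreach (array_keys($entry) as $name) {
            // Step 2: search for one attribute at a time, so each stays optional.
            // \1 makes the closing quote match the opening one.
            $pattern = '~\s' . preg_quote($name, '~') . '=(["\'])(.*?)\1~i';
            if (preg_match($pattern, $tag, $m)) {
                $entry[$name] = $m[2];
            }
        }
        $results[] = $entry;
    }
    return $results;
}

// Hypothetical sample covering the "none", "only data-src" and "both" cases.
$html = '<img alt="" class="a"/>' . "\n"
      . '<img data-src="one.gif" alt=""/>' . "\n"
      . '<img data-src="two.png" data-srcset="two.png 1067w, two-200.png 200w" />';

$found = extractLazyAttributes($html);
print_r($found);
```

Each entry in the returned array corresponds to one img tag, with null for any attribute that is missing. A DOM parser remains the more robust choice for arbitrary markup.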
https://stackoverflow.com/questions/53615963