如果a或<href>不以www、http或<a>或<img>标记开头,我希望从$string_1中删除特定的<src>和<href>标记。
例如,通过删除以下内容将$string_1转换为$string_2:
<img src="/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/>和
<a href="/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth ></a>因为src和href标记不是以http、https或www开头的。
$string_1 = '
<div class="mainpost"><p><img src="/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/></p>
<div class="mainpost"><p><img src="http://www.domain.com/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/></p>
<p><a href="http://domain.com/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth ></a></p>
<p>Photography Business Growth | With a world wide recession, photographers and small business owners are forced, more than ever, to think creatively, to think differently and outside of the box. With very little or no money to invest in your business, can you move forward? How can you build your brand and make sure to get happier, paying clients through your door?<br/><span id="more-609494"/></p>
<p>If you take good shots it doesn’t mean you’ll gain success and popularity among customers. For those of you who have survived start=up and built successful brands, you may be wondering which step to take next to grow your business beyond its current status. There are numerous possibilities, some of which we’ll outline here. You need to know how to sell yourself well! Everything is quite simple and you can do it yourself.</p>
<p><a href="/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth ></a></p>
';
$string_2= '
<div class="mainpost"><p></p>
<div class="mainpost"><p><img src="http://www.domain.com/wp-content/uploads/2014/06/photography-business-2.jpg" alt="photography business growth 1 650x430 6 Simple Ways To Help Grow Your Photography Business" width="650" height="430" class="alignnone size-large wp-image-609513" title="6 Simple Ways To Help Grow Your Photography Business"/></p>
<p><a href="http://domain.com/photography-business-growth/" rel="nofollow">Read more about Photography Business Growth ></a></p>
<p>Photography Business Growth | With a world wide recession, photographers and small business owners are forced, more than ever, to think creatively, to think differently and outside of the box. With very little or no money to invest in your business, can you move forward? How can you build your brand and make sure to get happier, paying clients through your door?<br/><span id="more-609494"/></p>
<p>If you take good shots it doesn’t mean you’ll gain success and popularity among customers. For those of you who have survived start=up and built successful brands, you may be wondering which step to take next to grow your business beyond its current status. There are numerous possibilities, some of which we’ll outline here. You need to know how to sell yourself well! Everything is quite simple and you can do it yourself.</p>
';你能帮我解决这个问题吗?谢谢
发布于 2014-06-24 16:11:34
这里是PHP的第一种方法。它适用于示例数据。在$string_2是尾声“失踪”。
$string_3 = $string_1;
$pattern = "([^wh]|w[^w]|ww[^w]|h[^t]|ht[^t]|htt[^p])";
$string_3 = preg_replace("/<img src=\"".$pattern."[^>]*>/","",$string_3);
$string_3 = preg_replace("/<a href=\"".$pattern."[^>]*>[^<]*<\/a>/","",$string_3);发布于 2014-06-24 16:16:22
为此,我将使用DOM解析器。有了DOM文档,您可以使用XPath来选择所需的元素。
# Parse the HTML snippet into a DOM document
$doc = new DOMDocument();
$doc->loadHTML($string_1);
# Create an XPath selector
$selector = new DOMXPath($doc);
# Define the XPath query
# The syntax highlighter messed this up. Take it as it is!
$query = <<<EOF
//a[not(starts-with(@href, "http"))
and not(starts-with(@href, "www"))]
| //img[not(starts-with(@src, "http"))
and not(starts-with(@src, "www"))]
EOF;
# Issue the XPath query and remove every resulting node
foreach($selector->query($query) as $node) {
$node->parentNode->removeChild($node);
}
# Write back the modified `<div>` element into a string
echo $doc->saveHTML(
$selector->query('//div[@class="mainpost"]')->item(0)
);发布于 2014-06-24 15:57:01
一种解决方案是在前端使用Javascript进行此操作。如果这不是一个选项,您可以查看一个PHP库来解析和遍历DOM,例如http://simplehtmldom.sourceforge.net
https://stackoverflow.com/questions/24390969
复制相似问题