I have been using PHP's DOMDocument extension to find image tags that have no alt attribute or an empty alt attribute. Here is the HTML I am using for testing:
<span style="font-weight:bold;">Blender</span> is an Open Source 3D modelling and animation software.
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource.
Blender is known to be difficult to learn because its interface is very intimiding to a newbie.
But on the other hand, <a href="http://www.blender.org">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference.
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac.
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href="http://www.google.com"" target="_blank">Google</a> for numerous tutorials on Blender.
There are quite some awesome websites dedicated to blender development, such as BlenderGuru.com. <img src="http://www.cochinsquare.com/wp-content/uploads/2010/08/Blender.jpg">
Here is the DOMDocument code I use to find the img tags and add an alt attribute to them:
$dom=new DOMDocument();
$dom->loadHTML($content);
$dom->formatOutput = true;
$imgs = $dom->getElementsByTagName("img");
foreach ($imgs as $img) {
    $alt = $img->getAttribute('alt');
    if ($alt == '') {
        $k_alt = $this->keyword;
    } else {
        $k_alt = $alt;
    }
    $img->setAttribute('alt', $k_alt);
}
$html_mod = preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), array('', '', '', ''), $dom->saveHTML()));
return $html_mod;
This is the HTML I get back:
<span style='"font-weight:bold;"'>Blender</span> is an Open Source 3D modelling and animation software.
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource.
Blender is known to be difficult to learn because its interface is very intimiding to a newbie.
But on the other hand, <a href=""http://www.blender.org"">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference.
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac.
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href=""http://www.google.com""" target='"_blank"'>Google</a> for numerous tutorials on Blender.
There are quite some awesome websites dedicated to blender development, such as BlenderGuru.com.
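<img src=""http://www.cochinsquare.com/wp-content/uploads/2010/08/Blender.jpg"" alt="Blender">
Note the extra quotes (single and double) in the img src, in the anchor tags, and in the span's style attribute.
Please help! I want the returned HTML left untouched, with only the new alt attribute added.
I should also mention that I am using PHP 5.3.2 with the Suhosin patch on Ubuntu 10.04.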
Posted on 2021-07-29 01:09:59
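I finally figured out how to solve this and want to share my solution. To avoid the extra quotes that appear after saveHTML, apply html_entity_decode to the result of saveHTML, for example: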
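// Load the HTML file into a DOMDocument.
$filecontent = file_get_contents('file.html');
$doc = new DOMDocument();
$doc->loadHTML($filecontent);
$xpath = new DOMXPath($doc);
// Select the element whose id attribute is 'bg' (note the @) and change its text.
$node = $xpath->query("//*[@id='bg']")->item(0);
if ($node !== null) {
    $node->nodeValue = 'asd';
}
// Decode the entities that saveHTML() produced, then write the result back.
$filecontent = html_entity_decode($doc->saveHTML());
file_put_contents('file.html', $filecontent);
This way, $filecontent holds the correct HTML without the extra quotes. You're welcome!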
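For the original goal (adding an alt attribute only where it is missing or empty, and getting back just the fragment rather than a full document), something like the following may work. It is a minimal sketch, not a tested fix for the quoting problem above. The function name addMissingAlt and its $keyword parameter are made up for illustration, and it assumes PHP 5.3.6 or later, since it passes a node to saveHTML(). It also assumes the input is plain ASCII or that the encoding is handled elsewhere.

```php
<?php
// Hypothetical helper (not from the original post): fills in missing or
// empty alt attributes and returns only the HTML fragment that was passed in.
function addMissingAlt($content, $keyword)
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings
    // for imperfect real-world markup.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so it can be located again after parsing.
    $dom->loadHTML('<div>' . $content . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // getAttribute() returns '' when the attribute is absent.
        if (trim($img->getAttribute('alt')) === '') {
            $img->setAttribute('alt', $keyword);
        }
    }

    // The wrapper is the first <div> in document order. Serialize its
    // children one by one so no doctype, <html>, or <body> is emitted.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $html = '';
    foreach ($wrapper->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }
    return $html;
}
```

Calling addMissingAlt($content, 'Blender') would then stand in for the preg_replace/str_replace cleanup in the question, because only the wrapper's children are serialized.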
https://stackoverflow.com/questions/7117214