I have a string containing several <html><body><div>Content</div></body></html> documents. I want to merge all of their content into a single valid structure. For example:
<html><body><div>Content</div></body></html>
<html><body><div>Content</div></body></html>
<html><body><div>Content</div></body></html>
should become:
<html>
<body>
<div>Content</div>
<div>Content</div>
<div>Content</div>
</body>
</html>
My current code is as follows:
libxml_use_internal_errors(true);
$newDom = new DOMDocument();
$newBody = "";
$newDom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'));
$bodyTags = $newDom->getElementsByTagName("body");
foreach($bodyTags as $body) {
    $newBody .= $newDom->saveHTML($body);
}
$newBody now contains all of the body tags:
<body><div>Content</div></body>
<body><div>Content</div></body>
<body><div>Content</div></body>
How can I save only the inner HTML of each body tag in $newBody?
Edit:
Based on @NigelRen's answer, this is my solution:
libxml_use_internal_errors(true);
$newDom = new DOMDocument();
$newBody = '';
$newDom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'));
$bodyTags = $newDom->getElementsByTagName("body");
foreach($bodyTags as $body) {
    foreach ($body->childNodes as $node) {
        $newBody .= $newDom->saveHTML($node);
    }
}
$newDom = new DOMDocument();
$newDom->loadHTML(mb_convert_encoding($newBody, 'HTML-ENTITIES', 'UTF-8'));
$newBody = $newDom->saveHTML();
Answered on 2020-02-11 07:38:40
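This is awkward because when you use loadHTML(), it tries to repair the HTML in the original document. The resulting structure is not necessarily the one you would expect.
However, if you have a basic outline of the document, the following copies the contents of the <body> tags into a new document (see the comments in the code).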
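To see what the parser actually built, it can help to print an outline of the element tree together with the errors libxml collected while repairing the input. The following is only a minimal sketch, assuming ASCII input so the encoding conversion is left out; dumpTree is a hypothetical helper name, not something from the answer.

```php
<?php
// Hypothetical helper: return an indented outline of the element nodes
// found below $node (text nodes, comments and the doctype are skipped).
function dumpTree(DOMNode $node, int $depth = 0): string
{
    $out = '';
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        if ($child instanceof DOMElement) {
            $out .= str_repeat('  ', $depth) . $child->nodeName . "\n";
            $out .= dumpTree($child, $depth + 1);
        }
    }
    return $out;
}

$html = '<html><body><div>Content1</div></body></html>
<html><body><div>Content2</div></body></html>';

// Collect parser complaints instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$outline = dumpTree($dom);
echo $outline;

// Each collected error describes something the parser had to work around.
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), "\n";
}
libxml_clear_errors();
```

The exact outline and messages depend on the libxml version in use, so it is worth running this against your own input before relying on a particular structure.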
$html = '<html><body><div>Content1</div></body></html>
<html><body><div>Content2</div></body></html>
<html><body><div>Content3</div></body></html>';
libxml_use_internal_errors(true);
$newDom = new DOMDocument();
// New document with final code
$newBody = new DOMDocument();
$newDom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'));
// Set up basic template for new document
$newBody->loadHTML("<html><body /></html>");
// Find where to add any new content
$addBody = $newBody->getElementsByTagName("body")[0];
// Find the existing content to add
$bodyTags = $newDom->getElementsByTagName("body");
foreach($bodyTags as $body) {
    // Add all of the contents of the <body> tag into the new document
    foreach ( $body->childNodes as $node ) {
        // Import the node to copy to the new document and add it in
        $addBody->appendChild($newBody->importNode($node, true));
    }
}
echo $newBody->saveHTML();
This gives...
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><div>Content1</div><div>Content2</div><div>Content3</div></body></html>
The limitation is that anything outside the <body> tags, and any attributes on the <body> tags, will not be preserved.
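If the <body> attributes do matter, one possible extension is to copy them across as well. The following is a minimal sketch rather than part of the original answer: mergeBodies is a hypothetical helper name, the "first value seen wins" rule for duplicate attributes is an assumption, and the input is assumed to be ASCII so the encoding conversion step is omitted.

```php
<?php
// Hypothetical helper (not from the answer): merge the contents of every
// <body> found in $html into one new document, keeping <body> attributes.
function mergeBodies(string $html): string
{
    libxml_use_internal_errors(true);

    // Parse the combined input; libxml repairs it as best it can.
    $source = new DOMDocument();
    $source->loadHTML($html);

    // Basic template that receives the merged content.
    $target = new DOMDocument();
    $target->loadHTML('<html><body></body></html>');
    $targetBody = $target->getElementsByTagName('body')->item(0);

    foreach ($source->getElementsByTagName('body') as $body) {
        // Copy attributes; assumption: the first value seen for a name wins.
        foreach ($body->attributes as $attr) {
            if (!$targetBody->hasAttribute($attr->name)) {
                $targetBody->setAttribute($attr->name, $attr->value);
            }
        }
        // Deep-copy each child node into the target document.
        foreach ($body->childNodes as $child) {
            $targetBody->appendChild($target->importNode($child, true));
        }
    }

    libxml_clear_errors();
    return $target->saveHTML();
}

$html = '<html><body class="page"><div>Content1</div></body></html>
<html><body><div>Content2</div></body></html>
<html><body><div>Content3</div></body></html>';

$merged = mergeBodies($html);
echo $merged;
```

Content outside the <body> tags, such as anything in <head>, is still dropped by this sketch.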
Answered on 2020-02-11 07:14:46
You can do this by putting the HTML inside your PHP code. You can write the code like this:
<?php
echo '<html><body><div>Content</div></body></html>';
*PHP code to be executed...*
?>
https://stackoverflow.com/questions/60163537