Using PHP Simple HTML DOM Parser to find a div with a class and its plain text

Stack Overflow user
Asked 2018-05-05 15:15:04
Answers: 2 · Views: 1.4K · Followers: 0 · Votes: 1

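I want to find the elements with class ft00 that lie between "Work Experience" and "EDUCATION AND TRAINING", and extract the text of those that contain a date. The HTML is given below.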

<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">John@gmail.com</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>

So far I can extract all of the data between "Work Experience" and "EDUCATION AND TRAINING", and that part works fine. The code is as follows:

$fexp = $html->find('p[plaintext^=Work Experience]');
$items = array();
foreach ($fexp as $keye) {
    while ( $keye->nextSibling() ) {
        if ( $keye->nextSibling() == TRUE ) {
            $keye = $keye->nextSibling();
            $varce = $keye->plaintext;
        }
        if ( trim($varce) == "EDUCATION AND TRAINING" ) {
            break;
        }
        //$test[] = $collection;
        $items[] = $varce;
        // echo $varce;
    }
}
var_dump($items);

I'm close, but I can't seem to find a solution. Any help would be greatly appreciated :)


2 Answers

Stack Overflow user

Accepted answer

Posted 2018-05-05 17:04:27

Here is the correct working code:

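$test = array();
$matching = false;
// Only the p.ft00 elements are candidates.
$collection = $html->find('p.ft00');
foreach ($collection as $tkey) {
    $text = trim($tkey->plaintext);
    // Start collecting at the "Work Experience" heading. The comparison is
    // case-insensitive because the markup spells it in mixed case.
    if (strcasecmp($text, 'Work Experience') === 0 || $matching) {
        $test[] = $text;
        $matching = true;
    }
    // Stop once the closing heading has been collected.
    if (strcasecmp($text, 'EDUCATION AND TRAINING') === 0) {
        break;
    }
}
print_r($test);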

Output:

Array
(
    [0] => Work Experience
    [1] => 27 July 2017
    [2] => 19 May 2018
    [3] => EDUCATION AND TRAINING
)
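
Note that the output above still contains the two headings, while the question asks only for the dates between them. Below is a minimal sketch of one way to narrow it down. It swaps Simple HTML DOM for PHP's built-in DOMDocument so that it runs on its own, and it assumes the dates always look like "27 July 2017" (format `j F Y`). The helper name `datesBetween` is hypothetical, not part of any library.

```php
<?php
// Sample markup copied from the question.
$markup = <<<'HTML'
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">John@gmail.com</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
HTML;

/**
 * Hypothetical helper: returns the text of every p.ft00 element that sits
 * between the $start and $end headings and parses as a "j F Y" date.
 */
function datesBetween(string $markup, string $start, string $end): array
{
    $dom = new DOMDocument();
    $dom->loadHTML($markup);
    $xpath = new DOMXPath($dom);

    $dates = [];
    $matching = false;
    foreach ($xpath->query("//p[@class='ft00']") as $p) {
        $text = trim($p->textContent);
        // Stop at the closing heading without collecting it.
        if (strcasecmp($text, $end) === 0) {
            break;
        }
        // Inside the section, keep only entries that parse as a date.
        if ($matching && DateTime::createFromFormat('j F Y', $text) !== false) {
            $dates[] = $text;
        }
        // Switch on after the opening heading, so the heading itself is skipped.
        if (strcasecmp($text, $start) === 0) {
            $matching = true;
        }
    }
    return $dates;
}

$dates = datesBetween($markup, 'Work Experience', 'EDUCATION AND TRAINING');
print_r($dates);
```

With the sample markup this should leave just the two date strings. If the real documents use other date formats, the `createFromFormat` check would need to be adapted.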
Votes: 1

Stack Overflow user

Posted 2018-05-05 15:38:11

Using DOMDocument and DOMXPath, you can do it like this. I have never used Simple HTML DOM, but I assume it supports XPath.

<?php
$dom = new DOMDocument();

$dom->loadHtml('
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">John@gmail.com</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($dom);

$result = [];
$matching  = false;
foreach ($xpath->query("//p[contains(@class, 'ft00') or contains(@class, 'ft02')]/text()") as $p) {
    if ($p->nodeValue === 'Work Experience' || $matching) {
        $result[] = $p->nodeValue;
        $matching = true;
    }
    if ($p->nodeValue === 'EDUCATION AND TRAINING') {
        break;
    }
}

print_r($result);

Result:

Array
(
    [0] => Work Experience
    [1] => 27 July 2017
    [2] => ABC Company
    [3] => 19 May 2018
    [4] => XYZ Company
    [5] => EDUCATION AND TRAINING
)
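
The result above includes the ft02 entries and both headings. If only the ft00 entries strictly between the headings are wanted, the selection can probably be pushed into the XPath expression itself using the `preceding-sibling` and `following-sibling` axes. This is a sketch under the assumption that all the `<p>` elements are siblings (which is the case here once `loadHTML` wraps them in a body) and that each heading text appears exactly as in the sample.

```php
<?php
// Sample markup copied from the question.
$markup = <<<'HTML'
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">John@gmail.com</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
HTML;

$dom = new DOMDocument();
// No LIBXML_HTML_NOIMPLIED here: letting libxml add <html><body> keeps
// every <p> a direct sibling of the others.
$dom->loadHTML($markup);
$xpath = new DOMXPath($dom);

// p.ft00 elements that have the opening heading somewhere before them
// and the closing heading somewhere after them, among their siblings.
$query = "//p[@class='ft00']"
       . "[preceding-sibling::p[normalize-space()='Work Experience']]"
       . "[following-sibling::p[normalize-space()='EDUCATION AND TRAINING']]";

$between = [];
foreach ($xpath->query($query) as $p) {
    $between[] = trim($p->textContent);
}

print_r($between);
```

Because the headings themselves fail one of the two sibling conditions, they drop out without any extra bookkeeping in PHP.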

https://3v4l.org/0nvr4

Votes: 2
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/50190956
