DomCrawler: select only paragraphs

Stack Overflow user
Asked 2017-10-17 12:44:18
Answers: 1 · Views: 389 · Followers: 0 · Votes: 0

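Using the DomCrawler/Goutte Symfony components, I want to extract only the paragraphs inside each .pertanyaan element that come before the .listjawaban elements.

Is there a way to do this? I am using $crawler->filter('.pertanyaan p')->eq($i)->html(), but it only gives me the first paragraph, because $i is the n-th position of the .pertanyaan class.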

<div class="pertanyaan"><p></p>
<p>Karena mengalami mutasi, kromosom mengalami perubahan seperti pada gambar di bawah.</p>
<p><img src="http://indocademy.com/images/ipa_2013_133/53_1.png" alt=""><br>Jenis mutasi tersebut adalah ....</p>
<p></p>
<div class="listjawaban">
<div class="radiojawaban">
    <input type="radio" name="answer_dup_758" id="answer_dup_758_A" value="A" style="display:none" disabled=""><input type="radio" name="answer_758" id="answer_758_A" value="A" onclick="showbutton(758);">A. 
</div>
<div class="pilihanjawaban">
    adisi
</div>
</div>
<div class="listjawaban">
<div class="radiojawaban">
    <input type="radio" name="answer_dup_758" id="answer_dup_758_B" value="B" style="display:none" disabled=""><input type="radio" name="answer_758" id="answer_758_B" value="B" onclick="showbutton(758);">B. 
</div>
<div class="pilihanjawaban">
    delesi
</div>
</div>
<div class="listjawaban">
<div class="radiojawaban">
    <input type="radio" name="answer_dup_758" id="answer_dup_758_C" value="C" style="display:none" disabled=""><input type="radio" name="answer_758" id="answer_758_C" value="C" onclick="showbutton(758);">C. 
</div>
<div class="pilihanjawaban">
    inversi
</div>
</div>
<div class="listjawaban">
<div class="radiojawaban">
    <input type="radio" name="answer_dup_758" id="answer_dup_758_D" value="D" style="display:none" disabled=""><input type="radio" name="answer_758" id="answer_758_D" value="D" onclick="showbutton(758);">D. 
</div>
<div class="pilihanjawaban">
    duplikasi
</div>
</div>
<div class="listjawaban">
<div class="radiojawaban">
    <input type="radio" name="answer_dup_758" id="answer_dup_758_E" value="E" style="display:none" disabled=""><input type="radio" name="answer_758" id="answer_758_E" value="E" onclick="showbutton(758);">E. 
</div>
<div class="pilihanjawaban">
    translokasi
</div>
</div>

<div class="buttons">
<input type="button" class="tombol_jawab" id="tombol_jawab_758" value="Jawab" style="display:none" onclick="executejawaban(758,&quot;http://indocademy.com&quot;)"><input type="button" class="tombol_clear" id="tombol_clear_758" value="Hapus" style="display:none" onclick="clearjawaban(758)">
</div>

<div class="kunci" id="kunci_758" style="display: none">
<div class="tulisanjawab abu">
<input type="button" id="tombol_kunci" value="+" class="jawaban_758" onclick="showkunci(this)">
Jawaban : <img id="loading_758" src="http://indocademy.com/images/loading.gif" style="height:12px;vertical-align:middle">
<span id="hasil_758"> </span>
</div>
<div class="konten_kunci">
<div class="konten_jawaban_758" id="isi_jawaban"></div>
</div>
</div>
</div>

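This is the URL I want to scrape: http://indocademy.com/soal/sbmptn/biologi/2013

The crawl goes fine everywhere except at #53, because there are three paragraph tags to extract there. (I had simply assumed that each question number has its first paragraph tag as the question, and I don't know how to extract all the paragraphs that come before the .listjawaban class.)

Please help.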

1 Answer

Stack Overflow user

Answered 2017-11-04 11:19:16

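The page at that URL does not have this structure, and the class .pertanyaan does not exist there, so I copied the HTML snippet into a script and used DomCrawler to get the four <p> elements.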

#!/usr/bin/php
<?php

require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$html = <<<'HTML'
<div class="pertanyaan">
    <p></p>
    <p>Karena mengalami mutasi, kromosom mengalami perubahan seperti pada gambar di bawah.</p>
    <p><img src="http://indocademy.com/images/ipa_2013_133/53_1.png" alt=""><br>Jenis mutasi tersebut adalah ....</p>
    <p></p>
    <div class="listjawaban">
        <div class="radiojawaban">
            <input type="radio" name="answer_dup_758" id="answer_dup_758_A" value="A" style="display:none" disabled="">
            <input type="radio" name="answer_758" id="answer_758_A" value="A" onclick="showbutton(758);">A.
        </div>
        <div class="pilihanjawaban">
            adisi
        </div>
    </div>
    <div class="listjawaban">
        <div class="radiojawaban">
            <input type="radio" name="answer_dup_758" id="answer_dup_758_B" value="B" style="display:none" disabled="">
            <input type="radio" name="answer_758" id="answer_758_B" value="B" onclick="showbutton(758);">B.
        </div>
        <div class="pilihanjawaban">
            delesi
        </div>
    </div>
    <div class="listjawaban">
        <div class="radiojawaban">
            <input type="radio" name="answer_dup_758" id="answer_dup_758_C" value="C" style="display:none" disabled="">
            <input type="radio" name="answer_758" id="answer_758_C" value="C" onclick="showbutton(758);">C.
        </div>
        <div class="pilihanjawaban">
            inversi
        </div>
    </div>
    <div class="listjawaban">
        <div class="radiojawaban">
            <input type="radio" name="answer_dup_758" id="answer_dup_758_D" value="D" style="display:none" disabled="">
            <input type="radio" name="answer_758" id="answer_758_D" value="D" onclick="showbutton(758);">D.
        </div>
        <div class="pilihanjawaban">
            duplikasi
        </div>
    </div>
    <div class="listjawaban">
        <div class="radiojawaban">
            <input type="radio" name="answer_dup_758" id="answer_dup_758_E" value="E" style="display:none" disabled="">
            <input type="radio" name="answer_758" id="answer_758_E" value="E" onclick="showbutton(758);">E.
        </div>
        <div class="pilihanjawaban">
            translokasi
        </div>
    </div>
    <div class="buttons">
        <input type="button" class="tombol_jawab" id="tombol_jawab_758" value="Jawab" style="display:none" onclick="executejawaban(758,&quot;http://indocademy.com&quot;)"><input type="button" class="tombol_clear" id="tombol_clear_758" value="Hapus" style="display:none"
          onclick="clearjawaban(758)">
    </div>

    <div class="kunci" id="kunci_758" style="display: none">
        <div class="tulisanjawab abu">
            <input type="button" id="tombol_kunci" value="+" class="jawaban_758" onclick="showkunci(this)"> Jawaban : <img id="loading_758" src="http://indocademy.com/images/loading.gif" style="height:12px;vertical-align:middle">
            <span id="hasil_758"> </span>
        </div>
        <div class="konten_kunci">
            <div class="konten_jawaban_758" id="isi_jawaban"></div>
        </div>
    </div>
</div>
HTML;

// Parse the HTML snippet.
$crawler = new Crawler($html);

// Collect the inner HTML of every <p> inside .pertanyaan.
$output = $crawler->filter('.pertanyaan p')->each(function (Crawler $node) {
    return $node->html();
});

print_r($output);

The each() method returns an array with the inner HTML of the four <p> elements (two of them are empty). The resulting array is:

Array
(
    [0] =>
    [1] => Karena mengalami mutasi, kromosom mengalami perubahan seperti pada gambar di bawah.
    [2] => <img src="http://indocademy.com/images/ipa_2013_133/53_1.png" alt=""><br>Jenis mutasi tersebut adalah ....
    [3] =>
)
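
The array above still includes the two empty paragraphs, and the '.pertanyaan p' selector would also match any <p> placed after the answer list. One possible way to narrow it down, as a minimal sketch rather than a definitive solution: walk each .pertanyaan block separately and use filterXPath() to keep only the <p> elements that have a following sibling <div class="listjawaban">, then drop the empty strings. The shortened two-question HTML below is made up for illustration. The sketch assumes that symfony/dom-crawler and symfony/css-selector are installed through Composer and that the class attribute is exactly "listjawaban".

```php
<?php

require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical, shortened markup with two questions (not taken from the real page).
$html = <<<'HTML'
<div class="pertanyaan">
    <p></p>
    <p>First question, part one.</p>
    <p>First question, part two.</p>
    <p></p>
    <div class="listjawaban"><div class="pilihanjawaban">adisi</div></div>
    <div class="listjawaban"><div class="pilihanjawaban">delesi</div></div>
</div>
<div class="pertanyaan">
    <p>Second question.</p>
    <div class="listjawaban"><div class="pilihanjawaban">inversi</div></div>
</div>
HTML;

$crawler = new Crawler($html);

// One entry per .pertanyaan block, each holding that block's question paragraphs.
$questions = $crawler->filter('.pertanyaan')->each(function (Crawler $question) {
    // Keep only <p> elements that are followed by a sibling answer list.
    // Assumption: the class attribute is exactly "listjawaban".
    $paragraphs = $question
        ->filterXPath('//p[following-sibling::div[@class="listjawaban"]]')
        ->each(function (Crawler $p) {
            return trim($p->html());
        });

    // Drop the empty <p></p> placeholders and reindex from 0.
    return array_values(array_filter($paragraphs, function ($text) {
        return $text !== '';
    }));
});

print_r($questions);
```

With this grouping, $questions[$i] holds all paragraphs of the i-th question, so a question with several paragraphs (like #53) no longer shifts the indexes of the questions after it.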
Votes: 1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/46782627
