
Beautiful Soup: finding and navigating HTML

Stack Overflow user
Asked on 2019-02-22 15:21:16
Answers: 2, Views: 129, Followers: 0, Votes: 0

I want to extract the timetable from this site.

In particular, I want the text contained in

div #tabs-4 > h3 > a > span 

I tried the following, but it only returns the first item, not the full tree under it. Crazily enough, the site uses #tabs-4 four times.

alilauro_timetable = []
alilauro_departures_table = soup.select('#tabs-4')
for div in alilauro_departures_table:
    # Spans are read by position within each #tabs-4 div
    span = div.select('span')
    alilauro_timetable.append({
        "COMPANY": span[2].text,
        "DEPARTURE DATE TIME": span[0].text,
        "ARRIVAL DATE TIME": span[4].text,
        "ITINERARIO": span[1].text,
        "FERRY NAME": span[3].text
    })
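
To make the selector from the question concrete, here is a minimal sketch that loops over every element sharing the id and reads the spans under `h3 > a`. The HTML snippet is made up (it is not the real page), `FIELDS` and `parse_rows` are hypothetical names, and the column order is only inferred from the indexes used in the attempt above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (NOT the real page): two divs that reuse the same id
HTML = """
<div id="tabs-4"><h3 id="Leg1_1"><a>
  <span>Ven 22 Feb 2019, 07:05</span><span>NAPOLI(BEVERELLO) - ISCHIA</span>
  <span>ALILAURO</span><span>AIRONE JET</span><span>Ven 22 Feb 2019, 08:05</span>
</a></h3></div>
<div id="tabs-4"><h3 id="Leg2_1"><a>
  <span>Ven 22 Feb 2019, 06:30</span><span>ISCHIA - NAPOLI(BEVERELLO)</span>
  <span>ALILAURO</span><span>CELESTINA LAURO</span><span>Ven 22 Feb 2019, 07:30</span>
</a></h3></div>
"""

# Assumed span order, taken from the indexes in the question's code
FIELDS = ["DEPARTURE DATE TIME", "ITINERARIO", "COMPANY",
          "FERRY NAME", "ARRIVAL DATE TIME"]


def parse_rows(html):
    """Return one dict per <h3> found directly under any #tabs-4 element."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # select() returns every match, even when the id is (invalidly) repeated
    for h3 in soup.select("#tabs-4 > h3"):
        texts = [s.get_text(strip=True) for s in h3.select("a > span")]
        if len(texts) >= len(FIELDS):  # skip headers that lack the five spans
            rows.append(dict(zip(FIELDS, texts)))
    return rows


if __name__ == "__main__":
    for row in parse_rows(HTML):
        print(row)
```

Building one row per `h3` rather than per `div` avoids the fixed `span[0]`..`span[4]` indexes picking up only the first item when a div holds several rows.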

2 Answers

Stack Overflow user

Accepted answer

Posted on 2019-02-22 16:05:14

Try the code below. You don't need to select #tabs-4, because you are already using the URL link.

import bs4
import re
import requests
html_doc=requests.get("https://alilauronew.forth-crs.gr/italian_b2c/npgres.exe?func=TT&tripcount=1&StartDateLeg1=22%2F02%2F2019&StartDateLeg2=22%2F02%2F2019&StartDateLeg3=22%2F02%2F2019&StartDateLeg4=22%2F02%2F2019&Leg1ilabel=NAPOLI%28BEVERELLO%29&Leg1i=BEV&Leg1iilabel=ISCHIA&Leg1ii=ISH&Leg1Date=22%2F02%2F2019&Leg2ilabel=ISCHIA&Leg2i=ISH&Leg2iilabel=NAPOLI%28BEVERELLO%29&Leg2ii=BEV&Leg2Date=22%2F02%2F2019&Leg3ilabel=NAPOLI%28BEVERELLO%29&Leg3i=BEV&Leg3iilabel=FORIO&Leg3ii=FRD&Leg3Date=22%2F02%2F2019&Leg4ilabel=FORIO&Leg4i=FRD&Leg4iilabel=NAPOLI%28BEVERELLO%29&Leg4ii=BEV&Leg4Date=22%2F02%2F2019&TotalPassengers=1&TotalVehicles=0")
soup = bs4.BeautifulSoup(html_doc.text, 'html.parser')
# Match every <h3> whose id contains "Leg1"
headers = soup.find_all('h3', id=re.compile("Leg1"))

for h in headers:
    # Print the text of each <span> under the matched header
    for span in h.find_all('span'):
        print(span.text)
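
Note that Beautiful Soup applies a compiled pattern with search semantics, so `re.compile("Leg1")` matches any id that contains "Leg1" anywhere. As a rough sketch, an anchored pattern can also be used to group the spans by leg number. The id format (`Leg<number>_...`) and the sample HTML are assumptions for illustration, and `spans_by_leg` is a hypothetical helper.

```python
import re
from collections import defaultdict

from bs4 import BeautifulSoup

# Hypothetical markup; the real ids on the site may look different
HTML = """
<h3 id="Leg1_1"><a><span>07:05</span><span>NAPOLI(BEVERELLO) - ISCHIA</span></a></h3>
<h3 id="Leg2_1"><a><span>06:30</span><span>ISCHIA - NAPOLI(BEVERELLO)</span></a></h3>
<h3 id="other"><a><span>ignored</span></a></h3>
"""

# Anchored at the start so that only ids beginning with "Leg<digits>" match
LEG_ID = re.compile(r"^Leg(\d+)")


def spans_by_leg(html):
    """Map leg number -> list of rows, each row being a list of span texts."""
    soup = BeautifulSoup(html, "html.parser")
    legs = defaultdict(list)
    for h3 in soup.find_all("h3", id=LEG_ID):
        leg = int(LEG_ID.match(h3["id"]).group(1))
        legs[leg].append([s.get_text(strip=True) for s in h3.find_all("span")])
    return dict(legs)


if __name__ == "__main__":
    for leg, rows in sorted(spans_by_leg(HTML).items()):
        print(leg, rows)
```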
Votes: 1

Stack Overflow user

Posted on 2019-02-22 16:17:08

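The main problem is that only the first item of the table is in the HTML itself; the other items are in JavaScript. So you need to either use requests, as Kajal did, or use Selenium.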
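
One way to check this kind of thing yourself is to look for a known value in the static markup and in the `<script>` blocks separately. The sketch below does that on a made-up snippet; the HTML and the `where_is` helper are hypothetical and say nothing about how the real page is built.

```python
from bs4 import BeautifulSoup

# Hypothetical page: one row in the markup, the rest held in a script
HTML = """
<div id="tabs-4"><h3><a><span>07:05</span></a></h3></div>
<script>var rows = ["07:35", "09:40"];</script>
"""


def where_is(html, needle):
    """Return 'markup', 'script' or 'absent' for the given text."""
    soup = BeautifulSoup(html, "html.parser")
    in_script = any(needle in (s.string or "") for s in soup.find_all("script"))
    # Remove script/style so get_text() only sees the static markup
    for tag in soup.find_all(["script", "style"]):
        tag.decompose()
    if needle in soup.get_text():
        return "markup"
    return "script" if in_script else "absent"


if __name__ == "__main__":
    for value in ("07:05", "09:40", "23:59"):
        print(value, where_is(HTML, value))
```

If a value only shows up under "script" (or not at all), a plain `requests` + Beautiful Soup parse of that response will not see it as an element, which is when a browser driver such as Selenium becomes useful.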

Selenium code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")

# Assumes Selenium 4: the driver path is passed via Service, options via `options=`
driver = webdriver.Chrome(service=Service(r'your path'), options=options)
driver.get('https://alilauronew.forth-crs.gr/italian_b2c/npgres.exe?func=TT&tripcount=1&StartDateLeg1=22%2F02%2F2019&StartDateLeg2=22%2F02%2F2019&StartDateLeg3=22%2F02%2F2019&StartDateLeg4=22%2F02%2F2019&Leg1ilabel=NAPOLI%28BEVERELLO%29&Leg1i=BEV&Leg1iilabel=ISCHIA&Leg1ii=ISH&Leg1Date=22%2F02%2F2019&Leg2ilabel=ISCHIA&Leg2i=ISH&Leg2iilabel=NAPOLI%28BEVERELLO%29&Leg2ii=BEV&Leg2Date=22%2F02%2F2019&Leg3ilabel=NAPOLI%28BEVERELLO%29&Leg3i=BEV&Leg3iilabel=FORIO&Leg3ii=FRD&Leg3Date=22%2F02%2F2019&Leg4ilabel=FORIO&Leg4i=FRD&Leg4iilabel=NAPOLI%28BEVERELLO%29&Leg4ii=BEV&Leg4Date=22%2F02%2F2019&TotalPassengers=1&TotalVehicles=0')

# The page reuses id="tabs-4", so collect every matching div
x = driver.find_elements(By.CSS_SELECTOR, "div#tabs-4")
for div in x:
    print(div.text)

# Close the browser and end the driver session
driver.quit()

Output:

| | Ven 22 Feb 2019, 07:05 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | AIRONE JET| Ven 22 Feb 2019, 08:05
| | Ven 22 Feb 2019, 07:35 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 08:35
| | Ven 22 Feb 2019, 09:40 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 10:40
| | Ven 22 Feb 2019, 10:50 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | AIRONE JET | Ven 22 Feb 2019, 11:50
| | Ven 22 Feb 2019, 12:55 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 13:55
| | Ven 22 Feb 2019, 14:35 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | NETTUNO JET | Ven 22 Feb 2019, 15:35
| | Ven 22 Feb 2019, 15:35 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 16:35
| | Ven 22 Feb 2019, 17:55 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 18:55
| | Ven 22 Feb 2019, 20:20 | NAPOLI(BEVERELLO) - ISCHIA | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 21:20
| | Ven 22 Feb 2019, 06:30 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 07:30
| | Ven 22 Feb 2019, 07:10 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | NETTUNO JET | Ven 22 Feb 2019, 08:10
| | Ven 22 Feb 2019, 08:40 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 09:40
| | Ven 22 Feb 2019, 09:35 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | AIRONE JET | Ven 22 Feb 2019, 10:35
| | Ven 22 Feb 2019, 11:45 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 12:45
| | Ven 22 Feb 2019, 13:20 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | AIRONE JET | Ven 22 Feb 2019, 14:20
| | Ven 22 Feb 2019, 14:05 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 15:05
| | Ven 22 Feb 2019, 16:15 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | NETTUNO JET | Ven 22 Feb 2019, 17:15
| | Ven 22 Feb 2019, 16:50 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 17:50
| | Ven 22 Feb 2019, 19:10 | ISCHIA - NAPOLI(BEVERELLO) | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 20:10
| | Ven 22 Feb 2019, 07:05 | NAPOLI(BEVERELLO) - FORIO | ALILAURO | AIRONE JET | Ven 22 Feb 2019, 08:30
| | Ven 22 Feb 2019, 09:40 | NAPOLI(BEVERELLO) - FORIO | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 11:05
| | Ven 22 Feb 2019, 10:50 | NAPOLI(BEVERELLO) - FORIO | ALILAURO | AIRONE JET | Ven 22 Feb 2019, 12:15
| | Ven 22 Feb 2019, 14:35 | NAPOLI(BEVERELLO) - FORIO | ALILAURO | NETTUNO JET | Ven 22 Feb 2019, 16:00
| | Ven 22 Feb 2019, 17:20 | NAPOLI(BEVERELLO) - FORIO | ALILAURO | NETTUNO JET | Ven 22 Feb 2019, 18:45
| | Ven 22 Feb 2019, 06:45 | FORIO - NAPOLI(BEVERELLO) | ALILAURO | NETTUNO JET | Ven 22 Feb 2019, 08:10
| | Ven 22 Feb 2019, 09:15 | FORIO - NAPOLI(BEVERELLO) | ALILAURO | AIRONE JET | Ven 22 Feb 2019, 10:35
| | Ven 22 Feb 2019, 11:20 | FORIO - NAPOLI(BEVERELLO) | ALILAURO | CELESTINA LAURO | Ven 22 Feb 2019, 12:45
| | Ven 22 Feb 2019, 13:00 | FORIO - NAPOLI(BEVERELLO) | ALILAURO | AIRONE JET | Ven 22 Feb 2019, 14:20
| | Ven 22 Feb 2019, 15:55 | FORIO - NAPOLI(BEVERELLO) | ALILAURO | NETTUNO JET | Ven 22 Feb 2019, 17:15
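
If you want the dictionaries from the question rather than raw text, the pipe-separated lines above can be split into fields. This is only a sketch: `FIELDS` and `parse_line` are hypothetical names, and the column order is an assumption made by comparing the printed lines with the keys in the question.

```python
# Assumed column order of each printed line
FIELDS = ["DEPARTURE DATE TIME", "ITINERARIO", "COMPANY",
          "FERRY NAME", "ARRIVAL DATE TIME"]


def parse_line(line):
    """Turn one '| | a | b | c | d | e' line into a dict, or None if malformed."""
    cells = [c.strip() for c in line.split("|")]
    cells = [c for c in cells if c]  # drop the empty leading cells
    if len(cells) != len(FIELDS):
        return None
    return dict(zip(FIELDS, cells))


if __name__ == "__main__":
    sample = ("| | Ven 22 Feb 2019, 07:05 | NAPOLI(BEVERELLO) - ISCHIA "
              "| ALILAURO | AIRONE JET| Ven 22 Feb 2019, 08:05")
    print(parse_line(sample))
```

With Selenium you would call it on each line of `div.text`, for example `[parse_line(l) for l in div.text.splitlines()]`, and filter out the `None` results.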
Votes: 1
Original page content provided by Stack Overflow; translation support by Tencent Cloud's Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/54830177
