Q: Pulling an href out of a span tag

Stack Overflow user

Asked 2020-03-08 06:22:50

Answers: 1 | Views: 27 | Followers: 0 | Votes: 1

I can't figure out how to correctly extract the hrefs from the block below, specifically the IDs (hillge01, masonfr01).

<div>
  <strong>Inactive: </strong>
  <span><strong>MIL</strong></span>
  <a href="/players/h/hillge01.html">George Hill</a>,
  <a href="/players/m/masonfr01.html">Frank Mason</a>,
  <a href="/players/r/reynoca01.html">Cameron Reynolds</a>,
  <a href="/players/w/wilsodj01.html">D.J. Wilson</a>

  <span><strong>LAL</strong> </span>
  <a href="/players/a/antetko01.html">Kostas Antetokounmpo</a>,
  <a href="/players/c/cacokde01.html">Devontae Cacok</a>,
  <a href="/players/h/hortota01.html">Talen Horton-Tucker</a>,
  <a href="/players/w/waitedi01.html">Dion Waiters</a>
</div>

So far I have managed to extract the first href with the code below, but I haven't found a way to return the remaining hrefs.

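soup = get_soup(date_team)  # get_soup and date_team are the asker's own helper and input, not shown here

for strong_tag in soup.find_all('strong'):
    if 'Inactive' in strong_tag.text:
        # Skip the whitespace text node and take the next node after the label
        str1 = strong_tag.next_sibling.next_sibling
        print(str1)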

Any help with this would be greatly appreciated.

python

beautifulsoup

1 Answer

Stack Overflow user

Posted 2020-03-08 12:59:22

Try this. A solution using SimplifiedDoc.

from simplified_scrapy import SimplifiedDoc
html = '''
<div>
  <strong>Inactive: </strong>
  <span><strong>MIL</strong> </span>
  <a href="/players/h/hillge01.html">George Hill</a>,
  <a href="/players/m/masonfr01.html">Frank Mason</a>, 
  <a href="/players/r/reynoca01.html">Cameron Reynolds</a>, 
  <a href="/players/w/wilsodj01.html">D.J. Wilson</a> 
  <span><strong>LAL</strong> </span>
  <a href="/players/a/antetko01.html">Kostas Antetokounmpo</a>, 
  <a href="/players/c/cacokde01.html">Devontae Cacok</a>,
  <a href="/players/h/hortota01.html">Talen Horton-Tucker</a>, 
  <a href="/players/w/waitedi01.html">Dion Waiters</a>
</div>
'''
doc = SimplifiedDoc(html)
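# Locate the <strong> label, then step through the links that follow it
strong = doc.getElementByText('Inactive', tag='strong')
link = strong.getNext('a')  # renamed from `next` to avoid shadowing the builtin
print(link)
link = link.next
print(link)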

Result:

{'href': '/players/h/hillge01.html', 'tag': 'a', 'html': 'George Hill'}
{'href': '/players/m/masonfr01.html', 'tag': 'a', 'html': 'Frank Mason'}
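The snippet above prints only the first two links. As a rough sketch of one way to collect every ID with BeautifulSoup, the library the question is tagged with, the code below walks the tags that follow the "Inactive" label and groups the IDs by team. The function name `inactive_ids` and the shortened sample HTML are illustrative assumptions, and it assumes the team spans and player links are direct siblings of the label, as in the question's markup.

```python
import posixpath

from bs4 import BeautifulSoup

# Shortened copy of the question's markup (two players per team)
html = """
<div>
  <strong>Inactive: </strong>
  <span><strong>MIL</strong></span>
  <a href="/players/h/hillge01.html">George Hill</a>,
  <a href="/players/m/masonfr01.html">Frank Mason</a>
  <span><strong>LAL</strong> </span>
  <a href="/players/a/antetko01.html">Kostas Antetokounmpo</a>,
  <a href="/players/w/waitedi01.html">Dion Waiters</a>
</div>
"""


def inactive_ids(soup):
    """Map each team abbreviation to the player IDs listed after it.

    Hypothetical helper; assumes spans and links are siblings of the label.
    """
    # Find the <strong> whose text contains "Inactive", as the question does
    label = next(
        (s for s in soup.find_all("strong") if "Inactive" in s.get_text()),
        None,
    )
    if label is None:
        return {}

    teams = {}
    current = None
    # find_next_siblings returns only tags, so commas and whitespace are skipped
    for tag in label.find_next_siblings(["span", "a"]):
        if tag.name == "span":
            # A span starts a new team group, e.g. "MIL"
            current = tag.get_text(strip=True)
            teams[current] = []
        elif current is not None and tag.get("href"):
            # "/players/h/hillge01.html" -> "hillge01"
            filename = posixpath.basename(tag["href"])
            teams[current].append(posixpath.splitext(filename)[0])
    return teams


result = inactive_ids(BeautifulSoup(html, "html.parser"))
print(result)
```

If the grouping by team is not needed, flattening the dictionary values gives a plain list of IDs.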

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/60582757
