I am trying to scrape the bookmaker's odds from this page:
https://www.interwetten.com/en/sportsbook/top-leagues?topLinkId=1
So far, I have written the following code:
library(rvest)
interwetten <- read_html("https://www.interwetten.com/en/sportsbook/top-leagues?topLinkId=1")
bundesliga <- html_nodes(interwetten, xpath = '//*[@id="TBL_Content_1019"]')
bundesliga_teams <- html_nodes(bundesliga, "span")
The output I get is:
[1] <span id="ctl00_cphMain_UCOffer_LeagueList_rptLeague_ctl00_ucBettingContainer_lblClose" clas ...
[2] <span itemscope="itemscope" itemprop="location" itemtype="http://schema.org/Place"><meta ite ...
[3] <span itemprop="name">VfB Stuttgart</span>
[4] <span>X</span>
Now I want to extract the team name from each <span itemprop="name"></span>, but I don't know how to do that. I tried using html_node and html_attr, but it didn't work.
Posted on 2017-10-09 15:30:06
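You can make the XPath selector more specific so that it matches only the spans with itemprop="name", and then use html_text to extract their contents.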
library(rvest)
interwetten <- 'https://www.interwetten.com/en/sportsbook/top-leagues?topLinkId=1' %>%
  read_html()
teams <- interwetten %>%
  html_nodes(xpath = '//*[@id="TBL_Content_1019"]//span[@itemprop="name"]') %>%
  html_text()
teams
#> [1] "VfB Stuttgart" "1. FC Cologne" "Mainz 05"
#> [4] "Hamburger SV" "Hertha BSC" "Schalke 04"
#> [7] "Hannover 96" "Frankfurt" "Hoffenheim"
#> [10] "Augsburg" "Bayern Munich" "Freiburg"
#> [13] "Dortmund" "RB Leipzig" "Leverkusen"
#> [16] "Wolfsburg" "Werder Bremen" "Monchengladbach"
https://stackoverflow.com/questions/46649623
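Since the live page may change or become unavailable, the same idea can be tried offline. The following is a minimal sketch, assuming a small hand-written HTML fragment (the `html` string below is hypothetical and far simpler than the real page). It shows the narrowed XPath selector, what should be an equivalent CSS selector, and why html_attr is not the right tool here: it returns attribute values, whereas html_text returns the text inside the node.

```r
library(rvest)

# Hypothetical, simplified stand-in for the real page markup (an assumption).
html <- paste0(
  '<div id="TBL_Content_1019">',
  '<span itemprop="name">VfB Stuttgart</span>',
  '<span>X</span>',
  '<span itemprop="name">1. FC Cologne</span>',
  '</div>'
)

page <- read_html(html)

# XPath: only spans with itemprop="name" inside the element with the given id
teams_xpath <- page %>%
  html_nodes(xpath = '//*[@id="TBL_Content_1019"]//span[@itemprop="name"]') %>%
  html_text()

# CSS selector targeting the same nodes
teams_css <- page %>%
  html_nodes('#TBL_Content_1019 span[itemprop="name"]') %>%
  html_text()

# html_attr() reads an attribute value, not the node text
attr_values <- page %>%
  html_nodes('#TBL_Content_1019 span[itemprop="name"]') %>%
  html_attr("itemprop")

print(teams_xpath)
print(attr_values)
```

Whether to use XPath or CSS is largely a matter of taste here; the CSS form is shorter, while XPath makes the attribute predicate explicit.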