Website: https://www.moe.gov.sg/schoolfinder/schooldetail?schoolname=ZHONGHUA-SECONDARY-SCHOOL
I only want to extract the information under "DSA talent areas offered in 2021".
However, when I use SelectorGadget, the path I get is .is--open:nth-child(4) .moe-collapsible__content
dsa <- html_node(listpage,".is--open:nth-child(4) .moe-collapsible__content") %>% html_text() %>% unlist()
The output of dsa is NA.
Is there any way to get the information out of the collapsible content?
Posted 2022-03-14 12:52:39
One approach is:
library(rvest)
library(dplyr)
library(stringr)
'https://www.moe.gov.sg/schoolfinder/schooldetail?schoolname=ZHONGHUA-SECONDARY-SCHOOL' %>%
  read_html() %>%
  html_nodes('.moe-collapsible__content') %>%
  html_nodes('.moe-list') %>%
  html_text() %>%
  nth(3) %>%
  str_split('\n')
[[1]]
[1] "Leadership and Character (Girls and Boys)\r"
[2] " Chinese Orchestra (Girls and Boys)\r"
[3] " Choir (Girls and Boys)\r"
[4] " Concert Band (Girls and Boys)\r"
[5] " Guzheng Ensemble (Girls and Boys)\r"
[6] " Badminton (Girls)\r"
[7] " Basketball (Girls)\r"
[8] " Table Tennis (Boys)\r"
[9] " Volleyball (Boys)\r"
Posted 2022-03-14 13:20:50
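The entries in the output above still carry a trailing "\r" and a leading space. A minimal base R sketch for tidying them is shown below. The string raw is a hypothetical, shortened stand-in for what html_text() returns, so this runs without fetching the page; the names raw and areas are illustrative and not part of the original answer.

```r
# Hypothetical sample imitating the raw text of the third .moe-list node
raw <- "Leadership and Character (Girls and Boys)\r\n Chinese Orchestra (Girls and Boys)\r\n Choir (Girls and Boys)\r"

# Split on newlines, then strip surrounding whitespace (including "\r")
areas <- trimws(strsplit(raw, "\n", fixed = TRUE)[[1]])

# Drop any empty strings left over from blank lines
areas <- areas[nzchar(areas)]

print(areas)
```

With stringr already loaded, str_trim() applied to the result of str_split() should serve the same purpose.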
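You can be more precise by using the :contains pseudo-class to target the correct parent div, then a descendant combinator to move down to the child li elements. Matching on a partial string, without the year, may also give you some future-proofing for 2022.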
library(magrittr)
library(rvest)
read_html("https://www.moe.gov.sg/schoolfinder/schooldetail?schoolname=ZHONGHUA-SECONDARY-SCHOOL") %>%
  html_elements('.moe-collapsible:contains("DSA talent areas") li') %>%
  html_text()

https://stackoverflow.com/questions/71467602
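To see how the :contains selector picks out one collapsible section by its heading text, here is a small offline sketch. The HTML passed to minimal_html() is made up for illustration and only loosely imitates the structure of the real page; the heading tags and list items are assumptions.

```r
library(rvest)

# Hypothetical markup: two collapsible sections, only one about DSA
page <- minimal_html('
  <div class="moe-collapsible">
    <h3>Subjects offered</h3>
    <ul><li>Mathematics</li></ul>
  </div>
  <div class="moe-collapsible">
    <h3>DSA talent areas offered in 2021</h3>
    <ul><li>Choir</li><li>Badminton</li></ul>
  </div>
')

# Keep only the section whose text contains the partial string,
# then descend to its li elements
areas <- page %>%
  html_elements('.moe-collapsible:contains("DSA talent areas") li') %>%
  html_text()

print(areas)
```

Because the match is on "DSA talent areas" rather than the full heading, the same selector should still work if the year in the heading changes.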