I want to get the autocomplete suggestions returned by the URL below as a character vector.
library(rvest); library(XML); library(dplyr);library(tidyr); library(purrr); library(rjson)
url <- "http://suggestqueries.google.com/complete/search?client=chrome&q=Nike"
nike_autocomplete <- read_html(url)

The output should look like this:
[1] "Nike" "nike air" "nike air max" "nike schuhe" "nike air force" "nike air max 97"
[7] "nike tn" "nike id" "nike air max 270" "nike vapormax" "nike pullover" "nike schweiz"
[13] "nike 97" "nike off white" "nike air max plus" "nike winterschuhe" "nike schuhe damen" "nike huarache"
[19] "nike shoes" "nike logo" "nike air max 90"

So there should be no empty entries at the end.
Posted on 2019-02-05 21:28:29
The text you want to extract is in JSON format, so a JSON parser is a much better tool for this than regex. I like jsonlite::fromJSON for this.
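library(rvest)
library(jsonlite)
library(magrittr)  # provides extract(); purrr does not export it
url <- "http://suggestqueries.google.com/complete/search?client=chrome&q=Nike"
read_html(url) %>%
  html_text() %>%   # the response body is a JSON string
  fromJSON() %>%    # parse it into a list
  extract(1:2) %>%  # keep the query and the suggestions
  unlist()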
# [1] "Nike" "nike shox" "nike shoes" "nike air max"
# [5] "nike outlet" "nike air force 1" "nike basketball shoes" "nike vapormax"
# [9] "nike air max 97" "nike id" "nike store" "nike stock"
# [13] "nike air max 270" "nike promo code" "nike windbreaker" "nike sweatshirts"
# [17] "nike huarache" "nike hoodie" "nike cortez" "nike sweatpants"
# [21] "nike slides"
https://stackoverflow.com/questions/54541784
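To see why taking only the first two elements avoids the empty entries, here is a minimal offline sketch. The string `sample_json` is a made-up payload, and its shape (query, suggestions, a vector of empty descriptions, an empty array, a metadata object) is an assumption about what the endpoint returns, not a captured response. The `nzchar()` filter is an extra safeguard, not part of the answer above.

```r
library(jsonlite)

# Hypothetical payload; the shape is assumed, not copied from a real response:
# [query, [suggestions], [descriptions], [], {metadata}]
sample_json <- '["Nike",["nike shoes","nike air max","nike id"],["","",""],[],{"google:suggesttype":["QUERY","QUERY","QUERY"]}]'

# A mixed top-level array is parsed into a plain R list
parsed <- fromJSON(sample_json)

# Elements 1 and 2 hold the query and its suggestions;
# the empty strings live in element 3 and are left out
suggestions <- unlist(parsed[1:2])

# Extra safeguard: drop any empty strings that might still be present
suggestions <- suggestions[nzchar(suggestions)]

print(suggestions)
```

If the real response is shaped differently, inspect it with str(parsed) and adjust the index accordingly.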