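I have a ton of Google Maps URLs and would like to get a clean address out of each URL for geocoding. I recently found getURL() in the RCurl package, which returns a large amount of information:
library(RCurl)
getURL("https://maps.google.com/?q=loc%3A+%32%34%34%30+Seattle%2C+%39%38%31%31%36+WA+US")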
But what I'm really interested in is the address fragment located near the front of the getURL() output:
...<meta content=\"loc:2440 Seattle, 98116 WA US - Google \" property=\"og:title\">
Update: I just realized the URL above is a poor example, so here is a different one:
getURL("https://maps.google.com/?q=loc%3A+%31%30%30%35%36+Interlake+Ave+N+seattle+WA+US")
...<meta content=\"loc:10056 Interlake Ave N seattle WA US - Google \" property=\"og:title\">
Does anyone have suggestions for how to solve this efficiently? For context, I'm an intermediate R user, and I hope you can help. Thanks!!
Posted on 2014-08-12 22:09:41
Use the Google geocoding API as follows:
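library(XML)
# Base URL of the geocoding endpoint (XML output).
# Note: Google now requires an API key (&key=...) for this endpoint.
burl <- "http://maps.google.com/maps/api/geocode/xml?address="
address <- "2440 Seattle, 98116 WA US"
# Percent-encode the address and append it to the base URL
request <- paste0(burl, URLencode(address))
doc <- htmlTreeParse(request, useInternalNodes=TRUE)
# Interpreted address
xmlValue(doc[["//formatted_address"]])
[1] "2440, Seattle-Tacoma International Airport (SEA), Seattle, WA 98158, USA"
Edit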
If you only have the encoded URL, decode it with URLdecode instead of downloading the page:
URL <- "https://maps.google.com/?q=loc%3A+%32%34%34%30+Seattle%2C+%39%38%31%31%36+WA+US"
URL <- gsub(".*loc","",URL) # Get rid of everything up to and including "loc"
URL <- URLdecode(URL) # Decode the percent-encoded characters
gsub("[:]|[+]", " ", URL) # Replace ":" and "+" with spaces
[1] "  2440 Seattle, 98116 WA US"
https://stackoverflow.com/questions/25274733
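Since the question mentions a large number of URLs, the decoding approach can be wrapped in a small helper that works on a whole character vector. This is only a minimal sketch in base R: `extract_address` is a hypothetical name, not part of any package, and it assumes every URL carries the address in a `q=loc%3A...` query parameter, as in the two example URLs from the question.

```r
# Hypothetical helper (not from the original answer): pull the address out of
# Google Maps URLs of the form ...?q=loc%3A+<address>.
extract_address <- function(urls) {
  # Keep only the text after "q=" (assumes q is the last query parameter).
  query <- sub("^.*[?&]q=", "", urls)
  # Spaces are encoded as "+"; convert them before decoding so that an
  # encoded literal plus sign (%2B) would survive.
  query <- gsub("+", " ", query, fixed = TRUE)
  # Apply URLdecode() element by element rather than relying on it
  # being vectorized.
  decoded <- vapply(query, URLdecode, character(1), USE.NAMES = FALSE)
  # Drop the leading "loc:" marker and any surrounding whitespace.
  trimws(sub("^loc:", "", decoded))
}

urls <- c(
  "https://maps.google.com/?q=loc%3A+%32%34%34%30+Seattle%2C+%39%38%31%31%36+WA+US",
  "https://maps.google.com/?q=loc%3A+%31%30%30%35%36+Interlake+Ave+N+seattle+WA+US"
)
addresses <- extract_address(urls)
print(addresses)
```

The cleaned strings could then be passed one at a time as `address` to a geocoding request. URLs that do not follow the `q=loc%3A...` pattern would need separate handling.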