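I'm working on a project where I need the shortest driving distance and time from "pickup" to "dropoff" coordinates. My dataset has the variables "trip_distance" and "pickup_date". My task is to calculate how much "trip_distance" deviates from Google's estimated distance, and to compute the time each trip takes while controlling for the departure time.
Here is a small sample of my data (there are about 1.5 million rows, and I'm also trying to find a way around the <2500 query limit):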
  trip_distance pickup_datetime     pickup                dropoff
1 8.1           2011-01-01 23:13:56 40.77419%2C-73.872608 40.78055%2C-73.955042
2 10.6          2011-01-04 17:12:49 40.7737%2C-73.870721  40.757007%2C-73.971953
3 15.9          2011-01-05 18:41:53 40.773761%2C-73.87086 40.707277%2C-74.007301

Code:

library(ggmap)
rownames(X) <- NULL
res <- mapdist(from= X$pickup,
to = X$dropoff,
mode = "driving" ,
output = "simple", messaging = FALSE, sensor = FALSE,
language = "en-EN", override_limit = FALSE, departure_time= X$pickup_date)

The error I get is:

Error in mapdist(from = X$pickup, to = X$dropoff, mode = "driving", output = "simple", : unused argument (departure_time = X$pickup_date)

Is there a way to account for traffic using ggmap?

dput(head(X))

structure(list(pickup_datetime = structure(c(1293923636, 1294161169,
1294252913, 1294259376, 1294419723, 1293903309), class = c("POSIXct",
"POSIXt"), tzone = ""), trip_distance = c(8.1, 10.6, 15.9, 8.9,
11.5, 9.6), pickup = c("40.77419,-73.872608", "40.7737,-73.870721",
"40.773761,-73.87086", "40.773776,-73.870908", "40.774161,-73.87302",
"40.774135,-73.8749"), dropoff = c("40.78055,-73.955042", "40.757007,-73.971953",
"40.707277,-74.007301", "40.770568,-73.95468", "40.758284,-73.986621",
"40.758691,-73.961359")), .Names = c("pickup_datetime", "trip_distance",
"pickup", "dropoff"), row.names = c(NA, 6L), class = "data.frame")

Answered 2016-06-24 22:35:41

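I wrote the package googleway to access the Google Maps API. It lets you specify your API key and so use the features the API provides (such as departure time and traffic).
However, to do this you will need the development version, because I noticed a small bug in traffic_model. It will be fixed in the next release.
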
devtools::install_github("SymbolixAU/googleway")
library(googleway)
key <- "your_api_key"
## data.frame of origin & destination coordinates
## you can obviously add in a 'pickup' datetime column too,
## but remembering that for Google API it must be in the future
df <- data.frame(orig_lat = c(40.77419, 40.7737, 40.773761),
orig_lon = c(-73.872608, -73.870721, -73.87086),
dest_lat = c(40.78055, 40.757007, 40.707277),
dest_lon = c(-73.955042, -73.971953,-74.007301))

Now you can use whichever looping method you prefer to get the distance between each set of points on each row of the data.frame.
For example:

lst <- apply(df, 1, function(x) {
google_distance(origins = list(c(x["orig_lat"], x["orig_lon"])),
destinations = list(c(x["dest_lat"], x["dest_lon"])),
departure_time = Sys.time() + (24 * 60 * 60),
traffic_model = "best_guess",
key = key)
})

You can then access the data from the returned list.

lst[[1]]$origin_addresses
# [1] "Central Terminal Dr, East Elmhurst, NY 11371, USA"
lst[[1]]$destination_addresses
# [1] "1294-1296 Lexington Ave, New York, NY 10128, USA"
lst[[1]]$rows$elements
# [[1]]
# distance.text distance.value duration.text duration.value duration_in_traffic.text duration_in_traffic.value status
# 1 12.8 km 12805 21 mins 1278 23 mins 1355 OK

Answered 2016-05-11 20:20:46

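The mapdist() function from ggmap does not return traffic information, because it does not appear to build its request with the &departure_time= and key= parameters (both are required to retrieve traffic information).
As mentioned in the Google Distance Matrix API documentation:

"For requests where the travel mode is driving: You can specify the departure_time to receive a route and trip duration (response field: duration_in_traffic) that take traffic conditions into account. This option is only available if the request contains a valid API key, or a valid Google Maps APIs Premium Plan client ID and signature."

Also, in your dataset pickup_date is in the past, so you cannot use it as the departure_time parameter:

"The departure_time must be set to the current time or some time in the future. It cannot be in the past."

It also needs to be in a numeric format:

"You can specify the time as an integer in seconds since midnight, January 1, 1970 UTC. Alternatively, you can specify a value of now, which sets the departure time to the current time (correct to the nearest second)."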
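
Because the API rejects past departure times, one workaround is to approximate a historical trip with a future departure on the same weekday at the same time of day. The following is only a sketch of that idea under this assumption. The helper name to_future_departure is hypothetical (it is not part of ggmap or googleway), and the time zone is assumed to be New York.

```r
# Hypothetical helper (not from any package): shift a past POSIXct forward in
# whole weeks until it lies in the future. Adding whole weeks in seconds keeps
# the weekday; the local clock time can drift by an hour across a DST change.
to_future_departure <- function(x, now = Sys.time()) {
  week <- 7 * 24 * 60 * 60                    # seconds in one week
  secs <- as.numeric(x)                       # seconds since 1970-01-01 UTC
  now_secs <- as.numeric(now)
  behind <- pmax(now_secs - secs, 0)          # how far in the past each time is
  # times already in the future are left unchanged
  weeks_ahead <- ifelse(secs > now_secs, 0, floor(behind / week) + 1)
  secs + weeks_ahead * week                   # numeric epoch seconds
}

# Assumption: the pickup times were recorded in New York local time
pickup <- as.POSIXct("2011-01-01 23:13:56", tz = "America/New_York")
departure <- to_future_departure(pickup)

# Plain-digit string, suitable for a departure_time URL parameter
format(departure, scientific = FALSE)
```

The returned value is a plain number of seconds since the epoch, which is the format the quoted documentation asks for. Whether traffic on a future date is a fair stand-in for traffic in 2011 is a modelling decision this sketch does not address.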
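Nonetheless, you can manually build your own request to the Google Distance Matrix API with the required parameters (note that I modified the initial dataset so that the pickup_datetime values are in the future).
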
APIKEY <- "your_api_key"  # your API key goes here
url_string <- paste0("https://maps.googleapis.com/maps/api/distancematrix/json",
                     "?origins=", df$pickup,
                     "&destinations=", df$dropoff,
                     # convert POSIXct to numeric (seconds since 1970-01-01)
                     "&departure_time=", as.numeric(df$pickup_datetime),
                     "&traffic_model=best_guess",
                     "&key=", APIKEY)

This gives you a character vector url_string containing all the URLs. For example, you can retrieve the information for the first entry:

connect <- url(url_string[1])
tree <- jsonlite::fromJSON(paste(readLines(connect), collapse = ""),
simplifyDataFrame = FALSE)

Then access the traffic information with:

tree$rows[[1]]$elements[[1]]$duration_in_traffic

Which gives:

$text
[1] "17 mins"
$value
[1] 1016

Data

df <- structure(list(pickup_datetime = structure(c(1473923636, 1474161169,
1474252913, 1474259376, 1474419723, 1473903309), class = c("POSIXct",
"POSIXt")), trip_distance = c(8.1, 10.6, 15.9, 8.9, 11.5, 9.6
), pickup = c("40.77419,-73.872608", "40.7737,-73.870721", "40.773761,-73.87086",
"40.773776,-73.870908", "40.774161,-73.87302", "40.774135,-73.8749"
), dropoff = c("40.78055,-73.955042", "40.757007,-73.971953",
"40.707277,-74.007301", "40.770568,-73.95468", "40.758284,-73.986621",
"40.758691,-73.961359")), class = "data.frame", .Names = c("pickup_datetime",
"trip_distance", "pickup", "dropoff"), row.names = c(NA, -6L))

https://stackoverflow.com/questions/37167580