首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >取消带有“`dbplyr`”和“`RPostgres`”的json列

取消带有“`dbplyr`”和“`RPostgres`”的json列
EN

Stack Overflow用户
提问于 2019-03-15 09:43:41
回答 1查看 499关注 0票数 2

在PostgreSQL DB上的表中有一个包含json字符串的列。

我想在服务器端取消它,并且我能够找到SQL代码来完成这个任务。我面临的挑战是,我希望能够将此操作插入到dbplyr管道链中,即更新tbl_lazy对象而不运行查询,而且我还不了解dbplyr的内部机制。

见下面的例子:

设置

代码语言:javascript
复制
library("RPostgres")
library("dplyr")

drv <- RPostgres::dbDriver("Postgres")
#### NOT REPRODUCIBLE!!! ####
con <- RPostgres::dbConnect(drv, dbname = mydbname, sslmode = 'require',
                            host = myhost, port = 5432,
                            user = user, password = mypassword)
#############################

my_tbl <- tribble(~a, ~bcd,
        1, '{"b": "foo1", "c": "bar1", "d": "baz1"}',
        2, '{"b": "foo2", "c": "bar2", "d": "baz2"}')

copy_to(con, my_tbl, "my_tbl",
        temporary = TRUE)

部分解决方案(无延迟计算)

代码语言:javascript
复制
unnest_json <-function(data, json_col, ...){
  # build character vector whose names are cols to be created and values columns
  # to be extracted 
  dots <- sapply(as.list(substitute(list(...)))[-1], as.character)
  json_col <- as.character(substitute(json_col))
  # json extraction string
  query0  <- sprintf("%s::json->'%s' as %s",json_col, dots, names(dots))
  # complete query
  query <- sprintf("SELECT *, %s FROM (%s) AS PREV", 
                   paste(query0, collapse = ", "), 
                   dbplyr::sql_render(data))
  # fetch (when I'd rather update the tbl_lazy object instead)
  dbGetQuery(data$src$con, query)
}

con %>%
  tbl("my_tbl") %>%
  unnest_json(bcd, unnested_b = "b", unnested_c = "c")
#   a                                     bcd unnested_b unnested_c
# 1 1 {"b": "foo1", "c": "bar1", "d": "baz1"}     "foo1"     "bar1"
# 2 2 {"b": "foo2", "c": "bar2", "d": "baz2"}     "foo2"     "bar2"

所需的特性

我希望能够这样做,例如:

代码语言:javascript
复制
con %>%
  tbl("my_tbl") %>%
  unnest_json(bcd, unnested_b = "b", unnested_c = "c") %>% # not evaluated at this point 
  select(-bcd) %>%
  head(1) %>%
  collect() 

#   a unnested_b unnested_c
# 1 1     "foo1"     "bar1"
EN

回答 1

Stack Overflow用户

发布于 2019-03-15 10:26:00

诀窍是将函数sql用于对tbl的调用。

应该向sql提供包含查询的字符串。

因此,功能变成:

代码语言:javascript
复制
unnest_json <-function(.data,.json_col, ...){
  # build character vector whose names are cols to be created and values columns
  # to be extracted 
  dots <- sapply(as.list(substitute(list(...)))[-1], as.character)
  .json_col <- as.character(substitute(.json_col))
  query0  <- sprintf("%s::json ->>'%s' as %s", .json_col, dots, names(dots))
  query <- sprintf("SELECT *, %s FROM (%s) AS PREV", 
                   paste(query0, collapse = ", "), 
                   dbplyr::sql_render(.data))
  tbl(.data$src$con, sql(query))
}

我还在查询中将->更改为->>,以获得正确的输出。

unnest_json的输出:

代码语言:javascript
复制
con %>%
  tbl("my_tbl") %>%
  unnest_json(bcd, unnested_b = "b", unnested_c = "c")
# # Source:   SQL [?? x 4]
# # Database: postgres [standtasic@adbsg@adbsg.postgres.database.azure.com:5432/standtasicdb]
#       a bcd                                                   unnested_b unnested_c
#   <dbl> <chr>                                                 <chr>      <chr>     
# 1     1 "{\"b\": \"foo1\", \"c\": \"bar1\", \"d\": \"baz1\"}" foo1       bar1      
# 2     2 "{\"b\": \"foo1\", \"c\": \"bar1\", \"d\": \"baz1\"}" foo1       bar1 

用于dbplyr链并收集:

代码语言:javascript
复制
con %>%
  tbl("my_tbl") %>%
  unnest_json(bcd, unnested_b = "b", unnested_c = "c") %>%
  select(-bcd) %>%
  head(1) %>%
  collect()
# # A tibble: 1 x 3
#         a unnested_b unnested_c
#     <dbl> <chr>      <chr>
#   1     1 foo1       bar1 
票数 3
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/55179668

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档