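I am reading large IDs from an ndjson file (~1 GB). The IDs are about 19 digits long and lose precision when they are streamed in: the last 4-5 digits come out different. How can I avoid this? Thanks!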
library(jsonlite)
data_out <- data.frame(userID = c(1123581321345589000, 3141592653589793000, 2718281828459045000),
                       variable = c("a", "b", "c"))
con_out <- file("test_output.json", open = "wb")
jsonlite::stream_out(data_out, con_out, auto_unbox = TRUE)
close(con_out)
con_in <- file("test_output.json")
data_in <- jsonlite::stream_in(con_in)
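> format(data_in$userID, scientific = FALSE)
[1] "1123581321345590016" "3141592653589790208" "2718281828459039744"

Edit: I have no control over the input file or its format. If I open the input file in an editor, the IDs are correct. The "error" happens when the file is streamed in.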
Answered on 2022-06-01 08:53:57
You can convert userID to character, so that it is written and read back as a string rather than a number:
library(jsonlite)
data_out <- data.frame(userID = c(1123581321345589000, 3141592653589793000, 2718281828459045000),
                       variable = c("a", "b", "c"))
# Convert to character
data_out$userID <- as.character(data_out$userID)
con_out <- file("test_output.json", open = "wb")
jsonlite::stream_out(data_out, con_out, auto_unbox = TRUE)
#> Complete! Processed total of 3 rows.
close(con_out)
con_in <- file("test_output.json")
data_in <- jsonlite::stream_in(con_in)
#> opening file input connection.
#> Found 3 records... Imported 3 records. Simplifying...
#> closing file input connection.
identical(data_in, data_out)
#> [1] TRUE

https://stackoverflow.com/questions/72458905
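When the input file cannot be changed, as in the question's edit, one possible workaround is to quote the ID in the raw text before it reaches the JSON parser. R stores JSON numbers as doubles, which represent integers exactly only up to 2^53 (about 9.0e15), so a 19-digit ID has to arrive as a string to survive. The following is a minimal sketch, not a tested solution for the asker's file: the sample records, the 19-digit values, and the assumption that `userID` is always a bare integer on a single line are all hypothetical.

```r
library(jsonlite)

# Hypothetical sample file: exact 19-digit IDs stored as bare JSON numbers
path <- tempfile(fileext = ".json")
writeLines(c(
  '{"userID":1123581321345589144,"variable":"a"}',
  '{"userID":3141592653589793238,"variable":"b"}'
), path)

# Read the raw lines and wrap the userID digits in quotes, so the parser
# sees a JSON string instead of a number. The lookahead only accepts a
# plain integer that is followed by "," or "}".
raw_lines <- readLines(path, warn = FALSE)
quoted <- gsub('("userID"\\s*:\\s*)(\\d+)(?=\\s*[,}])', '\\1"\\2"',
               raw_lines, perl = TRUE)

# Stream the patched lines in from an in-memory text connection
con <- textConnection(quoted)
data_in <- jsonlite::stream_in(con, verbose = FALSE)
close(con)

data_in$userID  # character column holding the original digits
```

For a file of around 1 GB, the same substitution could be applied chunk by chunk (for example with `readLines(con, n = ...)` in a loop) rather than loading every line at once. If arithmetic on the IDs is ever needed, a 64-bit integer package would be an option, but for identifiers a character column is usually sufficient.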