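I frequently have to deal with csv files that were saved in a German locale and are therefore separated by semicolons rather than commas. That part is easy to solve by defining the delimiter. But unlike fread, vroom does not seem to offer a way to also define the decimal separator. As a result, numeric values that use , as the decimal separator are imported either as character or, wrongly, without any decimal separator at all, so they end up as very large numbers. Is there a straightforward way to define the decimal separator, similar to how it works in fread?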
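library(vroom)
library(data.table)

# Example data with two numeric columns
df <- data.table(row.num = 1:10,
                 V1 = rnorm(10, 10, 5),
                 V2 = rnorm(10, 100, 30))

# Write with ";" as the field separator and "," as the decimal separator
fwrite(df, file = "vroom_test.csv", sep = ";", dec = ",")

# fread lets you set the decimal separator directly
fread(input = "vroom_test.csv", sep = ";", dec = ",")

# vroom with only the delimiter set: V1 and V2 are not read as proper decimals
vroom(file = "vroom_test.csv", delim = ";")

# definition of custom locale does allow that
vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))

Answered 2022-02-02 16:04:37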
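As already mentioned in the comments, the solution is rather straightforward: the only thing needed is to pass a locale() to the vroom call. The settings that locale() accepts are listed in its documentation.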
vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))

https://stackoverflow.com/questions/70957897
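To check that the locale setting really yields numeric columns, a small self-contained sketch along these lines may help. The file contents, the column names `id` and `price`, and the object names `german` and `parsed` are made up for illustration; only `vroom()` and `locale()` come from the answer above.

```r
library(vroom)

# Hypothetical sample data: ";" separates fields, "," marks decimals
path <- tempfile(fileext = ".csv")
writeLines(c("id;price", "1;3,5", "2;10,25"), path)

# Locale describing German-style number formatting
german <- locale(decimal_mark = ",", grouping_mark = ".")

# Read the file with the delimiter and the locale set explicitly
parsed <- vroom(path, delim = ";", locale = german)

# Inspect the type of the decimal column
str(parsed$price)
```

If `price` shows up as a character column instead, the locale was most likely not passed through to the call.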