文章/答案/技术大牛

发布

社区首页 >问答首页 >在R中有一种连接csv文件的折线的方法吗？

问在R中有一种连接csv文件的折线的方法吗？
EN

Stack Overflow用户

提问于 2016-07-27 05:01:21

回答 2查看 324关注 0票数 1

我有一个程序可以导出csv文件，但不引用新行，也不使用/n而不是/n/r。它在记录中间使用与在末尾相同的行尾。但是，程序在变量之间使用逗号分隔符。我如何告诉R删除所有的eol标记，直到达到数据中的变量数？

我的数据应该是这样的：

name, rank, serial number, age, height, weight

mike, noob, 123456, 22, 6, 34.4

bob, officer, 345

323, 24, 6, 2

3.5

ted, officer, 34234, 2

5, 6, 35.2

我如何基本上删除在第2行的5之后、第3行的2之后和第6行的2之后的CR？每一行应该有5个逗号和6个变量。我的数据在每一行之间没有额外的行。我就是不能让它停止把它放在一条线上而不去做。我的数据是43个变量，并不断生成新的行。大部分时间，它是在阅读，有几千行。其中约20%有CR问题。

还想添加一个新行总是在一个新行上开始，如果这是有意义的话，它将不会跟随在前面的行上。

数据框架应该如下所示：

name, rank, serial number, age, height, weight

mike, noob, 123456, 22, 6, 34.4

bob, officer, 345323, 24, 6, 23.5

ted, officer, 34234, 25, 6, 35.2

这就是我的数据，如果这有帮助的话。第一行是标题，后面应该是6条记录，但是read.csv和fread以及我尝试过的其他所有内容都给了我10条记录。第六张唱片有额外的CR，但仍然有42个变量。只是分成五条线。

EFPCName,EFUseAPPE,log pdl,pdl error,device pretty name,num pages,num sheets,copies printed,total pages printed,total sheets printed,total color pages printed,total bw pages printed,total tab pages printed,total sample pages printed,num copies,print status,instructions,notes1,notes2,username,noneutf8lastuser,non utf8 submitted by,title,size,logical printer,fiery,time,date,total rip duration,timestamp spooling,timestamp done spooling,timestamp waiting to rip,timestamp ripping,timestamp done ripping,timestamp waiting to print,timestamp printing,timestamp done printing,media weight,input slot,media size,media type,interpreter,

LZX  Laser 24 - 11 x 17 Tabloid,,postscript,,Canon,2,1,1,2,1,1,1,0,0,1,OK,,,,TeamMember,,TeamMember,78053.01.pdf,4004491,Canon hold,SERVER-Shredder,2013 06 07 19 37 13,2013 06 07 19 37 00,3,2013 06 07 19 37 23,2013 06 07 19 37 24,2013 06 07 19 38 02 118342,2013 06 07 19 38 02 118342,2013 06 07 19 38 09,2013 06 07 19 38 09,2013 06 07 19 38 38,2013 06 07 19 39 19 124419,,Tray5,Tabloid,Plain,PS,

LZX  Laser 24 - 11 x 17 Tabloid,,postscript,,Canon,2,1,1,2,1,1,1,0,0,1,OK,,,,TeamMember,,TeamMember,78053.01.pdf,4004520,none,SERVER-Shredder,2013 06 07 19 37 13,2013 06 07 19 37 00,,2013 06 07 19 44 07 926090,2013 06 07 19 44 07 926744,2013 06 07 19 44 07 926090,2013 06 07 19 44 07 926090,2013 06 07 19 44 07 926744,2013 06 07 19 44 07,2013 06 07 19 44 11,2013 06 07 19 44 53 141084,,Tray5,Tabloid,Plain,PS,

LZX  Laser 24 - 11 x 17 Tabloid,,postscript,,Canon,2,1,1,2,1,1,1,0,0,1,OK,,,,TeamMember,,TeamMember,78053.01.pdf,4004520,none,SERVER-Shredder,2013 06 07 19 37 13,2013 06 07 19 37 00,,2013 06 07 19 46 01 550964,2013 06 07 19 46 01 551451,2013 06 07 19 46 01 550964,2013 06 07 19 46 01 550964,2013 06 07 19 46 01 551451,2013 06 07 19 46 01,2013 06 07 19 46 05,2013 06 07 19 46 46 911557,,Tray5,Tabloid,Plain,PS,

LZX80  Color Copy Cover - 11 x 17 Tabloid,,postscript,,Canon,1,2,2,2,2,2,0,0,0,2,OK,,,,TeamMember,,TeamMember,78011.01.pdf,874486,Canon hold,SERVER-Shredder,2013 06 07 19 47 07,2013 06 07 19 47 00,3,2013 06 07 19 47 17,2013 06 07 19 47 17 507576,2013 06 07 19 47 47 960542,2013 06 07 19 47 47 960542,2013 06 07 19 47 51,2013 06 07 19 47 51,2013 06 07 19 47 54,2013 06 07 19 48 25 77595,,Tray3,Tabloid,Heavy5,PS,

LZX  Laser 24 - 11 x 17 Tabloid,,postscript,,Canon,2,1,1,2,1,1,1,0,0,1,OK,,,,TeamMember,,TeamMember,78053.01.pdf,4004520,none,SERVER-Shredder,2013 06 07 19 37 13,2013 06 07 19 37 00,,2013 06 07 19 48 04 501212,2013 06 07 19 48 04 502522,2013 06 07 19 48 04 501212,2013 06 07 19 48 04 501212,2013 06 07 19 48 04 502522,2013 06 07 19 48 04,2013 06 07 19 48 07,2013 06 07 19 48 48 188474,,Tray5,Tabloid,Plain,PS,

EX32  Laser 32 - 11 x 17 Tabloid,,pdf,,Canon,63,64,1,63,64,4,59,0,0,1,OK,Size: 11 x 17
Finishing: Coil Binding  Cutting  Punching
Pages: 
1-63  4/0  EX32  Laser 32 - 11 x 17  11 x 17
 ,Color 77992:01Employee Handbook REVISED_2up(NFC).pdf, McAllen TX,EFI Pace,,,Color 77992:01Employee Handbook REVISED_2up(NFC).pdf,518880,none,SERVER-Shredder,2013 06 07 20 01 52,2013 06 07 20 01 00,3,2013 06 07 20 02 41 495216,2013 06 07 20 02 44 780196,2013 06 07 20 02 41 871208,2013 06 07 20 02 41 871208,2013 06 07 20 02 45,2013 06 07 20 02 45,2013 06 07 20 03 25,2013 06 07 20 05 45 741386,,Tray4,Tabloid,Heavy1,PS,

csv

malformed

回答 2

Stack Overflow用户

回答已采纳

发布于 2016-07-27 07:35:59

这就是我现在拥有的。看看你的数据是如何工作的。

dat <- readLines("temp.txt") # read whatever is in there, one line at a time
varnames <- unlist(strsplit(dat[1], ",")) # extract variable names
nvar <- length(varnames)

k <- 1 # setting up a counter
dat1 <- matrix(NA, ncol = nvar, dimnames = list(NULL, varnames))

while(k <= length(dat)){
    k <- k + 1
    if(dat[k] == "") {k <- k + 1
        print(paste("data line", k, "is an empty string"))
        if(k > length(dat)) {break}
    }
    temp <- dat[k]
    # checks if there are enough commas or if the line was broken
    while(length(gregexpr(",", temp)[[1]]) < nvar-1){
        k <- k + 1
        temp <- paste0(temp, dat[k])
    }
    temp <- unlist(strsplit(temp, ","))
    message(k)
    dat1 <- rbind(dat1, temp)
}

dat1 = dat1[-1,] # delete the empty initial row

总的想法是继续折叠文本，直到字符串中有足够的逗号为止。一旦实现了这一点，数据就会在逗号处拆分，并作为一行添加到矩阵中。代码非常笨重，对于大数据文件来说会很慢。不过，这是我能做的最好的了。

对于原始数据示例，代码工作并创建一个包含42列6行的字符矩阵。对于较小的示例，代码无法处理最后一列中的中断。

票数 1

Stack Overflow用户

发布于 2016-07-27 05:11:08

如果要在具有不等长度行时隐式添加空白字段，请在read.table调用中设置fill = TRUE。

如果这不是你要问的问题，你能更清楚地提供一个可复制的例子吗？

票数 1

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/38604079

复制

相似问题

问在R中有一种连接csv文件的折线的方法吗？
EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问在R中有一种连接csv文件的折线的方法吗？EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问在R中有一种连接csv文件的折线的方法吗？
EN