My last column ("details") contains text with spaces or tabs in it. How can I read all of the details as a single column in R?
Input
date month year type expenditure details
10/31/2017 Oct 2017 food 10.89 Point Of Sale Withdrawal 17203473997 TRESIDDER U2 STANFORD CAUS
10/31/2017 Oct 2017 uber 4.80 Point Of Sale Withdrawal 44519795098 UBER TRIP F73MU HELP.UBER.COMCAUS
10/31/2017 Oct 2017 instacart 99.00 Point Of Sale Withdrawal 44534891998 INSTACART SUBSCRIPTION HTTPSINSTACARCAUS
10/31/2017 Oct 2017 food 8.45 Point Of Sale Withdrawal 22607113998 CLARK CENTER-BIO-X CAF STANFORD CAUS
10/31/2017 Oct 2017 food 4.00 Point Of Sale Withdrawal 33426639883 COUPA CAFE Y2E2 STANFORD CAUS
10/31/2017 Oct 2017 uber 4.52 Point Of Sale Withdrawal 42029000200217 UBER TRIP QZ7W5 HELP.UB800-5928996 CAUS
10/30/2017 Oct 2017 food 5.85 Point Of Sale Withdrawal 44519950998 UBER TRIP HOOMP HELP.UBER.COMCAUS
10/30/2017 Oct 2017 uber 23.20 Point Of Sale Withdrawal 33428390882 COUPA CAFE LYTTON PALO ALTO CAUS
10/30/2017 Oct 2017 food 7.52 Point Of Sale Withdrawal 34530890882 PANDA EXPRESS #2035 STANFORD CAUS
Posted on 2017-11-13 04:40:14
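If it is an Excel file, you can use an R package to read the file into R. I took your sample data and opened it with openxlsx, which does not require Java, and it read the details column as desired. The code would be: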
library(openxlsx)
read.xlsx("file", detectDates = TRUE)
Posted on 2017-11-13 08:26:57
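I am no expert on this, but I found an approach that works. It uses readLines to read each line of the file as a single string, then splits off the first five words (the date, month, year, type and expenditure columns) and uses the rest of the line as the final column, details.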
raw_txt <- readLines("123.txt")
# Split each data line (header excluded) on runs of spaces or tabs.
splitted <- lapply(strsplit(raw_txt[-1], "[ \t]+"), function(x) {
  one_to_five <- x[1:5]
  # Everything after the fifth field belongs to "details".
  six <- paste(x[-(1:5)], collapse = " ")
  c(one_to_five, six)
})
# Bind together the elements in "splitted":
df <- do.call(rbind, splitted)
# Give the columns the right names:
colnames(df) <- strsplit(raw_txt[1], "[ \t]+")[[1]]
Result:
     date         month year   type        expenditure details
[1,] "10/31/2017" "Oct" "2017" "food"      "10.89"     "Point Of Sale Withdrawal 17203473997 TRESIDDER U2 STANFORD CAUS"
[2,] "10/31/2017" "Oct" "2017" "uber"      "4.80"      "Point Of Sale Withdrawal 44519795098 UBER TRIP F73MU HELP.UBER.COMCAUS"
[3,] "10/31/2017" "Oct" "2017" "instacart" "99.00"     "Point Of Sale Withdrawal 44534891998 INSTACART SUBSCRIPTION HTTPSINSTACARCAUS"
[4,] "10/31/2017" "Oct" "2017" "food"      "8.45"      "Point Of Sale Withdrawal 22607113998 CLARK CENTER-BIO-X CAF STANFORD CAUS"
[5,] "10/31/2017" "Oct" "2017" "food"      "4.00"      "Point Of Sale Withdrawal 33426639883 COUPA CAFE Y2E2 STANFORD CAUS"
[6,] "10/31/2017" "Oct" "2017" "uber"      "4.52"      "Point Of Sale Withdrawal 42029000200217 UBER TRIP QZ7W5 HELP.UB800-5928996 CAUS"
[7,] "10/30/2017" "Oct" "2017" "food"      "5.85"      "Point Of Sale Withdrawal 44519950998 UBER TRIP HOOMP HELP.UBER.COMCAUS"
[8,] "10/30/2017" "Oct" "2017" "uber"      "23.20"     "Point Of Sale Withdrawal 33428390882 COUPA CAFE LYTTON PALO ALTO CAUS"
[9,] "10/30/2017" "Oct" "2017" "food"      "7.52"      "Point Of Sale Withdrawal 34530890882 PANDA EXPRESS #2035 STANFORD CAUS"
https://stackoverflow.com/questions/47257001
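The split-then-rejoin idea above returns a character matrix, so every column is still text. As a possible variation (not something either answer proposed), the sketch below uses a regular expression to capture the first five fields and the remainder of the line, then converts the result to a data frame with typed columns. The inline `raw_txt` vector stands in for `readLines("123.txt")`, and the `pattern` variable and the choice of column types are assumptions made for illustration.

```r
# Two sample lines copied from the question; in practice this vector
# would come from readLines() on the real file.
raw_txt <- c(
  "date month year type expenditure details",
  "10/31/2017 Oct 2017 food 10.89 Point Of Sale Withdrawal 17203473997 TRESIDDER U2 STANFORD CAUS",
  "10/31/2017 Oct 2017 uber 4.80 Point Of Sale Withdrawal 44519795098 UBER TRIP F73MU HELP.UBER.COMCAUS"
)

# Five whitespace-free fields, then whatever remains as the sixth field.
# (Assumed pattern: it expects the first five fields to contain no blanks.)
pattern <- "^(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s*(.*)$"

# regexec() + regmatches() give, per line, the full match followed by
# the six capture groups; drop the full match and stack the rest.
parts <- regmatches(raw_txt[-1], regexec(pattern, raw_txt[-1], perl = TRUE))
mat <- do.call(rbind, lapply(parts, function(p) p[-1]))

df <- as.data.frame(mat, stringsAsFactors = FALSE)
names(df) <- strsplit(raw_txt[1], "\\s+")[[1]]

# Convert the columns that are not really text.
df$date <- as.Date(df$date, format = "%m/%d/%Y")
df$year <- as.integer(df$year)
df$expenditure <- as.numeric(df$expenditure)

str(df)
```

Lines that do not match the pattern are silently dropped by this sketch, so it may be worth comparing `nrow(df)` with `length(raw_txt) - 1` on real data.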