Question: read.fwf() with column names and two kinds of sep in R

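Stack Overflow user

Asked 2014-04-30 07:44:20

Answers: 1 | Views: 1.5K | Followers: 0 | Votes: 1

I have a problem similar to this issue.

My data looks like this (the first line, the ruler, is not in the txt file):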

----*----1----*----2----*---
Region                 Value
New York, NY        66,834.6
Kings, NY           34,722.9
Bronx, NY           31,729.8
Queens, NY          20,453.0
San Francisco, CA   16,526.2
Hudson, NJ          12,956.9
Suffolk, MA         11,691.6
Philadelphia, PA    11,241.1
Washington, DC       9,378.0
Alexandria IC, VA    8,552.2

My attempt was:

#fwf data2
path <- "fwfdata2.txt"
data6 <- read.fwf(path, 
            widths=c(17, -3, 8), 
            header=TRUE,
            #sep=""
            as.is=FALSE)
data6

which gives:

> data6
                  Region.................Value
New York, NY                          66,834.6
Kings, NY                             34,722.9
Bronx, NY                             31,729.8
Queens, NY                            20,453.0
San Francisco, CA                     16,526.2
Hudson, NJ                            12,956.9
Suffolk, MA                           11,691.6
Philadelphia, PA                      11,241.1
Washington, DC                         9,378.0
Alexandria IC, VA                      8,552.2
> dim(data6)
[1] 10  1

So the problem is that my data is separated by both "," and "". When I add sep="", it produces the following error:

Error in read.table(file = FILE, header = header, sep = sep, row.names = row.names,  : 
  more columns than column names

1 Answer

Stack Overflow user

Accepted answer

Posted 2014-04-30 07:56:30

I think your problem is that read.fwf expects the header to be sep-separated while the data is fixed width:

header: a logical value indicating whether the file contains the
        names of the variables as its first line.  If present, the
        names must be delimited by ‘sep’.

   sep: character; the separator used internally; should be a
        character that does not occur in the file (except in the
        header).

I would read the data while skipping the header, then read just the first line on its own:

> data = read.fwf(path,widths=c(17,-3,8),head=FALSE,skip=1,as.is=TRUE)
> heads = read.fwf(path,widths=c(17,-3,8),head=FALSE,n=1,as.is=TRUE)
> names(data)=heads[1,]
> data
   Region               Value
1  New York, NY      66,834.6
2  Kings, NY         34,722.9
3  Bronx, NY         31,729.8
4  Queens, NY        20,453.0
5  San Francisco, CA 16,526.2
6  Hudson, NJ        12,956.9
7  Suffolk, MA       11,691.6
8  Philadelphia, PA  11,241.1
9  Washington, DC     9,378.0
10 Alexandria IC, VA  8,552.2

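If you want Region as a factor, use as.is=FALSE when reading the data (as in your example), but you must use as.is=TRUE when reading the header, otherwise it gets converted to numbers.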
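If the column names are known in advance, the second read can be avoided by supplying them directly. The following is only a minimal sketch under that assumption, not part of the original answer: it rebuilds a three-row sample of the file in a temporary location (the sprintf layout is an illustrative stand-in for the 17 + 3 + 8 character layout in the question), skips the header line, and passes the hypothetical names Region and Value through the col.names argument.

```r
# Rebuild a small sample of the fixed-width file from the question.
# Layout assumed here: 17 chars of region, 3 skipped chars, 8 chars of value.
region <- c("New York, NY", "Washington, DC", "Alexandria IC, VA")
value  <- c("66,834.6", "9,378.0", "8,552.2")
rows   <- sprintf("%-17s   %8s", region, value)
header <- sprintf("%-17s   %8s", "Region", "Value")

path <- tempfile(fileext = ".txt")
writeLines(c(header, rows), path)

# Skip the header line and name the columns explicitly instead of
# asking read.fwf to parse a header that is not sep-delimited.
data7 <- read.fwf(path,
                  widths    = c(17, -3, 8),
                  skip      = 1,
                  col.names = c("Region", "Value"),
                  as.is     = TRUE)

str(data7)
```

With this approach the names contain no padding, whereas names taken from the header line keep whatever spaces fall inside the field widths.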

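Did you also want to split the region into its comma-separated parts and convert the comma-formatted numbers to numeric values? You didn't say.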
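In case that is what is wanted, here is a rough sketch of one way to do it in base R. It assumes the data has already been read as character columns (as with as.is=TRUE), that the part after the last comma in Region is the state code, and that commas in Value are thousands separators. The column names County and State are made up for illustration.

```r
# Stand-in for the data frame produced by read.fwf with as.is = TRUE;
# fixed-width fields typically carry padding spaces.
data <- data.frame(
  Region = c("New York, NY     ", "Washington, DC   ", "Alexandria IC, VA"),
  Value  = c("66,834.6", " 9,378.0", " 8,552.2"),
  stringsAsFactors = FALSE
)

# Remove the padding left over from the fixed-width fields.
region <- trimws(data$Region)

# Split on the last comma: everything before it is the place name,
# everything after it is the state code.
data$County <- sub(",\\s*[^,]*$", "", region)
data$State  <- sub("^.*,\\s*", "", region)

# Drop the thousands separators, then convert to numeric.
data$Value <- as.numeric(gsub(",", "", trimws(data$Value)))

data[, c("County", "State", "Value")]
```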

Votes: 2

Original content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/23382405
