我试图使用read.table函数read.table("https://www.fhwa.dot.gov/bridge/nbi/2020/AL20.txt", )将这个数据集读入r中,我得到了这个错误。
扫描中的错误(文件=文件,什么=什么,sep =9月,引号=引号,dec = dec,:第1行没有22个元素
我是否使用正确的函数读取这些数据?如果我是,我该怎么解决这个问题?指向试图读取的数据集的链接是https://www.fhwa.dot.gov/bridge/nbi/2020/AL20.txt。
发布于 2021-09-26 01:37:29
该数据不以空格、制表符、逗号或分号分隔.它似乎是固定宽度的格式。为此,您必须知道或推断列的宽度。我已经猜到了基于几行,但您需要验证的是,事情并没有不必要的分裂或保持在一起。
此外,第10行有一个嵌入的#字符,因此必须禁止使用comment.char="#"的许多R解析函数的默认行为,否则将得到Error: line 10 did not have 24 elements。
尝尝这个,
widths <- diff(c(0,9,18,37,62,80,105,117,127,277,295,298,301,305,309,313,331,338,355,359,373,385,389,433,445))
dat <- read.fwf("https://www.fhwa.dot.gov/bridge/nbi/2020/AL20.txt", widths = widths, comment.char = "")
head(dat, n = 20)
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24
# 1 014000000 00000S702 1610400700905300000 PERDIDO CREEK IRR BIA RTE 4007 11.4KM NW OF ATMORE, AL 999900015000 0000000000 00310610940873413480033626209199902000000502019500970300111150000000000A15505000001000009700147000150000000009700979999N0000N0000008876N2445232777N588 000000021924N N N NA NA NA 0.000000e+00 2019 0 0N2 0100002601010 Y80000592039 0 Y62 G70000014550
# 2 014000000 00000S703 1610400300905303004 PERDIDO CREEK IRR BIA RTE 4003 ON LYNN MCGEE ROAD 999900011000 0000000000 00310620200873408900033626209200202000001592019500650000111150000000000A15501000001000009700129000136003003009100989999N0000N0000008877N2848235176N578 000000021924N N N NA NA NA 0.000000e+00 2019 0 0N2 0100002608010 Y80002002039 0 Y62 G70000013328
# 3 014000000 0000M0022 1810000000011319000 UCHEE CREEK 101ST AIRBORNE RD 3.4 KM S OF SUNSHINE RD 999900000000 0032174330-0845837.0183747409194202000003752017401040000001150000000000A15104000003000007300183000430006006007300909999N0000N0000005566N2517237255N586 381000528101924N Y60 N NA 1017 NA 1.000001e+12 2019 0N2 0000001100350 Y50004002039 NA Y74 F50000038700
# 4 014000000 883039900 1410008800205900000 LITTLE BEAR CR. DAM SPW CO. RD 88 LITTLE BEAR CR DAM 999900000000 0034270600087581200016367670819740200000430201750116099011115N000000000A19505000005000008600140000655000000008701059999N0000N0000007777N1442126665N596 061824N N N NA NA NA NA NA 0N2 0000001000050 Y50006002037 NA Y67 G70000068775
# 5 014000001 014002450 1310010100207900000 TENNESSEE RIVER SR 101 ON WHEELER DAM 45200394290 00344818000872254000423676706193702000055202017400790000100041017900335A55310302002014306000533019812009002006100780452N0000N0000005659N1400123952N588 311019812061924Y24 N N 619 NA NA 5.000000e+16 NA 0N2 00196210001001 Y50090002037 NA Y67 F50001545336
# 6 014000001 331700710 1620000000203300000 TENNESSEE RIVER Res Rd ON WILSON DAM 999900000000 00344800000873730000063676717192402020036202018300790000111111017300335A56403120008007806000469015124017005006100839999H1219H6000005558N15841350529593 311015124061824Y24 N N 618 NA NA 3.500000e+16 NA 0N2 00195815000801 Y50025002038 NA Y67 F50001255292
# 7 014000002 1410000001012700000 BLACKWATER CREEK COUNTRY CLUB ROAD 1 MILE EAST HWY 195 45700000100 000CO00000 00335430300871717700023020209189801000004001991100520000000020000000000P15310302001000403700305000518000000003700370457N0000N0000006457N1027101822N072 311000621101924Y24 N Y03 1019 NA 120 5.110001e+14 2020 0 _ 0N3 0000006600010 Y50005602032 0 N01 P40000019166
# 8 014000004 1410000001012700000 LOST CREEK KINGS MILL ROAD 1 MILE N OAKMAN HWY 40200000100 000CO00000 00334409500872134600133020209190001000001501991200490000000050000000000K15310302001000503700256000482000000003700370402N0000N0000004446N5000500000N082 311000583011324Y24 N Y03 213 NA 113 4.800000e+14 2020 0 _ 0N3 0000006800010 YU0002102033 0 N01 P40000017834
# 9 014000004 504800350 1680000000109500000 LITTLE PAINT ROCK CREEK LOCK ACCESS ROAD 0.3 MI SOUTH US 431 999900056330 00342818000862230001993676709193902000003502018400610000100050000000000A15104000001000007300183000192003003007000829999N0000N0000007776N1477128874N568 121824N N N NA NA NA NA NA 0N2 0000001600050 Y50003752038 NA Y67 G70000015744
# 10 014000005 1500000001007548808 STREAM RAILROAD STREET 0.5 MI E SR#17 999900004990 000MU00040 00333357210880425781583040409190001000000152007100300000000050000000000P15410000002000004400079000155000000004400489999N0000N9990005445N1027101826N045 311000223061924Y24 N Y12 619 NA 619 2.300000e+14 2020 0 _ 0N3 0000008700000 Y40000592032 0 N01 P40000007440
# 11 014000007 1410001000104900000 COAL CREEK CO RT 10 1.25 MI N ARONEY T9SR5E 999900000100 000CO00100 00341403150860413920053020208199301000001802017500580000111150000000000A15122000003000007500073000219000000007500759999N0000N0000007675N1637138265N576 000300031924N N N NA NA NA 4.930000e+14 2020 0 _ 0N3 0000002000050 Y50002682037 0 N01 F60000016425
# 12 014000008 1410002700500700000 VIADUCT OVER SO.RR COUNTY 27 OLD AL 5 AT WOODSTK 999900000100 000CO00270 0033122146087091361010302020919270200000513198920085000000002N000000000P12303104001000306100332000652000000006100709999R0914R183061777NN213620952390N7 311000756101924Y12 N Y12 1019 NA 1019 1.192000e+15 2020 0 _ 0N2 0000001000200 YN0007652030 0 N01 G70000045640
# 13 01400000B 253040000 1410003700205900000 BEAR CREEK DAM SPILLWAY CO. RD 37 BEAR CREEK DAM 999900000000 0034235400087590600021367670819690200000430201750116000010115N000000000A19505000008000008500143001170000000008500969999N0000N0000007677N1489129265N596 061824N N N NA NA NA NA NA 0N2 0000002600050 Y50006002037 NA Y67 F60000112320
# 14 01400000B LN0BR0010 1880000000107100000 TOWN CREEK BELLEFONTE ROAD Bellefonte 999900000000 003443470008554570001136767091974020000025020184012800000N0N50000000000A15119000002000006400030000064000000000000009999N0000N000000NNN76337332886NN577 031924N Y60 N NA 716 NA 0.000000e+00 NA 0N2 000000N6NN010 Y50005002038 NA Y67 F60000008192
# 15 01400000T 721700150 1680000000203300000 CANE CREEK Access Road Colbert Steam Plant 999900000000 003444120008750540019936767091953020000010020184006703001NN050000000000A15104000005000007300162000808006006007300919999N0000N0000006666N1464127966N598 061824N Y60 N NA 819 NA NA NA 0N2 0000001000100 Y50001002038 NA Y67 F60000073528
# 16 014000012 1410009100107109328 SOUTHERNLNRAILROAD JC91 1001E US 72 999900000100 000CO00910 0034563691086433691002304040919000200000495200710073000000002N000000000A12101101003000206100101000363000000006100699999R0678R320000566NN043503276345N6 351000455011924N N Y12 NA NA 120 6.860001e+14 2020 0 _ 0N2 0000001000150 YN0005002039 0 N01 F50000025047
# 17 014000013 771700120 1600000000203300000 BUS BAR TUNNEL- REMOVED TVA RES. ROAD NEAR NAT. FERT. DEV. CTR. 999900000000 0034465400087390000005367671919580200003620201860073000011115N000000000A10302000001000007900177000189006006007900989999N0000N000000677NN1551133073N5N7 331000189061824N N N NA NA NA 1.800000e+14 NA 0N2 0000001600070 YN0045002038 NA Y67 F60000018522
# 18 014000019 1410043800108900000 RODGERS BRANCH JACKS RD 1 MI SE NEW MARKET 999900000100 000CO04380 00345334170862519520053020209190002000002501992200790450000050000000000A15319000002000007600073000146000000000000769999N0000N000000NNN55135712885NN555 311000214081824N N N NA NA NA 4.070000e+14 2020 0 _ 0N2 000000NNNN010 Y80002902035 0 N01 F50000011096
# 19 014000022 1410004500104900000 LAUREL CREEK CO RD 313 1.3MI S GILBERT XRDT8SR7E 999900000100 000CO00450 00342101790855522060053020209190201000000872017000400000000050000000000K15302000001000099900085000085000000003400379999N0000N0000006655N1000100003N048 311000152111724N N Y12 NA NA 1117 1.250000e+14 2020 0 _ 0N3 0000008700000 Y50001292037 0 N01 F50000003145
# 20 014000024 1510000000311707000 LITTLE CAHABA RIVER CAHABA BEACH ROAD 1.8 MILES NORTH OF US 280 47500005000 000MU01630 00332623190864157360063040419190401000001001996000490000000030000000000K15310000001000003500335000335000000003500359999N0000N0000005254N2000200003N066 311000423101924Y24 N Y03 990 NA 996 3.190000e+14 2020 0 _ 0N3 0000008700010 Y60001382019 0 N01 P20000011725要将其分解为必需的列,很容易就会有更多的工作要做。例如,V9看起来很长,并且有某种模式,因此可以很容易地将几个列连在一起。
另外,我还通过数字符为上面的widths创建了数字。我认为有一些工具可以很好地提示如何开始使用FWF文件(在FWF文件中,有些列没有将其与下一列分隔开来的空间),但它从来都不是确定性的。唯一能确定的方法就是知道数据。
https://stackoverflow.com/questions/69304493
复制相似问题