I am using Duke for record linkage, and in a basic test I get this exception from CSVReader: java.lang.ArrayIndexOutOfBoundsException: 1000.
This is my Java class:
Configuration config = ConfigLoader.load("resources/dukeConfiguration.xml");
Processor proc = new Processor(config);
proc.addMatchListener(new PrintMatchListener(true, true, true, false,
config.getProperties(),
true));
proc.link();
proc.close();

This is the configuration file:
<duke>
<schema>
<threshold>0.7</threshold>
<property type="id">
<name>ID</name>
</property>
<property>
<name>TITLE</name>
<comparator>no.priv.garshol.duke.comparators.Levenshtein</comparator>
<low>0.09</low>
<high>0.93</high>
</property>
<property>
<name>ARTIST</name>
<comparator>no.priv.garshol.duke.comparators.Levenshtein</comparator>
<low>0.04</low>
<high>0.73</high>
</property>
</schema>
<group>
<jdbc>
<param name="driver-class" value="com.mysql.jdbc.Driver" />
<param name="connection-string" value="jdbc:mysql://localhost:3306/digitalmusic" />
<param name="user-name" value="root" />
<param name="password" value="root" />
<param name="query" value="select * from inventory" />
<column name="idsong" property="ID" />
<column name="title" property="TITLE" />
<column name="artist" property="ARTIST" />
</jdbc>
</group>
<group>
<csv>
<param name="input-file" value="/home/mongo.csv" />
<param name="header-line" value="false" />
<column name="1" property="ID" />
<column name="2" property="TITLE" />
<column name="3" property="ARTIST" />
</csv>
</group>
</duke>

Does anyone know where the problem is?
Stack trace:
Records: 0
Records: 40000
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1000
at no.priv.garshol.duke.utils.CSVReader.next(CSVReader.java:70)
at no.priv.garshol.duke.datasources.CSVDataSource$CSVRecordIterator.findNextRecord(CSVDataSource.java:170)
at no.priv.garshol.duke.datasources.CSVDataSource$CSVRecordIterator.next(CSVDataSource.java:198)
at no.priv.garshol.duke.datasources.CSVDataSource$CSVRecordIterator.next(CSVDataSource.java:111)
at no.priv.garshol.duke.Processor.linkRecords(Processor.java:362)
at no.priv.garshol.duke.Processor.link(Processor.java:319)
at no.priv.garshol.duke.Processor.link(Processor.java:298)
at no.priv.garshol.duke.Processor.link(Processor.java:285)
at duke.DukeCollecting.main(DukeCollecting.java:20)

Posted on 2016-01-08 17:01:23
OK, here is your problem.
According to the latest source on GitHub, this is what happens when you instantiate a new CSVReader:
public CSVReader(Reader in, int buflen, String file) throws IOException {
this.buf = new char[buflen];
this.pos = 0;
this.len = in.read(buf, 0, buf.length);
this.tmp = new String[1000];
this.in = in;
this.separator = ','; // default
this.file = file;
}
According to your stack trace, the error occurs in this block:
if (escaped_quote)
tmp[colno++] = unescape(new String(buf, prev + 1, pos - prev - 1));
else
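tmp[colno++] = new String(buf, prev + 1, pos - prev - 1);

The problem is that colno in CSVReader grows past the capacity of the tmp array, which was allocated with 1000 elements, so writing to index 1000 throws a java.lang.ArrayIndexOutOfBoundsException. In other words, the reader found more than 1000 columns in a single row.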
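To find out which rows of the CSV file trigger this, a small standalone check can help. The following is a minimal sketch, not part of Duke: the class name WideRowFinder and its methods are hypothetical, the limit of 1000 is taken from the constructor quoted above, and the scan is line-based with simple double-quote handling, so it only approximates what CSVReader does.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Duke.
public class WideRowFinder {
    // Assumed limit, taken from the `new String[1000]` in the quoted constructor.
    public static final int MAX_COLUMNS = 1000;

    // Counts the fields on one line, ignoring separators inside double quotes.
    public static int countColumns(String line, char separator) {
        int columns = 1;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (c == separator && !inQuotes) {
                columns++;
            }
        }
        return columns;
    }

    // Returns the 1-based numbers of all lines with more than `limit` columns.
    public static List<Integer> findWideRows(Iterable<String> lines, char separator, int limit) {
        List<Integer> wide = new ArrayList<>();
        int lineNo = 0;
        for (String line : lines) {
            lineNo++;
            if (countColumns(line, separator) > limit) {
                wide.add(lineNo);
            }
        }
        return wide;
    }

    public static void main(String[] args) throws IOException {
        // Default path is the input-file value from the question's configuration.
        Path csv = Paths.get(args.length > 0 ? args[0] : "/home/mongo.csv");
        // ISO-8859-1 decodes any byte sequence, so odd encodings do not abort the scan.
        List<String> lines = Files.readAllLines(csv, StandardCharsets.ISO_8859_1);
        for (int lineNo : findWideRows(lines, ',', MAX_COLUMNS)) {
            System.out.println("Line " + lineNo + " has more than " + MAX_COLUMNS + " columns");
        }
    }
}
```

If this reports nothing, the oversized row is probably not visible line by line (for example, a quoted field spanning several lines), and the file needs a closer look around the record where Duke stopped.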
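These are your options, IMHO:
1. Increase the tmp buffer until the program runs without errors, and recompile; or
2. [...] a malformed-format error message instead of the array overflow.
Unless you are in a hurry, I'd suggest going with option 2.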
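Rather than guessing a larger fixed size for tmp, the buffer could also grow on demand. The sketch below only illustrates that idea: GrowableFieldBuffer is a hypothetical class, not Duke's code, and it borrows the names tmp and colno from the snippet above purely for readability.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of Duke.
public class GrowableFieldBuffer {
    private String[] tmp;
    private int colno = 0;

    public GrowableFieldBuffer(int initialCapacity) {
        // At least one slot, so that doubling always increases the capacity.
        this.tmp = new String[Math.max(1, initialCapacity)];
    }

    // Appends one field, doubling the array when it is full instead of overflowing.
    public void add(String field) {
        if (colno == tmp.length) {
            tmp = Arrays.copyOf(tmp, tmp.length * 2);
        }
        tmp[colno++] = field;
    }

    // Returns the fields collected so far and resets the buffer for the next row.
    public String[] finishRow() {
        String[] row = Arrays.copyOf(tmp, colno);
        colno = 0;
        return row;
    }
}
```

With this shape, a row wider than the initial capacity costs a few extra array copies instead of an exception. It does not tell you whether a row with more than 1000 columns is valid data or a sign of a malformed file, so checking the input is still worthwhile.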
Good luck!
https://stackoverflow.com/questions/34667175