I have a relatively inefficient CSV reader, shown below. Reading 30,000+ rows takes more than 30 seconds. How can I speed up the reading process?
public class DataReader {
    private String csvFile;
    private List<String> sub = new ArrayList<String>();
    private List<List> master = new ArrayList<List>();

    public void ReadFromCSV(String csvFile) {
        String line = "";
        String cvsSplitBy = ",";
        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
            System.out.println("Header " + br.readLine());
            while ((line = br.readLine()) != null) {
                // use comma as separator
                String[] list = line.split(cvsSplitBy);
                // System.out.println("the size is " + country[1]);
                for (int i = 0; i < list.length; i++) {
                    sub.add(list[i]);
                }
                List<String> temp = (List<String>) ((ArrayList<String>) sub).clone();
                // master.add(new ArrayList<String>(sub));
                master.add(temp);
                sub.removeAll(sub);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(master);
    }

    public List<List> getMaster() {
        return master;
    }
}

Update: I found that my code actually finishes reading in less than 1 second when run on its own. This DataReader is part of what my simulation model uses to initialize the relevant properties. The part below, which uses the imported data, is what takes 40 seconds to complete! Could anyone help take a look at this part of the code?
// add route network
Network<Object> net = (Network<Object>) context.getProjection("IntraCity Network");
IndexedIterable<Object> local_hubs = context.getObjects(LocalHub.class);
for (int i = 0; i <= CSV_reader_route.getMaster().size() - 1; i++) {
    String source = (String) CSV_reader_route.getMaster().get(i).get(0);
    String target = (String) CSV_reader_route.getMaster().get(i).get(3);
    double dist = Double.parseDouble((String) CSV_reader_route.getMaster().get(i).get(6));
    double time = Double.parseDouble((String) CSV_reader_route.getMaster().get(i).get(7));
    Object source_hub = null;
    Object target_hub = null;
    Query<Object> source_query = new PropertyEquals<Object>(context, "hub_code", source);
    for (Object o : source_query.query()) {
        if (o instanceof LocalHub) {
            source_hub = (LocalHub) o;
        }
        if (o instanceof GatewayHub) {
            source_hub = (GatewayHub) o;
        }
    }
    Query<Object> target_query = new PropertyEquals<Object>(context, "hub_code", target);
    for (Object o : target_query.query()) {
        if (o instanceof LocalHub) {
            target_hub = (LocalHub) o;
        }
        if (o instanceof GatewayHub) {
            target_hub = (GatewayHub) o;
        }
    }
    // System.out.println(target_hub.getClass() + " " + time);
    // Route this_route = (Route) net.addEdge(source_hub, target_hub);
    // context.add(this_route);
    // System.out.println(net.getEdge(source_hub, target_hub));
    if (net.getEdge(source, target) == null) {
        Route this_route = (Route) net.addEdge(source, target);
        context.add(this_route);
        // this_route.setDist(dist);
        // this_route.setTime(time); }
    }
}

Posted on 2019-10-25 05:33:05
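Your code performs many unnecessary write operations just to add the current line's list of values to the master list: it copies every field into a shared list, clones that list, and then clears it. You can replace the existing code with the simpler version shown below.
Current code: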
String[] list = line.split(cvsSplitBy);
// System.out.println("the size is " + country[1]);
for (int i = 0; i < list.length; i++) {
    sub.add(list[i]);
}
List<String> temp = (List<String>) ((ArrayList<String>) sub).clone();
// master.add(new ArrayList<String>(sub));
master.add(temp);
sub.removeAll(sub);

Proposed code:
master.add(Arrays.asList(line.split(cvsSplitBy)));

Posted on 2019-10-25 05:30:01
Extending @Alex's answer, you can also process the file in parallel, as follows:
public static void main(String[] args) throws IOException {
    Path csvPath = Paths.get("path/to/file.csv");
    List<List<String>> master = Files.lines(csvPath)
            .skip(1).parallel()
            .map(line -> Arrays.asList(line.split(",")))
            .collect(Collectors.toList());
}

https://stackoverflow.com/questions/58551748
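For reference, the two suggestions in this thread can be combined into one self-contained sketch. It assumes Java 8 or later, a plain comma-separated file with a single header row and no quoted fields; the class name `CsvLoadSketch`, the method names, and the sample data are hypothetical and only serve the illustration. The parallel variant wraps `Files.lines` in try-with-resources so the underlying file is closed once the rows are collected.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name; not part of the original question or answers.
class CsvLoadSketch {

    // Sequential version: one list per row, no shared scratch list to clone and clear.
    static List<List<String>> readSequential(Path csvPath) throws IOException {
        List<List<String>> master = new ArrayList<>();
        try (BufferedReader br = Files.newBufferedReader(csvPath)) {
            br.readLine(); // skip the header row
            String line;
            while ((line = br.readLine()) != null) {
                master.add(Arrays.asList(line.split(",")));
            }
        }
        return master;
    }

    // Parallel version: Files.lines keeps the file open, so the stream is closed explicitly.
    static List<List<String>> readParallel(Path csvPath) throws IOException {
        try (Stream<String> lines = Files.lines(csvPath)) {
            return lines.skip(1) // skip the header row
                    .parallel()
                    .map(line -> Arrays.asList(line.split(",")))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data written to a temporary file for the demo.
        Path tmp = Files.createTempFile("routes", ".csv");
        Files.write(tmp, Arrays.asList("from,to,dist", "A,B,1.5", "B,C,2.0"));
        System.out.println(readSequential(tmp));
        System.out.println(readParallel(tmp));
        Files.delete(tmp);
    }
}
```

Note that `split(",")` drops trailing empty fields and does not understand quoted commas, so a dedicated CSV library may be a better fit for messier input. As the update in the question suggests, reading the file may not be the bottleneck at all, so it is worth timing each stage before parallelising anything.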