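I have a job/task that reads the sub-folders/directories of a given folder/path. The path is dynamic; we get it from the Controller. Currently I use Tasklets, three of them: one to read the sub-directories, another to process them and prepare the objects to save to the database, and a last one to write the processed data objects to the database. The folder can have any number of sub-folders. Currently I use the following code: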
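
Path start = Paths.get("x:\\data\\");
List<String> collect;
// Files.walk keeps directory handles open, so close the stream when done.
try (Stream<Path> stream = Files.walk(start, 1)) {
    collect = stream
            .map(String::valueOf)
            .sorted()
            .collect(Collectors.toList());
}

This reads all the sub-folders at once. For this I followed this https://www.baeldung.com/spring-batch-tasklet-chunk example of a Tasklet implementation. Is this the right approach? I also need to run the job asynchronously with multi-threading. Since there may be a huge number of sub-folders, there can be a huge number of rows, or a huge list of data, to process and write to the database.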
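Please suggest an appropriate approach. I am learning Spring Batch and have done a few examples on file read/process/write, using the Chunk approach for those. But my job is to read the sub-directories of a folder/path, so I cannot decide which approach to follow.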
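
Note that Files.walk(start, 1) also returns the start directory itself and any regular files directly inside it, not only the sub-folders. A minimal sketch of listing just the immediate sub-directories of a path passed in at runtime could look like the following; the class and method names (SubDirectoryLister, listSubDirectories) are made up for illustration and are not part of the original post.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SubDirectoryLister {

    // Hypothetical helper: returns the immediate sub-directories of root, sorted by path.
    public static List<Path> listSubDirectories(Path root) throws IOException {
        // Files.list looks one level down only and never includes root itself.
        try (Stream<Path> children = Files.list(root)) {
            return children
                    .filter(Files::isDirectory) // drop regular files
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // The root would normally come from the controller; here it is a command-line argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        listSubDirectories(root).forEach(System.out::println);
    }
}
```

The returned list can be handed to whatever reads the folders next, whether that is a Tasklet or a chunk-oriented reader.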
Posted on 2020-10-21 22:39:52
I have a similar scenario: I need to read all the files from a folder, process them, and write them to the db (Doc)
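
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
import org.springframework.batch.item.support.IteratorItemReader;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

// Transaction, TransactionItemProcessor and ItemReaderListener are my own classes (not shown here).
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory,
                   Step mainStep) {
        return jobBuilderFactory.get("MainJob")
                .incrementer(new RunIdIncrementer())
                .flow(mainStep)
                .end()
                .build();
    }

    @Bean
    public Step mainStep(StepBuilderFactory stepBuilderFactory,
                         JdbcBatchItemWriter<Transaction> writer,
                         ItemReader<String> reader,
                         TransactionItemProcessor processor) {
        return stepBuilderFactory.get("Main")
                .<String, Transaction>chunk(2)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                // This line is what makes the step multi-threaded.
                .taskExecutor(jobTaskExecutor())
                .listener(new ItemReaderListener())
                .build();
    }

    @Bean
    public TaskExecutor jobTaskExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(2);
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.afterPropertiesSet();
        return taskExecutor;
    }

    @Bean
    @StepScope
    public ItemReader<String> reader(@Value("#{stepExecution}") StepExecution stepExecution) throws IOException {
        Path start = Paths.get("D:\\test");
        // Collect the paths up front and close the directory stream before returning.
        try (Stream<Path> paths = Files.walk(start, 1)) {
            List<String> inputFile = paths
                    .map(String::valueOf)
                    .sorted()
                    .collect(Collectors.toList());
            return new IteratorItemReader<>(inputFile);
        }
    }

    @Bean
    @StepScope
    public TransactionItemProcessor processor(@Value("#{stepExecution}") StepExecution stepExecution) {
        return new TransactionItemProcessor();
    }

    @Bean
    @StepScope
    public JdbcBatchItemWriter<Transaction> writer(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<Transaction>()
                .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
                .sql("INSERT INTO transaction (id, date, type) VALUES (:id, :date, :type)")
                .dataSource(dataSource)
                .build();
    }
}
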
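
One thing to watch with a step that has a taskExecutor: chunks are read, processed and written on several threads, so the reader can be called concurrently. As far as I can tell IteratorItemReader does no locking of its own, so it may be safer to guard read() yourself. Below is a minimal, framework-free sketch of that idea. SynchronizedIteratorReader is a hypothetical name, not a Spring Batch class; in a real project it would implement org.springframework.batch.item.ItemReader<T>, whose read() follows the same "return null when exhausted" contract.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical thread-safe reader over an in-memory list (assumption: not part of the original answer).
public class SynchronizedIteratorReader<T> {

    private final Iterator<T> iterator;

    public SynchronizedIteratorReader(Iterable<T> items) {
        this.iterator = items.iterator();
    }

    // Returns the next item, or null once everything has been handed out.
    // synchronized makes hasNext() + next() one atomic step across threads.
    public synchronized T read() {
        return iterator.hasNext() ? iterator.next() : null;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> folders = Arrays.asList("a", "b", "c", "d");
        SynchronizedIteratorReader<String> reader = new SynchronizedIteratorReader<>(folders);
        Set<String> seen = ConcurrentHashMap.newKeySet();

        // Two workers pull from the same reader, much like a multi-threaded step would.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> {
                String item;
                while ((item = reader.read()) != null) {
                    seen.add(item);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("folders read: " + seen.size());
    }
}
```

The trade-off is that an in-memory reader like this keeps no restart state, so a failed job would start over from the first folder.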
https://stackoverflow.com/questions/64465361