Custom summary statistics with Java streams

Stack Overflow user
Asked on 2020-07-30 12:10:01

Below is a sample from a very large CSV file:

id, type, profit, purchaseDate, soldDate
order1, fruit, 115.50, 1/1/2020, 20/1/2020
order2, veg, 114.25, 7/1/2020, 7/2/2020
order3, flowers, 113.30, 5/1/2020, 15/1/2020
order4, fruit, 111.20, 1/1/2019, 30/1/2019
order5, veg, 112.40, 17/1/2019, 10/2/2019

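I need to read this very large file and produce the following summary statistics:

  1. Profit per item type
  2. The year with the most orders
  3. Average time from purchase date to sold date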

I can compute one statistic at a time. I used the Apache Commons CSV parser:

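// Needs org.apache.commons.csv.CSVFormat / CSVRecord, java.util.stream.StreamSupport,
// and static imports of Collectors.groupingBy and Collectors.averagingDouble.
Reader in = new FileReader("filePath");
Iterable<CSVRecord> records = CSVFormat.DEFAULT
                                       .withFirstRecordAsHeader()
                                       .withIgnoreEmptyLines(true)
                                       .withDelimiter(',')
                                       .withTrim()
                                       .parse(in);
// One statistic only: average profit per item type
Map<String, Double> avgProfitByType = StreamSupport
    .stream(records.spliterator(), false)
    .collect(groupingBy(r -> r.get("type"), averagingDouble(r -> Double.parseDouble(r.get("profit")))));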

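I would like to know whether the Java Stream API can produce several statistics in a single pass over the data, without exhausting memory, since this is a very large file.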
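For the third statistic, note that the sample dates are written day first (for example 20/1/2020). A minimal sketch of the elapsed-days calculation, assuming that `d/M/yyyy` layout holds for the whole file; the class and method names here are hypothetical. It also shows why `Period.getDays()` is not a substitute for a total day count: it returns only the days component of a years/months/days period.

```java
import java.time.LocalDate;
import java.time.Period;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class DaysBetweenSketch {

    // Assumption: every date in the file is day-first, as in the sample rows.
    static final DateTimeFormatter DAY_FIRST = DateTimeFormatter.ofPattern("d/M/yyyy");

    // Total number of whole days from purchase to sale.
    static long daysBetween(String purchaseDate, String soldDate) {
        LocalDate from = LocalDate.parse(purchaseDate.trim(), DAY_FIRST);
        LocalDate to = LocalDate.parse(soldDate.trim(), DAY_FIRST);
        return ChronoUnit.DAYS.between(from, to);
    }

    // Days *component* of the period only (months and years are dropped).
    static int periodDaysComponent(String purchaseDate, String soldDate) {
        LocalDate from = LocalDate.parse(purchaseDate.trim(), DAY_FIRST);
        LocalDate to = LocalDate.parse(soldDate.trim(), DAY_FIRST);
        return Period.between(from, to).getDays();
    }

    public static void main(String[] args) {
        System.out.println(daysBetween("1/1/2020", "20/1/2020"));        // 19
        System.out.println(daysBetween("7/1/2020", "7/2/2020"));         // 31
        System.out.println(periodDaysComponent("7/1/2020", "7/2/2020")); // 0 (one month, zero days)
    }
}
```

Summing `daysBetween` values in a `long` and dividing by the row count at the end keeps the average computable from two running totals.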


1 Answer

Stack Overflow user

Accepted answer

Posted on 2020-08-03 21:24:45

import java.math.BigDecimal;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.stream.Collector;

// Accumulates all three statistics in one pass; only running totals are kept, never the rows.
public class CustomSummaryStatistics implements Consumer<Order> {

    // Final results
    private Map<String, BigDecimal> itemWiseProfits = new HashMap<>();
    private Optional<Map.Entry<Integer, Integer>> yearWithMaxOrders = Optional.empty();
    private int avgDaysBetweenOrderDateAndShipDate;

    // Intermediate state
    private Map<Integer, Integer> yearWiseOrdersMap = new HashMap<>();
    private long totalDaysBetweenOrderDateAndShipDate;
    private long numRecords;

    public static Collector<Order, ?, CustomSummaryStatistics> newCollector() {
        return Collector.of(CustomSummaryStatistics::new, CustomSummaryStatistics::accept,
                CustomSummaryStatistics::combine, CustomSummaryStatistics::finisher);
    }

    @Override
    public void accept(Order order) {
        updateItemWiseProfits(order);
        updateYearWithHighestOrders(order);
        updateAvgDaysBetweenOrderDateAndShipDate(order);
    }

    private void updateItemWiseProfits(Order order) {
        itemWiseProfits.merge(order.getItemType(), order.getTotalProfit(), BigDecimal::add);
    }

    private void updateYearWithHighestOrders(Order order) {
        yearWiseOrdersMap.merge(order.getOrderDate().getYear(), 1, Integer::sum);
    }

    private void updateAvgDaysBetweenOrderDateAndShipDate(Order order) {
        numRecords++;
        // Total elapsed days. Period.between(...).getDays() would return only the
        // days component and drop whole months and years.
        totalDaysBetweenOrderDateAndShipDate += ChronoUnit.DAYS.between(order.getOrderDate(), order.getShipDate());
    }

    // Merges partial results produced by parallel sub-streams
    public CustomSummaryStatistics combine(CustomSummaryStatistics other) {
        other.itemWiseProfits.forEach((k, v) -> itemWiseProfits.merge(k, v, BigDecimal::add));
        other.yearWiseOrdersMap.forEach((k, v) -> yearWiseOrdersMap.merge(k, v, Integer::sum));
        numRecords += other.numRecords;
        totalDaysBetweenOrderDateAndShipDate += other.totalDaysBetweenOrderDateAndShipDate;
        return this;
    }

    public CustomSummaryStatistics finisher() {
        yearWithMaxOrders = yearWiseOrdersMap.entrySet().stream()
                                .max(Map.Entry.comparingByValue());
        // Guard against an empty input to avoid division by zero
        avgDaysBetweenOrderDateAndShipDate =
                numRecords == 0 ? 0 : (int) (totalDaysBetweenOrderDateAndShipDate / numRecords);
        return this;
    }

    public Map<String, BigDecimal> getItemWiseProfits() {
        return itemWiseProfits;
    }

    public int getAvgDaysBetweenOrderDateAndShipDate() {
        return avgDaysBetweenOrderDateAndShipDate;
    }

    public Optional<Map.Entry<Integer, Integer>> getYearWithMaxOrders() {
        return yearWithMaxOrders;
    }

    public long getNumRecords() {
        return numRecords;
    }

    @Override
    public String toString() {
        return "CustomSummaryStatistics{" +
                "itemWiseProfits=" + itemWiseProfits +
                ", yearWithMaxOrders=" + yearWithMaxOrders +
                ", avgDaysBetweenOrderDateAndShipDate=" + avgDaysBetweenOrderDateAndShipDate +
                ", numRecords=" + numRecords +
                '}';
    }
}

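// Lombok generates the getters, setters and constructors.
// The validation annotations are assumed to be javax.validation.constraints.NotBlank / NotNull
// (jakarta.validation.constraints on newer stacks).
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Order {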
    @NotBlank
    private String itemType;
    @NotNull
    private LocalDate orderDate;
    @NotNull
    private LocalDate shipDate;
    @NotNull
    private BigDecimal totalProfit;
}

public class SalesReport {

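    // Month-first pattern, matching the data file used in this answer.
    // The sample in the question is day-first (e.g. 20/1/2020) and would need "d/M/y".
    private static final DateTimeFormatter df = DateTimeFormatter.ofPattern("M/d/y");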
    private static final Validator validator = Validation.buildDefaultValidatorFactory().getValidator();

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        CustomSummaryStatistics stats;
        try {
            stats = calculateSummaryStats("data/SalesRecords.csv");
            System.out.println(stats.toString());
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
        System.out.println("Time(milli-seconds) taken to generate Sales Report : " + (System.currentTimeMillis() - start));
    }

    public static CustomSummaryStatistics calculateSummaryStats(String filePath) throws IOException {
        // try-with-resources closes the reader (and the underlying file) once the stream is consumed
        try (Reader in = new FileReader(filePath)) {
            Iterable<CSVRecord> iterable =
                    CSVFormat.DEFAULT
                            .withFirstRecordAsHeader()
                            .withIgnoreEmptyLines(true)
                            .withDelimiter(',')
                            .withTrim()
                            .parse(in);
            return StreamSupport
                    .stream(iterable.spliterator(), true)
                    .map(SalesReport::toOrder)
                    .filter(order -> order != null)   // drop rows that failed mapping or validation
                    .collect(CustomSummaryStatistics.newCollector());
        }
    }

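    // Map and validate. The column names below are those of this answer's data file,
    // not the headers shown in the question (type, profit, purchaseDate, soldDate).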
    public static Order toOrder(CSVRecord record) {
        Order order = new Order();
        try {
            order.setItemType(record.get("Item Type"));
            order.setOrderDate(LocalDate.parse(record.get("Order Date"), df));
            order.setShipDate(LocalDate.parse(record.get("Ship Date"), df));
            order.setTotalProfit(new BigDecimal(record.get("Total Profit")));

            //validate
            Set<ConstraintViolation<Order>> violations = validator.validate(order);
            if (!violations.isEmpty()) throw new Exception("Failed validation: " + violations);

        } catch (Exception e) {
            System.out.println("Error with row: " + record.toString() + e.getMessage());
            return null;
        }
        return order;
    }
}
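The technique in the listing above is a custom `Collector` backed by a mutable accumulator. A stripped-down sketch of the same idea, with no CSV, Lombok or validation dependencies, may make the moving parts easier to see. The names `SinglePassStatsSketch`, `Row` and `Stats` are hypothetical, and the rows are hard-coded from the question's sample instead of being parsed from a file.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;

public class SinglePassStatsSketch {

    // Hypothetical row type standing in for a parsed CSV record.
    static final class Row {
        final String type;
        final BigDecimal profit;
        final LocalDate purchased;
        final LocalDate sold;

        Row(String type, String profit, LocalDate purchased, LocalDate sold) {
            this.type = type;
            this.profit = new BigDecimal(profit);
            this.purchased = purchased;
            this.sold = sold;
        }
    }

    // Mutable accumulator: holds running totals only, so memory use does not grow with row count.
    static final class Stats {
        final Map<String, BigDecimal> profitByType = new HashMap<>();
        final Map<Integer, Integer> ordersByYear = new HashMap<>();
        long totalDays;
        long count;

        // Called once per row.
        void accept(Row r) {
            profitByType.merge(r.type, r.profit, BigDecimal::add);
            ordersByYear.merge(r.purchased.getYear(), 1, Integer::sum);
            totalDays += ChronoUnit.DAYS.between(r.purchased, r.sold);
            count++;
        }

        // Merges two partial accumulators; only used when the stream is parallel.
        Stats combine(Stats other) {
            other.profitByType.forEach((k, v) -> profitByType.merge(k, v, BigDecimal::add));
            other.ordersByYear.forEach((k, v) -> ordersByYear.merge(k, v, Integer::sum));
            totalDays += other.totalDays;
            count += other.count;
            return this;
        }

        double averageDays() {
            return count == 0 ? 0.0 : (double) totalDays / count;
        }

        static Collector<Row, Stats, Stats> collector() {
            return Collector.of(Stats::new, Stats::accept, Stats::combine);
        }
    }

    // The five sample rows from the question, already parsed.
    static Stats sample() {
        List<Row> rows = List.of(
                new Row("fruit", "115.50", LocalDate.of(2020, 1, 1), LocalDate.of(2020, 1, 20)),
                new Row("veg", "114.25", LocalDate.of(2020, 1, 7), LocalDate.of(2020, 2, 7)),
                new Row("flowers", "113.30", LocalDate.of(2020, 1, 5), LocalDate.of(2020, 1, 15)),
                new Row("fruit", "111.20", LocalDate.of(2019, 1, 1), LocalDate.of(2019, 1, 30)),
                new Row("veg", "112.40", LocalDate.of(2019, 1, 17), LocalDate.of(2019, 2, 10)));
        return rows.stream().collect(Stats.collector());
    }

    public static void main(String[] args) {
        Stats s = sample();
        System.out.println("profit by type: " + s.profitByType);
        System.out.println("orders by year: " + s.ordersByYear);
        System.out.println("average days:   " + s.averageDays());
    }
}
```

Because the accumulator stores one map entry per item type and per year plus two counters, its size depends on the number of distinct keys and not on the number of rows, which is what makes a single streaming pass over a very large file feasible.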
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/63172890
