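Compare two text files, find the differences between the lists, and identify which list values do not match

Stack Overflow user
Asked 2019-08-08 05:42:32
Answers: 2 · Views: 49 · Followers: 0 · Votes: 1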

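I am using a Scanner to read two text files (which may contain duplicate records) and add their lines to ArrayLists. I then compare the two lists to find the differences. When I print the output I can see which records differ, but I cannot tell which file (text file name) each record came from.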

Contents of text1.txt:

TIMESTAMP,FE,TDI,20190703113119,20190601000000,20190701000000,
TIMESTAMP,FE,KYMI,20190703113130,20190601000000,20190701000000,
TIMESTAMP,FE,UMRI,20190703113154,20190601000000,20190701000000,
TIMESTAMP,FE,MLI,20190703113211,20190601000000,20190701000000,
TIMESTAMP,FE,WOLI,20190703113221,20190601000000,20190701000000,
TIMESTAMP,FE,VEM,20190703113221,20190601000000,20190701000000,
TIMESTAMP,FE,ZER,20190703113154,20190601000000,20190701000000,

Contents of text2.txt:

TIMESTAMP,FE,TDL,20190703113119,20190601000000,20190701000000,
TIMESTAMP,FE,KYMA,20190703113130,20190601000000,20190701000000,
TIMESTAMP,FE,UMRC,20190703113154,20190601000000,20190701000000,
TIMESTAMP,FE,MLW,20190703113211,20190601000000,20190701000000,
TIMESTAMP,FE,WOLF,20190703113221,20190601000000,20190701000000,
TIMESTAMP,FE,VEM,20190703113221,20190601000000,20190701000000,
TIMESTAMP,FE,ZER,20190703113154,20190601000000,20190701000000,

Code:

// Read the non-empty, trimmed lines of the prod file.
Scanner prodScanner = new Scanner(prodFile);
while (prodScanner.hasNextLine()) {
    String currentRecord = prodScanner.nextLine().trim();
    if (currentRecord.length() > 0) {
        prodRecordsFromStatement.add(currentRecord);
    }
}

// Do the same for the non-prod file.
Scanner nonProdScanner = new Scanner(nonProdFile);
while (nonProdScanner.hasNextLine()) {
    String currentRecord = nonProdScanner.nextLine().trim();
    if (currentRecord.length() > 0) {
        nonProdRecordsFromStatement.add(currentRecord);
    }
}

// Keep the records that do not match between the two lists, sorted for display.
Collection<String> result = new ArrayList<>(CollectionUtils.disjunction(prodRecordsFromStatement, nonProdRecordsFromStatement));
List<String> resultList = new ArrayList<>(result);
Collections.sort(resultList);

Actual result:

TIMESTAMP,FE,KYMA,20190703113130,20190601000000,20190701000000,
TIMESTAMP,FE,KYMI,20190703113130,20190601000000,20190701000000,
TIMESTAMP,FE,MLI,20190703113211,20190601000000,20190701000000,
TIMESTAMP,FE,MLW,20190703113211,20190601000000,20190701000000,
TIMESTAMP,FE,TDI,20190703113119,20190601000000,20190701000000,
TIMESTAMP,FE,TDL,20190703113119,20190601000000,20190701000000,
TIMESTAMP,FE,UMRC,20190703113154,20190601000000,20190701000000,
TIMESTAMP,FE,UMRI,20190703113154,20190601000000,20190701000000,
TIMESTAMP,FE,WOLF,20190703113221,20190601000000,20190701000000,
TIMESTAMP,FE,WOLI,20190703113221,20190601000000,20190701000000,

Expected result: I would like each record to show the name of the file/list it came from, so the output is easier to read.

text2.txt,TIMESTAMP,FE,KYMA,20190703113130,20190601000000,20190701000000,
text1.txt,TIMESTAMP,FE,KYMI,20190703113130,20190601000000,20190701000000,
text1.txt,TIMESTAMP,FE,MLI,20190703113211,20190601000000,20190701000000,
text2.txt,TIMESTAMP,FE,MLW,20190703113211,20190601000000,20190701000000,
text1.txt,TIMESTAMP,FE,TDI,20190703113119,20190601000000,20190701000000,
text2.txt,TIMESTAMP,FE,TDL,20190703113119,20190601000000,20190701000000,
text2.txt,TIMESTAMP,FE,UMRC,20190703113154,20190601000000,20190701000000,
text1.txt,TIMESTAMP,FE,UMRI,20190703113154,20190601000000,20190701000000,
text2.txt,TIMESTAMP,FE,WOLF,20190703113221,20190601000000,20190701000000,
text1.txt,TIMESTAMP,FE,WOLI,20190703113221,20190601000000,20190701000000,

2 Answers

Stack Overflow user

Accepted answer

Posted 2019-08-08 06:43:37

Iterate over resultList and check whether each item is also in prodRecordsFromStatement.

If it is, the record came from file 1; otherwise it came from file 2.
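
A minimal sketch of that check, assuming the differences have already been computed and sorted as in the question. The class name `DiffLabeler`, the method `label`, and its file-name parameters are hypothetical names introduced for illustration; the prod records are copied into a `HashSet` so each lookup does not rescan the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DiffLabeler {
    // Prefix each differing record with the name of the file it came from.
    // prodName / nonProdName are illustrative parameters, not part of the question's code.
    static List<String> label(List<String> resultList, List<String> prodRecords,
                              String prodName, String nonProdName) {
        Set<String> prodSet = new HashSet<>(prodRecords); // fast membership test
        List<String> labeled = new ArrayList<>();
        for (String record : resultList) {
            String source = prodSet.contains(record) ? prodName : nonProdName;
            labeled.add(source + "," + record);
        }
        return labeled;
    }

    public static void main(String[] args) {
        List<String> prod = Arrays.asList(
                "TIMESTAMP,FE,KYMI,20190703113130,20190601000000,20190701000000,",
                "TIMESTAMP,FE,VEM,20190703113221,20190601000000,20190701000000,");
        // Stand-in for the sorted differences produced in the question.
        List<String> resultList = Arrays.asList(
                "TIMESTAMP,FE,KYMA,20190703113130,20190601000000,20190701000000,",
                "TIMESTAMP,FE,KYMI,20190703113130,20190601000000,20190701000000,");
        for (String line : label(resultList, prod, "text1.txt", "text2.txt")) {
            System.out.println(line);
        }
        // text2.txt,TIMESTAMP,FE,KYMA,20190703113130,20190601000000,20190701000000,
        // text1.txt,TIMESTAMP,FE,KYMI,20190703113130,20190601000000,20190701000000,
    }
}
```

One caveat: a membership test cannot tell the files apart when the same record appears in both files a different number of times, so this sketch is only reliable if a record that shows up in the differences exists in exactly one file.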

Votes: 1

Stack Overflow user

Posted 2019-08-08 06:53:47

How performant does your solution need to be? If performance is not critical and your lists are not very long, you could switch to using subtract instead of disjunction.

For example:

Collection<String> resultProdRecords = new ArrayList<>(CollectionUtils.subtract(prodRecordsFromStatement, nonProdRecordsFromStatement));
Collection<String> resultNonProdRecords = new ArrayList<>(CollectionUtils.subtract(nonProdRecordsFromStatement, prodRecordsFromStatement));

resultProdRecords will contain all the lines in prodRecordsFromStatement that are not in nonProdRecordsFromStatement.

resultNonProdRecords will contain all the lines in nonProdRecordsFromStatement that are not in prodRecordsFromStatement.
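
To show how two one-directional differences can be merged into the labeled output the question asks for, here is a rough sketch that uses only the JDK. The `subtract` helper below is a hand-written stand-in, not the Commons Collections method; it is written so that each record in the second list cancels one matching record in the first, which keeps duplicates meaningful. The class name `LabeledDiff`, the method names, and the file-name parameters are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LabeledDiff {
    // Stand-in for "a minus b": every record in b cancels one equal record in a.
    static List<String> subtract(List<String> a, List<String> b) {
        Map<String, Integer> remaining = new HashMap<>();
        for (String s : b) {
            remaining.merge(s, 1, Integer::sum); // count occurrences in b
        }
        List<String> out = new ArrayList<>();
        for (String s : a) {
            int left = remaining.getOrDefault(s, 0);
            if (left > 0) {
                remaining.put(s, left - 1); // matched by a record in b
            } else {
                out.add(s);                 // no counterpart left in b
            }
        }
        return out;
    }

    // Label both one-directional differences and merge them into one list.
    // Assumes the file names themselves contain no comma.
    static List<String> labeledDiff(List<String> prod, String prodName,
                                    List<String> nonProd, String nonProdName) {
        List<String> out = new ArrayList<>();
        for (String s : subtract(prod, nonProd)) {
            out.add(prodName + "," + s);
        }
        for (String s : subtract(nonProd, prod)) {
            out.add(nonProdName + "," + s);
        }
        // Sort by the record (the text after the file-name prefix), as in the question.
        out.sort(Comparator.comparing((String line) -> line.substring(line.indexOf(',') + 1)));
        return out;
    }

    public static void main(String[] args) {
        List<String> prod = Arrays.asList("TIMESTAMP,FE,TDI,", "TIMESTAMP,FE,VEM,");
        List<String> nonProd = Arrays.asList("TIMESTAMP,FE,TDL,", "TIMESTAMP,FE,VEM,");
        labeledDiff(prod, "text1.txt", nonProd, "text2.txt").forEach(System.out::println);
    }
}
```

Because each direction is computed separately, a record that appears once in one file and twice in the other is attributed to the file holding the extra copy.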

Votes: 1
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/57402645
