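I have a file ua.csv with 2 rows and another file pr.csv with 4 rows. I want to find the rows that are in pr.csv but not in ua.csv, and the output should also state how many extra rows pr.csv has.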

ua.csv:
Name|Address|City|Country|Pincode
Jim Smith|123 Any Street|Boston|US|02134
Jane Lee|248 Another St.|Boston|US|02130

pr.csv:
Name|Address|City|Country|Pincode
Jim Smith|123 Any Street|Boston|US|02134
Smoet|coffee shop|finland|Europe|3453335
Jane Lee|248 Another St.|Boston|US|02130
Jack|long street|malasiya|Asia|585858

The expected output is as follows:
pr.csv has 2 rows extra
Name|Address|City|Country|Pincode
Smoet|coffee shop|finland|Europe|3453335
Jack|long street|malasiya|Asia|585858

Posted on 2022-07-18 12:38:29
I think you can use the set data structure:

ua_set = set()
pr_set = set()
# Code to populate the sets reading the csv files (use the `add` method of sets)
...
# Find the difference
diff = pr_set.difference(ua_set)
print(f"pr.csv has {len(diff)} rows extra")
# It would be better to not hardcode the name of the columns in the output
# but getting the info depends on the package you use to read csv files
print("Name|Address|City|Country|Pincode")
for row in diff:
    print(row)

A better solution using the pandas module:

import pandas as pd

# The sample files are pipe-delimited; dtype=str keeps leading zeros in Pincode
df_ua = pd.read_csv("ua.csv", sep="|", dtype=str)  # adjust the path to ua.csv
df_pr = pd.read_csv("pr.csv", sep="|", dtype=str)  # adjust the path to pr.csv
df_diff = (
    df_pr.merge(df_ua, how="outer", indicator=True)
    .loc[lambda x: x["_merge"] == "left_only"]  # rows found only in pr.csv
    .drop("_merge", axis=1)
)
print(f"pr.csv has {len(df_diff)} rows extra")
print(df_diff)

Posted on 2022-07-19 06:47:07
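
import csv

# Remember every row of ua.csv; the sample files are pipe-delimited
ua_dic = {}
with open('ua.csv', newline='') as ua:
    data = csv.reader(ua, delimiter='|')
    for i in data:
        if str(i) not in ua_dic:
            ua_dic[str(i)] = 1

# Collect the rows of pr.csv that never appear in ua.csv
output = []
with open('pr.csv', newline='') as pr:
    data = csv.reader(pr, delimiter='|')
    for j in data:
        if str(j) not in ua_dic:
            output.append(j)
print(output)

https://stackoverflow.com/questions/73022341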
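
For reference, here is a minimal end-to-end sketch of the set-based idea from the answers above, using only the standard csv module and producing the report layout asked for in the question. The helper names extra_rows and format_report are made up for illustration. The sketch assumes both files are pipe-delimited and share the same header line, as in the sample data, and that no field contains the delimiter itself. The demo writes the sample rows to a temporary directory so it can run anywhere.

```python
import csv
import tempfile
from pathlib import Path


def extra_rows(pr_path, ua_path, delimiter="|"):
    """Return (header, rows) for rows of pr_path that are absent from ua_path.

    Hypothetical helper: assumes both files start with the same header line.
    """
    with open(ua_path, newline="") as ua:
        ua_reader = csv.reader(ua, delimiter=delimiter)
        next(ua_reader, None)  # skip the header
        seen = {tuple(row) for row in ua_reader}  # tuples are hashable

    with open(pr_path, newline="") as pr:
        pr_reader = csv.reader(pr, delimiter=delimiter)
        header = next(pr_reader, [])
        # A list comprehension (rather than a set difference) keeps the
        # original order of pr.csv; blank lines are ignored.
        rows = [row for row in pr_reader if row and tuple(row) not in seen]
    return header, rows


def format_report(name, header, rows, delimiter="|"):
    """Build the report text: a count line, the header, then the extra rows."""
    lines = [f"{name} has {len(rows)} rows extra", delimiter.join(header)]
    lines.extend(delimiter.join(row) for row in rows)
    return "\n".join(lines)


# Sample data copied from the question
UA_SAMPLE = (
    "Name|Address|City|Country|Pincode\n"
    "Jim Smith|123 Any Street|Boston|US|02134\n"
    "Jane Lee|248 Another St.|Boston|US|02130\n"
)
PR_SAMPLE = (
    "Name|Address|City|Country|Pincode\n"
    "Jim Smith|123 Any Street|Boston|US|02134\n"
    "Smoet|coffee shop|finland|Europe|3453335\n"
    "Jane Lee|248 Another St.|Boston|US|02130\n"
    "Jack|long street|malasiya|Asia|585858\n"
)

with tempfile.TemporaryDirectory() as tmp:
    ua_path = Path(tmp) / "ua.csv"
    pr_path = Path(tmp) / "pr.csv"
    ua_path.write_text(UA_SAMPLE)
    pr_path.write_text(PR_SAMPLE)
    header, rows = extra_rows(pr_path, ua_path)
    report = format_report("pr.csv", header, rows)

print(report)
```

Unlike iterating over a set difference, this keeps the extra rows in the order they appear in pr.csv, and the header is read from the file instead of being hardcoded.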