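I have two CSV files that I generate with Python. The records look like the ones below (a.csv and b.csv). b.csv has 2 columns, and the values in the second column can repeat. I want a result like final.csv. How can I do that?
I tried the code below, but it is not right: I am not doing the comparison correctly. Any help would be great.
a.csv
"all","1","1Gi","4","8Gi"
"als","0","0","100m","128Mi"
"awx","6","9Gi","20","32Gi"
"cho-1","9","9728Mi","15","20Gi"
"cho-2","12250m","15395Mi","20","24Gi"
b.csv
"all","ABC"
"als","ABC"
"awx","DPL"
"cho-1","ABC"
"cho-2","ABC"
"cho-3","ABC"
I want to create a single file from these two files, as shown below.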
final.csv
"all","1","1Gi","4","8Gi","ABC"
"als","0","0","100m","128Mi","ABC"
"awx","6","9Gi","20","32Gi","DPL"
"cho-1","9","9728Mi","15","20Gi","ABC"
"cho-2","12250m","15395Mi","20","24Gi","ABC"
My code:
import csv

csv1 = csv.reader(open("reports/a.csv", "r"))
csv2 = csv.reader(open("reports/b.csv", "r"))
s = []
while True:
    try:
        line1 = csv1.next()
        line2 = csv2.next()
        if (line1[0] == line2[0]):
            s.append([line1[1], line2[0], line2[1], line2[2], line2[3], line2[4]])
        else:
            s.append(["NA", line2[0], line2[1], line2[2], line2[3], line2[4]])
    except StopIteration:
        break
Posted on 2017-11-08 08:05:29
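I used pandas for this.
import csv
import pandas as pd
# The sample files have no header row, so the columns are named here
# ("Name", "c1".."c4" and "Group" are labels chosen for this example).
df0 = pd.read_csv("a.csv", header=None, names=["Name", "c1", "c2", "c3", "c4"], dtype=str)
df1 = pd.read_csv("b.csv", header=None, names=["Name", "Group"], dtype=str)
df1 = df1.dropna(axis=1)  # drop columns that contain missing values
# A left join keeps only the keys in a.csv; an outer join would also keep "cho-3".
df0 = df0.merge(df1, on="Name", how="left")
df0.to_csv("final.csv", index=False, header=False, quoting=csv.QUOTE_ALL)
Posted on 2017-11-07 09:49:54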
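From your expected output, I think you should use a set. Since the line1 and line2 variables hold comma-separated values, you can build a list from each of them. For example:
line1 = ["all","1","1Gi","4","8Gi"]
line2 = ["all","ABC"]
Then you can merge the two lists into one and build a set from the result. list.extend() returns None, so concatenate the lists instead:
set1 = set(line1 + line2)
Building a set removes the duplicates. Hope this helps.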
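A set keeps neither the order of the fields nor repeated values (the "als" row contains "0" twice), so a lookup keyed on the first field may be closer to the expected final.csv. The following is a minimal Python 3 sketch under that assumption, not a definitive implementation. The function name join_on_first_field and the missing parameter are hypothetical names chosen for this example; the "NA" placeholder is borrowed from the question's code.

```python
import csv


def join_on_first_field(a_path, b_path, out_path, missing="NA"):
    """Append the second field of b_path to each row of a_path, matched on the first field."""
    # Build a lookup from key (first column) to label (second column).
    with open(b_path, newline="") as fb:
        labels = {row[0]: row[1] for row in csv.reader(fb) if len(row) >= 2}

    with open(a_path, newline="") as fa, open(out_path, "w", newline="") as fout:
        writer = csv.writer(fout, quoting=csv.QUOTE_ALL)
        for row in csv.reader(fa):
            if not row:
                continue  # skip blank lines
            # Keys absent from b_path get the placeholder instead of raising KeyError.
            writer.writerow(row + [labels.get(row[0], missing)])
```

It could be called as join_on_first_field("reports/a.csv", "reports/b.csv", "final.csv"). Because the match is by key rather than by line position, extra rows in b.csv such as "cho-3" are simply ignored, and the two files do not need to be in the same order.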
Posted on 2017-11-08 08:54:18
You are not far from the solution. You only need to append the data from line2 to line1 and write the result out:
...
# Python 2 code; on Python 3 use next(csv1) and open("final.csv", "w", newline="")
csvout = csv.writer(open("final.csv", "wb"), quoting = csv.QUOTE_ALL)
while True:
    try:
        line1 = csv1.next()
        line2 = csv2.next()
        if line1[0] != line2[0]:   # check that both rows share the same first field
            raise Exception("Desynch", line1[0], '#', line2[0])
        line1.append(line2[1])     # append field from b.csv
        csvout.writerow(line1)     # and write it to final.csv
    except StopIteration:
        break
https://stackoverflow.com/questions/47153575