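I want to merge the data from two CSV files, but not all of it. For example: a.csv + b.csv, where b.csv has 20 rows. I only want to take the first 10 rows, and then rows 11-20 (that is, the first 10 and the second 10).
Then I want to insert the first 10 rows into a.csv, and the second 10 rows into a.csv as well. My question is: how do I get only a specific number of rows?
Here is my code: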
import pandas as pd
df1 = pd.read_csv('testNegatif.csv')
df2 = pd.read_csv('trainNegatif.csv', nrows=10)
output=df1.append(df2)
output.to_csv("output.csv", sep=',')
I expected to get the result I wanted, but the actual result is that all of the data is merged.
Answered 2019-05-19 16:30:07
As I mentioned in the comments, you can use nrows:
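import pandas as pd
df1 = pd.read_csv('testNegatif.csv')
# nrows=10 reads only the first 10 data rows of the file.
df2 = pd.read_csv('trainNegatif.csv', nrows=10)
# pd.concat replaces df1.append(df2); DataFrame.append was removed in pandas 2.0.
output = pd.concat([df1, df2])
output.to_csv("output.csv", sep=',')
For more options, see: csv.html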
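The nrows argument only covers the first 10 rows, while the question also asks for rows 11-20. One possible way to get the second batch is to combine skiprows with nrows. The sketch below is only an illustration: the file names, the single "value" column, and the temporary directory are assumptions made so the example runs on its own, and it assumes b.csv has a header line.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for a.csv and b.csv.
tmpdir = tempfile.mkdtemp()
a_path = os.path.join(tmpdir, "a.csv")
b_path = os.path.join(tmpdir, "b.csv")
pd.DataFrame({"value": range(100, 105)}).to_csv(a_path, index=False)
pd.DataFrame({"value": range(1, 21)}).to_csv(b_path, index=False)

df_a = pd.read_csv(a_path)

# First 10 data rows of b.csv.
first_ten = pd.read_csv(b_path, nrows=10)

# Rows 11-20: skip file lines 1..10 (line 0 is the header, so it is kept),
# then read the next 10 rows.
second_ten = pd.read_csv(b_path, skiprows=list(range(1, 11)), nrows=10)

# Stack a.csv and both batches, renumbering the index from 0.
output = pd.concat([df_a, first_ten, second_ten], ignore_index=True)
output.to_csv(os.path.join(tmpdir, "output.csv"), index=False)
```

Because the two batches are read separately, each can be inserted wherever it is needed rather than always being appended at the end.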
Answered 2019-05-19 16:10:13
You don't need pandas. Read the lines of each file; add 10 lines from one file's data to the other's; write the result to another file.
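# Read every line of a.csv into a list.
with open('a.csv') as f:
    data = f.readlines()
# Read every line of b.csv.
with open('b.csv') as f:
    bdata = f.readlines()
# Append the first 10 lines of b.csv to the lines of a.csv.
data.extend(bdata[:10])
# Write the combined lines to a new file.
with open('output.csv', 'w') as f:
    f.writelines(data)
If the files are large and you don't want to read everything into memory, use some itertools functions.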
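import itertools
with open('a.csv') as a, open('b.csv') as b, open('output.csv', 'w') as out:
    # Lazily take the first 10 lines of b.csv.
    first_ten = itertools.islice(b, 10)
    # Stream a.csv, then those 10 lines, into the output file.
    for line in itertools.chain(a, first_ten):
        out.write(line)
This assumes both files have the same number of columns.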
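If b.csv starts with a header line, slicing its first 10 lines would include that header. The following is a rough sketch of one way to handle that and to pick up rows 11-20 in the same pass. It relies on the fact that islice consumes the file iterator, so a second islice continues where the first stopped. The sample file contents and paths are made up for illustration, and a.csv is assumed to end with a newline.

```python
import itertools
import os
import tempfile

# Hypothetical sample files standing in for a.csv and b.csv.
tmpdir = tempfile.mkdtemp()
a_path = os.path.join(tmpdir, "a.csv")
b_path = os.path.join(tmpdir, "b.csv")
out_path = os.path.join(tmpdir, "output.csv")

with open(a_path, "w") as fh:
    fh.write("value\n100\n101\n")
with open(b_path, "w") as fh:
    fh.write("value\n" + "".join(f"{i}\n" for i in range(1, 21)))

with open(a_path) as a, open(b_path) as b, open(out_path, "w") as out:
    # Discard b.csv's header so it is not copied into the middle of the output.
    next(b, None)

    # a.csv (header included), followed by data rows 1-10 of b.csv.
    first_ten = itertools.islice(b, 10)
    out.writelines(itertools.chain(a, first_ten))

    # The iterator over b.csv now sits at row 11, so this yields rows 11-20.
    second_ten = itertools.islice(b, 10)
    out.writelines(second_ten)
```

Only one line is held in memory at a time, which is the point of the iterator-based approach.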
Answered 2019-05-19 16:17:20
import pandas as pd
import numpy as np
# Creating two dataframes with data that overlap, so we don't want all of the 'b' data.
# We want to strip off '3,4,5' as they exist in 'a' as well
# ----------Creating the data frames----------
a = [1,2,3,4,5]
b = [3,4,5,6,7,8,9,10]
dfa = pd.DataFrame(a)
dfa.to_csv('one.csv', index=False)
dfb = pd.DataFrame(b)
dfb.to_csv('two.csv', index = False)
# ---------------------------------------------
# --------Reading through the dataframes-------
one = pd.read_csv('one.csv')
two = pd.read_csv('two.csv')
# ---------------------------------------------
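# Strip off the first 3 rows of 'two' before combining.
# pd.concat replaces one.append(two[3:]); DataFrame.append was removed in pandas 2.0.
output = pd.concat([one, two[3:]])
output.to_csv("output.csv", sep=',', index=False)
# ---------------------------------------------
I hope this answers your question. The most important part for you is output = pd.concat([one, two[3:]]). There are more sophisticated ways to do the same thing, but this is the simplest.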
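The same row slicing can be applied to the case in the question, taking the first 10 rows and then rows 11-20 from a frame that is already loaded. This is a small sketch using in-memory frames; the column name "value" and the sample numbers are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the contents of a.csv and b.csv.
df_a = pd.DataFrame({"value": [100, 101, 102]})
df_b = pd.DataFrame({"value": range(1, 21)})

# Positional slices: rows 1-10 and rows 11-20 of df_b.
first_ten = df_b.iloc[:10]
second_ten = df_b.iloc[10:20]

# Combine everything and renumber the index from 0.
output = pd.concat([df_a, first_ten, second_ten], ignore_index=True)
```

Using iloc makes it explicit that the slice is by row position rather than by index label.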
https://stackoverflow.com/questions/56209445