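Hi all, I'm new to Python and I've run into a problem with what I think is a fairly basic task.
I have a number (>50) of csv files containing daily snow depth data. I want to loop through the csv files and compute the monthly average snow depth. Sample data: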
Date,SD
1/1/2000,36
1/2/2000,36
1/3/2000,38
1/4/2000,40
2/1/2000,48
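2/2/2000,48
In other words, I want to compute the average snow depth for each month and write the output to a new csv file. I was able to adapt a different code example to my data, but I get a KeyError because I am using Date as the key in my dictionary.
Any suggestions?
Code so far: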
from __future__ import division
import csv
from collections import defaultdict
def default_factory():
    return [0, None, None, 0]
reader = csv.DictReader(open(r'C:\SandBox\VALIDATION\TestTable.csv'))
dates = defaultdict(default_factory)
for row in reader:
    sd = int(row["SD"])
    dates[row["Dates"]][0] += sd
    max = dates[row["Dates"]][1]
    dates[row["Dates"]][1] = amount if max is None else amount if amount > max else max
    min = dates[row["Date"]][2]
    dates[row["Dates"]][2] = amount if min is None else amount if amount < min else min
    dates[row["Dates"]][3] += 1
for date in dates:
    dates[date][3] = dates[date][0]/dates[date][3]
writer = csv.writer(open(r'C:\SandBox\VALIDATION\TestAvg.csv', 'w', newline = ''))
writer.writerow(["Date", "SD", "max", "min", "mean"])
writer.writerows([date] + dates[date] for date in dates)
Edit: just to clarify, I am trying to get the overall average for each month, i.e. a January average, a February average, and so on, not an average for each individual date.
Posted on 2012-03-31 05:13:08
You may want to use dictionaries to make the code more readable.
from __future__ import division
import csv
from collections import defaultdict
def default_factory():
    return {"sum": 0, "max": None, "min": None, "count": 0}
reader = csv.DictReader(open(r'sd.csv'))
dates = defaultdict(default_factory)  # keyed by month, despite the name
rows = []
for row in reader:
    date = row["Date"]
    sd = int(row["Snowdepth"])
    rows.append([date, sd])
    # the text before the first "/" is taken as the month
    month = date.split("/")[0]
    r = dates[month]
    r["sum"] += sd
    max = r["max"]
    r["max"] = sd if max is None else sd if sd > max else max
    min = r["min"]
    r["min"] = sd if min is None else sd if sd < min else min
    r["count"] += 1
for date in dates:
    r = dates[date]
    r["avg"] = r["sum"]/r["count"]
# newline='' (Python 3) avoids blank lines between rows on Windows
writer = csv.writer(open(r'TestAvg.csv', 'w', newline=''))
writer.writerow(["Date", "SD", "max", "min", "mean"])
for row in rows:
    # each daily row is written with the stats of its month
    r = dates[row[0].split("/")[0]]
    writer.writerow(row + [r["max"], r["min"], r["avg"]])
Posted on 2012-03-31 04:47:56
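In some places you use Dates as the column name (e.g. max = dates[row["Dates"]][1]) and in others Date (e.g. min = dates[row["Date"]][2]). From your sample data it looks like Date is the column name? So if you use the same name everywhere, this should work.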
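One way to confirm which column names the file really has, before indexing any row, is to look at the reader's fieldnames attribute. The following is only a small sketch on Python 3: the in-memory sample and the required list are made up for illustration, using the Date and SD headers from the question's sample data.

```python
import csv
import io

# Hypothetical in-memory stand-in for one of the csv files.
sample = io.StringIO("Date,SD\n1/1/2000,36\n1/2/2000,36\n")
reader = csv.DictReader(sample)

# fieldnames is taken from the header row of the file.
print(reader.fieldnames)

# Fail early with a clear message if an expected column is absent.
required = ["Date", "SD"]
missing = [name for name in required if name not in reader.fieldnames]
if missing:
    raise KeyError("missing columns: %s" % missing)
```

With this header, a lookup such as row["Dates"] would raise a KeyError, since only Date and SD exist.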
s="""Date,Snowdepth
1/1/2000,36
1/2/2000,36
1/3/2000,38
1/4/2000,40
2/1/2000,48
2/2/2000,48"""
import io
import csv
# Python 3: io.StringIO replaces the Python 2 StringIO module
reader = csv.DictReader(io.StringIO(s))
for row in reader:
    print(row['Date'])
Output:
1/1/2000
1/2/2000
1/3/2000
1/4/2000
2/1/2000
2/2/2000
Posted on 2012-03-31 04:53:36
from __future__ import division
import csv
from collections import defaultdict
def default_factory():
    # [sum, max, min, count]
    return [0, None, None, 0]
reader = csv.DictReader(open(r'snow_data.csv'))
dates = defaultdict(default_factory)
for row in reader:
    amount = int(row["Snowdepth"])
    dates[row["Date"]][0] += amount
    max = dates[row["Date"]][1]
    dates[row["Date"]][1] = amount if max is None else amount if amount > max else max
    min = dates[row["Date"]][2]
    dates[row["Date"]][2] = amount if min is None else amount if amount < min else min
    dates[row["Date"]][3] += 1
for date in dates:
    # replace the count with the mean
    dates[date][3] = dates[date][0]/dates[date][3]
# newline='' (Python 3) avoids blank lines between rows on Windows
writer = csv.writer(open(r'TestAvg.csv', 'w', newline=''))
writer.writerow(["Date", "Snowdepth", "max", "min", "mean"])
writer.writerows([date] + dates[date] for date in dates)
I fixed the code to use Date and Snowdepth everywhere, which is what your sample csv provides. Also, you had a variable amount that should have been sd, otherwise amount was never defined. I used amount everywhere.
It won't give very exciting results unless you have multiple entries for a single date.
For example, here is the output for the sample csv:
Date,Snowdepth,max,min,mean
1/3/2000,38,38,38,38.0
2/2/2000,48,48,48,48.0
2/1/2000,48,48,48,48.0
1/4/2000,40,40,40,40.0
1/1/2000,36,36,36,36.0
1/2/2000,36,36,36,36.0
https://stackoverflow.com/questions/9949974
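For the one-row-per-month summary the question's edit asks for, the grouping key can be the month rather than the full date. This is only a sketch, not any answerer's code. It assumes Python 3, dates in M/D/YYYY form, and the Date and SD headers from the question's sample. The name monthly_stats and the in-memory sample are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def monthly_stats(csv_files):
    """Return {month: (mean, max, min)} for Date,SD rows in the given open files."""
    depths = defaultdict(list)
    for f in csv_files:
        for row in csv.DictReader(f):
            # Assumes M/D/YYYY, so the month is the first "/"-separated field.
            month = int(row["Date"].split("/")[0])
            depths[month].append(int(row["SD"]))
    return {
        month: (sum(values) / len(values), max(values), min(values))
        for month, values in sorted(depths.items())
    }


# Hypothetical in-memory stand-in for one csv file (the question's sample data).
sample = io.StringIO(
    "Date,SD\n1/1/2000,36\n1/2/2000,36\n1/3/2000,38\n"
    "1/4/2000,40\n2/1/2000,48\n2/2/2000,48\n"
)
stats = monthly_stats([sample])

# Write one summary row per month (to a string here, to keep the sketch self-contained).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["month", "mean", "max", "min"])
for month, (mean, high, low) in stats.items():
    writer.writerow([month, mean, high, low])
print(out.getvalue())
```

For real files, each path returned by glob.glob("*.csv") could be opened with open(path, newline="") and passed in the list instead of the sample. Because the key is the month number alone, the same month from different years is pooled. Using (year, month) as the key would keep the years apart.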