
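Stkcd is the stock id, Trdmnt is the record month, and Mretwd is the increase in that stock's amount. I have an Excel sheet of stock data, as shown in the picture above, and I want to analyze it by quarter (2015-1, 2015-2 and 2015-3 form one quarter). If the Trdmnt values for a stock do not cover a complete quarter, I want to delete the rows of that incomplete quarter. For example:
000001 2011-04 0.131841
has no
000001 2011-05 xx
000001 2011-06 xx
so we need to delete
000001 2011-04 0.131841
and
000002 2014-12 0.275229
000002 2014-11 0.159574
have no
000002 2014-10 xx
so we need to delete
000002 2014-12 0.275229
000002 2014-11 0.159574
The result is as follows: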

Posted on 2015-12-23 03:05:18
I don't think this is difficult, but I can't use VBA, so I solved it in another language. I hope this code helps you solve the problem.
# -*- coding: utf-8 -*-
import xlrd

if __name__ == "__main__":
    book = xlrd.open_workbook('sample.xls')
    sheet = book.sheet_by_index(0)

    # Nested lookup: vdict[stock_id][year][month] = value
    vdict = dict()

    # Skip the first three rows (assumed to be header rows in sample.xls).
    for row in range(3, sheet.nrows):
        stock_id = sheet.cell(row, 0).value
        date = sheet.cell(row, 1).value      # text such as '2011-04'
        year = int(date[0:4])
        month = int(date[5:])
        value = sheet.cell(row, 2).value
        if stock_id not in vdict:
            vdict[stock_id] = dict()
        if year not in vdict[stock_id]:
            vdict[stock_id][year] = dict()
        vdict[stock_id][year][month] = value

    # Delete every quarter that does not contain all three of its months.
    for stock_id in sorted(vdict):
        for year in sorted(vdict[stock_id]):
            months = list(vdict[stock_id][year])
            for quarter in range(4):
                # Months 1-3 map to quarter 0, 4-6 to 1, 7-9 to 2, 10-12 to 3.
                in_quarter = [m for m in months if (m - 1) // 3 == quarter]
                if len(in_quarter) != 3:
                    for m in in_quarter:
                        del vdict[stock_id][year][m]

    # Print the remaining rows, sorted by stock id, year and month.
    for stock_id in sorted(vdict):
        for year in sorted(vdict[stock_id]):
            for month in sorted(vdict[stock_id][year]):
                value = vdict[stock_id][year][month]
                print('%s, %d-%02d, %s' % (stock_id, year, month, value))

Posted on 2015-12-23 13:22:11
Here is a link to a tool that takes an .xls file and outputs a .csv file, so it can solve the problem automatically! http://naoya.tuntunkun.org/
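For anyone who wants to check the quarter rule from the question without an Excel file, here is a minimal sketch in plain Python. It assumes the rows are already loaded as (Stkcd, Trdmnt, Mretwd) tuples with Trdmnt as a 'YYYY-MM' string. The names quarter_of and keep_complete_quarters are hypothetical, and the 2011-07 to 2011-09 sample values are made up for illustration; only the 2011-04, 2014-11 and 2014-12 rows come from the question.

```python
from collections import defaultdict


def quarter_of(month):
    """Map a month number (1-12) to its quarter (1-4)."""
    return (month - 1) // 3 + 1


def keep_complete_quarters(rows):
    """Return only the rows whose (stock, year, quarter) has all 3 months.

    rows: iterable of (stkcd, trdmnt, mretwd), trdmnt formatted 'YYYY-MM'.
    """
    rows = list(rows)

    # First pass: collect the distinct months seen per stock/year/quarter.
    months_seen = defaultdict(set)
    for stkcd, trdmnt, _ in rows:
        year, month = int(trdmnt[:4]), int(trdmnt[5:7])
        months_seen[(stkcd, year, quarter_of(month))].add(month)

    # Second pass: keep a row only if its quarter is complete.
    kept = []
    for stkcd, trdmnt, mretwd in rows:
        year, month = int(trdmnt[:4]), int(trdmnt[5:7])
        if len(months_seen[(stkcd, year, quarter_of(month))]) == 3:
            kept.append((stkcd, trdmnt, mretwd))
    return kept


sample = [
    ("000001", "2011-04", 0.131841),  # from the question; 05 and 06 missing
    ("000001", "2011-07", 0.01),      # made-up value, complete quarter
    ("000001", "2011-08", 0.02),      # made-up value
    ("000001", "2011-09", 0.03),      # made-up value
    ("000002", "2014-11", 0.159574),  # from the question; 10 missing
    ("000002", "2014-12", 0.275229),  # from the question
]

result = keep_complete_quarters(sample)
for stkcd, trdmnt, mretwd in result:
    print("%s, %s, %s" % (stkcd, trdmnt, mretwd))
```

Grouping by (stock, year, quarter) in a first pass keeps the original row order intact, so the filtered rows can be written straight back out in the same order they were read.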
https://stackoverflow.com/questions/34391439