此函数基于5个不同的标准(开始日期、结束日期、基金、帐户和分析)对数据框中的行求和:
df = pd.DataFrame(
[
['02-09-2019',20190902, 20.00, 'F1','B1','I2'],
['23-09-2019',20190923, 237.36, 'F1','B1','I1'],
['15-11-2019',20191115, 200.00, 'F1','B1','I1'],
['16-11-2019',20191116, 2045.00, 'F1','B1','I2'],
['05-05-2020',20200505, 205.00, 'F2','B2','I1'],
],
columns= ['Datestr','Datenum','Cost','Fund','Account','Analysis'])
def per_sum(startdate, enddate, fund, account, analysis):
return df[(df.Datenum > startdate) &
(df.Datenum < enddate) &
(df.Fund == fund) &
(df.Account == account) &
(df.Analysis == analysis)
].Cost.sum()
per_sum(20190000,20200000,'F1','B1','I1')如果没有资金、账户或分析数据,我如何调整这个函数,以继续计算总成本。
例如:如果我想要找到总分析'I2‘(在所有基金和账户中)。
这类事情不起作用:
per_sum(20190000,20200000,'','','I2')谢谢
发布于 2020-01-09 23:03:15
Idea由bitwise OR的|链接,用于按空格比较的新约定:
def per_sum(startdate, enddate, fund, account, analysis):
return df[(df.Datenum > startdate) &
(df.Datenum < enddate) &
((df.Fund == fund) | (fund == '')) &
((df.Account == account) | (account == '')) &
((df.Analysis == analysis) | (analysis == ''))
].Cost.sum()
print(per_sum(20190000,20200000,'','',''))
2502.36
print(per_sum(20190000,20200000,'','','I2'))
2065.0编辑:
如果还想使用日期时间进行过滤,一种可能解决方案是为更改开始和结束日期时间添加if-else语句:
def per_sum(startdate, enddate, fund, account, analysis):
startdate = -np.inf if startdate == '' else startdate
enddate = np.inf if enddate == '' else enddate
return df[(df.Datenum > startdate) &
(df.Datenum < enddate) &
((df.Fund == fund) | (fund == '')) &
((df.Account == account) | (account == '')) &
((df.Analysis == analysis) | (analysis == ''))
].Cost.sum()
print(per_sum('','','','',''))
2707.36发布于 2020-01-09 23:17:57
这可能不是很优雅,但却是透明和万无一失的:
def per_sum_2(startdate, enddate, fund = None, account=None, analysis=None):
df2 = df[(df.Datenum > startdate) &
(df.Datenum < enddate) ]
if not fund is None:
df2 = df2[df2.Fund == fund]
if not account is None:
df2 = df2[df2.Account == account]
if not analysis is None:
df2 = df2[df2.Analysis == analysis]
return df2.Cost.sum()
per_sum_2(20190000,20200000,analysis='I2')
2065.0发布于 2020-01-09 23:34:49
这一点:
per_sum(20190000,20200000,'','','I2') 不起作用,因为‘’不是适合该列所有大小写/值的通配符。您可以使用正则表达式来匹配列值的所有值。
您可以更改函数声明以包含列的默认值,因此当您想要忽略某个列时,可以在调用函数时do#t给它一个参数。
def per_sum(startdate, enddate, fund='somepattern', account='otherpattern', analysis):https://stackoverflow.com/questions/59666658
复制相似问题