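I have this function that I am trying to apply to a dask dataframe. It calculates cooling under an assumed storage capacity and storage rate limit. It takes a building's cooling values at 15-minute timesteps and returns the amount that a given storage rate can accommodate.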
def cooling_kwh_by_case(row, storage_capacity, storage_rate):
    # Case 1: the day's cooling fits within both the capacity and the rate limit.
    if ((row['daily_cooling_kwh'] <= storage_capacity/row['cop']) & (row['max_cooling_kw'] <= storage_rate/row['cop'])):
        return row['daily_cooling_kwh']
    # Case 2: capacity is sufficient, but the rate limit caps each 15-minute value.
    elif ((row['daily_cooling_kwh'] <= storage_capacity/row['cop']) & (row['max_cooling_kw'] > storage_rate/row['cop'])):
        daily_groupby = net_load_w_times.groupby('index')['electricity_cooling_kwh'].apply(lambda x: sum(min(x,storage_rate/(4*row['cop']))))
        return daily_groupby.loc[(row.building_date)]
    # Case 3: the day's cooling exceeds the capacity; sum the largest values until the capacity is exceeded.
    else:
        n_largest = 1
        daily_groupby = net_load_w_times.groupby('index')['electricity_cooling_kwh'].apply(lambda x: x.nlargest(n_largest).sum())
        while ((daily_groupby.loc[(row.building_date)]) <= (storage_capacity/row['cop'])) & (n_largest < net_load_w_times.groupby('index')['electricity_cooling_kwh'].count()):
            n_largest += 1
            daily_groupby = net_load_w_times.groupby('index')['electricity_cooling_kwh'].apply(lambda x: x.nlargest(n_largest).sum())
        return min(storage_capacity/row['cop'],net_load_w_times.groupby('index')['electricity_cooling_kwh'].apply(lambda x: x.nlargest(n_largest-1).sum()).loc[(row.building_date)])

When I apply it, this is the error message I get:
<ipython-input-22-88e243d194c6> in cooling_kwh_by_case()
     16         n_largest = 1
     17         daily_groupby = net_load_w_times.groupby('index')['electricity_cooling_kwh'].apply(lambda x: x.nlargest(n_largest).sum())
---> 18         while ((daily_groupby.loc[(row.building_date)]) <= (storage_capacity/row['cop'])) & (n_largest < net_load_w_times.groupby('index')['electricity_cooling_kwh'].count()):
     19             n_largest += 1
     20             daily_groupby = net_load_w_times.groupby('index')['electricity_cooling_kwh'].apply(lambda x: x.nlargest(n_largest).sum())

ValueError: Not all divisions are known, can't align partitions. Please use `set_index` to set the index.

I think the problem is in the way I am trying to calculate the value in the else branch, which covers the case where the cooling kWh is greater than the storage_capacity parameter. To calculate this value, I apply a function that finds the point at which the sum of the day's largest 15-minute cooling kWh values exceeds storage_capacity. I then return the sum of those largest values.
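The logic described for the else branch can be sketched in plain Python, independent of dask. This is only an illustration of the idea as described above, not the code from the question: the function name `largest_values_within_capacity` and the sample values are hypothetical, and the COP division is left out for brevity.

```python
def largest_values_within_capacity(values, capacity):
    """Add up values from largest to smallest, stopping before the
    running total would exceed the capacity."""
    total = 0.0
    for value in sorted(values, reverse=True):
        if total + value > capacity:
            break  # adding this value would overshoot the capacity
        total += value
    return total


# Hypothetical 15-minute cooling kWh values for one building-day.
day_values = [0.5, 3.0, 2.0, 1.0, 4.0]
result = largest_values_within_capacity(day_values, 8.0)
print(result)
```

With these sample values the two largest entries (4.0 and 3.0) fit under a capacity of 8.0, and the third (2.0) would exceed it, so the loop stops there.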
The dataframe that I am trying to group by inside the function to look up the values is called net_load_w_times:
                            time  electricity_cooling_kwh  \
building_id
2            2016-07-05 19:00:00                 0.050000
2            2016-07-05 22:00:00                 3.200000
2            2016-07-05 16:00:00                 5.779318
2            2016-07-05 20:00:00                 1.888300
2            2016-07-05 18:00:00                 7.490000

             electricity_heating_kwh  total_site_electricity_kwh iso_zone  \
building_id
2                           0.000000                   19.529506   MISO-E
2                           0.045235                    6.310719   MISO-E
2                           0.000000                   22.514705   MISO-E
2                           0.018624                   13.474863   MISO-E
2                           0.005464                   18.192927   MISO-E

                    index        date
building_id
2            2|2016-10-24  2016-10-24
2            2|2016-03-05  2016-03-05
2            2|2016-08-14  2016-08-14
2            2|2016-03-05  2016-03-05
2            2|2016-03-05  2016-03-05

Desired output:
Given cooling_kwh_by_case(row, 8, 5), it would output:
7.717618, because that is the maximum cooling kWh that can be summed up to 8.
Posted on 2020-08-08 09:10:13
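Dask dataframes are lazy, and they do not work inside control flow such as if-else statements or for loops. I suggest trying to find a solution within the pandas API, for example the where method.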
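As a rough illustration of that suggestion, here is a minimal pandas sketch that replaces the first two branches of the question's if/elif with column-wise operations and `Series.where`. The sample data, the constant `cop`, and the variable names are assumptions made for the example. It compares 15-minute kWh values against a per-interval limit instead of using the question's `max_cooling_kw` column, and it leaves out the third (capacity-exceeded) case. Whether the same calls carry over unchanged to a dask dataframe would need to be checked.

```python
import pandas as pd

# Hypothetical 15-minute cooling values for two building-days.
intervals = pd.DataFrame({
    "index": ["2|2016-07-05"] * 3 + ["2|2016-07-06"] * 3,
    "electricity_cooling_kwh": [0.5, 1.0, 0.25, 2.0, 0.5, 1.5],
})
storage_rate = 4.0  # assumed rate limit in kW
cop = 1.0           # assumed constant coefficient of performance

# Energy the storage can absorb in one 15-minute interval (kWh).
per_interval_limit = storage_rate / (4 * cop)

grouped = intervals.groupby("index")["electricity_cooling_kwh"]
daily_total = grouped.sum()   # total cooling per building-day
max_interval = grouped.max()  # largest 15-minute value per building-day

# Rate-limited total: cap each interval at the limit, then sum per day.
rate_limited = (
    intervals["electricity_cooling_kwh"]
    .clip(upper=per_interval_limit)
    .groupby(intervals["index"])
    .sum()
)

# Keep the daily total where the rate limit is never hit,
# otherwise fall back to the rate-limited total.
result = daily_total.where(max_interval <= per_interval_limit, rate_limited)
print(result)
```

The point of the sketch is that the branch is chosen for every building-day at once by a boolean mask, so no Python-level `if` ever has to evaluate a lazy value.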
https://stackoverflow.com/questions/63224119