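I am using pandas and np.where to fill a new column when several conditions are met.
For this I use the following DataFrame (the real one is about 100 times larger).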

This is what I am doing now:
df['new_column'] = np.where((df['year'] == 2018) & (df['price'] > 30000) & (df['fuel description'] == 'Petrol'), 12, 10)
df['new_column'] = np.where((df['year'] == 2019) & (df['price'] > 30000) & (df['fuel description'] == 'Petrol'), 15, 10)
df['new_column'] = np.where((df['year'] == 2020) & (df['price'] > 30000) & (df['fuel description'] == 'Petrol'), 18, 10)
df['new_column'] = np.where((df['year'] == 2021) & (df['price'] > 30000) & (df['fuel description'] == 'Petrol'), 21, 10)
df['new_column'] = np.where((df['year'] == 2022) & (df['price'] > 30000) & (df['fuel description'] == 'Petrol'), 24, 10)
As you can see, the only thing I change is the condition on the 'year' column.
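I am looking for an efficient way to reuse the other two conditions (price and fuel description), because right now I am just copying them.
Looking forward to your answers!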
Answered 2022-11-17 08:06:30
Try this:
condition = (df['year'] >= 2018) & (df['year'] <= 2022) & (df['price'] > 30000) \
    & (df['fuel description'] == 'Petrol')
df['new_column'] = np.where(condition, 12 + (df['year'] - 2018) * 3, 10)
Update
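If the values to fill in have no arithmetic relationship to the year, you can build the array of values beforehand. The condition is still evaluated only once.
Example: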
year_to_value = {2018: 12, 2019: 15, 2020: 18, 2021: 21, 2022: 24}
values = df['year'].map(year_to_value)
df['new_column'] = np.where(condition, values, 10)
Answered 2022-11-17 08:15:23
You can wrap the repeated code in a function to avoid the duplication, like this:
def get_year_condition(df, year):
    # Parentheses are required: & binds more tightly than == and >.
    return (df['year'] == year) & (df['price'] > 30000) & (df['fuel description'] == 'Petrol')
Then use it like this:
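df['new_column'] = 10  # default; keeps later calls from overwriting earlier matches
df['new_column'] = np.where(get_year_condition(df, 2021), 21, df['new_column'])
df['new_column'] = np.where(get_year_condition(df, 2022), 24, df['new_column'])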
...
https://stackoverflow.com/questions/74471962
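As a sketch only: the per-year branches discussed in this thread can also be written in a single pass with np.select, a NumPy function that none of the answers use, so that no assignment overwrites an earlier one. The sample rows below are made up for illustration; the column names and the year-to-value table come from the question, while `base`, `conditions`, `choices` and `mapped` are names chosen here.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; only the column names follow the question.
df = pd.DataFrame({
    "year": [2018, 2019, 2020, 2021, 2022, 2019, 2017],
    "price": [35000, 31000, 29000, 45000, 50000, 40000, 60000],
    "fuel description": ["Petrol", "Petrol", "Petrol", "Petrol",
                         "Petrol", "Diesel", "Petrol"],
})

# Shared part of the condition, evaluated once.
base = (df["price"] > 30000) & (df["fuel description"] == "Petrol")

# Year -> value table taken from the question.
year_to_value = {2018: 12, 2019: 15, 2020: 18, 2021: 21, 2022: 24}

# One boolean mask per year, with the matching value for each mask.
conditions = [base & (df["year"] == year) for year in year_to_value]
choices = list(year_to_value.values())

# Rows that match no mask get the default of 10.
df["new_column"] = np.select(conditions, choices, default=10)

# Cross-check with the mapping approach: years missing from the table
# and rows failing the shared condition become NaN, then fall back to 10.
mapped = df["year"].map(year_to_value).where(base).fillna(10).astype(int)

print(df)
```

Both routes should give the same column on this sample. np.select reads more directly when the per-year conditions may later diverge, whereas the mapping is shorter when only the year decides the value.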