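I have a very large dataset and I am applying multiple filters across many columns. To keep the code readable, I assign the filters to variables. However, I noticed that even after the values in the data have changed, the filter does not seem to take the new values into account.
Here is my data:
import pandas as pd
data = {'id':[12, 84, 156, 228, 300, 372, 444, 516, 588, 660, 732],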
'age':['18-18', '22-22', '35-35', '33-33', '45-45', '40-40', '55-55', '60-60', '47-47', '25-25', '59-59'],
'height':['175-177', '165-167', '175-178', '165-168', '175-179', '165-169', '175-180', '165-170', '175-181', '165-171', '175-182'],
'weight':['65-70', '65-70', '80-85', '75-80', '90-95', '100-105', '80-85', '70-75', '70-75', '85-90', '90-95'],
'education':['10-12', '11-13', '12-14', '13-15', '14-16', '15-17', '16-18', '17-19', '18-20', '19-21', '20-22'],
'employment':['1-4', '8-11', '8-11', '4-7', '5-8', '5-8', '9-12', '15-18', '13-16', '12-15', '12-15'],
'country':['France-EU', 'Austria-EU', 'Netherland-EU', 'Italy-EU', 'Texas-US', 'California-US', 'Washington-US', 'Poland-EU', 'Spain-EU', 'Greece-EU', 'New York-US'],
'city':['Paris-FR', 'Vienna-AUS', 'Amsterdam-NL', 'Rome-ITA', 'Austin-TX', 'LA-CAL', 'Olympia-WAS', 'Warsaw-PL', 'Madrid-SPA', 'Athens-GR', 'Albany-NY']}
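df = pd.DataFrame(data)
I want to apply this filter:
df['weight'] = df['weight'].astype(str)
filter1 = (df['weight'].str.slice(stop=2)=='65') & (df['country'].str.slice(stop=2)=='Au')
Initially, the filter gives me what I want:
df.loc[filter1]
Later, I change the filtered rows as follows:
df.loc[filter1,'weight'] = '100'
When I use the filter again, I expect no result. Instead, it returns the same row, even though the filter should now evaluate to False for it.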
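Answered on 2022-10-28 17:22:34
filter1 does not magically update to match values that are set after it was created. It is a boolean Series computed once, from the values df held at that moment. Build it again after your change and you will see it works as expected: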
def get_filter1(df):
    # Rebuild the boolean mask from the current contents of df on every call.
    return df['weight'].str[:2].eq('65') & df['country'].str[:2].eq('Au')
print(df.loc[get_filter1(df)])
df.loc[get_filter1(df), 'weight'] = '100'
print(df.loc[get_filter1(df)])
Output:
   id    age   height  weight  education  employment     country        city
1  84  22-22  165-167   65-70      11-13        8-11  Austria-EU  Vienna-AUS
Empty DataFrame
Columns: [id, age, height, weight, education, employment, country, city]
Index: []
https://stackoverflow.com/questions/74238774
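To make the "computed once" behaviour concrete, here is a minimal sketch on a reduced, made-up frame (three rows, two columns borrowed from the question). The helper name `weight_country_mask` and the variables `stale_mask` and `fresh_mask` are hypothetical and chosen only for illustration. It compares a mask stored before the assignment with one rebuilt afterwards, and also shows that `.loc` accepts a callable, which is evaluated against the frame at the time of the call.

```python
import pandas as pd

# Small hypothetical frame, reduced from the question's data to two columns.
df = pd.DataFrame({
    "weight": ["65-70", "65-70", "80-85"],
    "country": ["France-EU", "Austria-EU", "Netherland-EU"],
})

def weight_country_mask(frame):
    # Hypothetical helper: builds the boolean mask from the frame's current values.
    return frame["weight"].str[:2].eq("65") & frame["country"].str[:2].eq("Au")

stale_mask = weight_country_mask(df)   # evaluated once, before the change
df.loc[stale_mask, "weight"] = "100"   # modify the matching row
fresh_mask = weight_country_mask(df)   # evaluated again, after the change

print(stale_mask.tolist())  # the stored Series still marks row 1
print(fresh_mask.tolist())  # the rebuilt mask matches no row

# Passing the function itself to .loc defers evaluation until the lookup happens.
print(df.loc[weight_country_mask].empty)
```

Storing the function rather than its result keeps the readability benefit of a named filter while making sure every lookup sees the current data. The cost is that the mask is recomputed on each use, which may matter on a very large frame.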