I have created a set of conditions that returns 1 or 0 depending on whether any of the conditions is met.
avg_ActivityScore = company['ActivityScore'].median()
min_EmployeeLowerBound = 10
list_LegalFormIDs = [112, 121, 301, 118, 141, 703, 111, 705, 921, 117, 361, 391, 711]
min_CompaniesCount = 10
def flag_company(df):
    if (df['ActivityScore'] >= avg_ActivityScore):
        return 1
    elif (df['EmployeeLowerBound'] >= min_EmployeeLowerBound):
        return 1
    elif (df['LegalFormID'].isin(list_LegalFormIDs)):
        return 1
    else:
        return 0

Then I apply this function to the DataFrame like this:
df['Flag'] = df.apply(flag_company, axis = 1)

However, it returns the error message: 'int' object has no attribute 'isin'. Any ideas what I could change to keep the functionality?
If I use the code below instead, it works without any problems:
df.loc[df['LegalFormID'].isin(list_LegalFormIDs)]

Thank you very much!
Answered 2022-01-18 13:01:01
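DataFrame.apply with axis=1 passes one row at a time to the function, so df['LegalFormID'] inside the function is a scalar, not a Series. Series methods such as isin are therefore not available there; test membership with the in operator instead: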
def flag_company(df):
    # debug output: each value printed here is a scalar, not a Series
    print (df['ActivityScore'])
    if (df['ActivityScore'] >= avg_ActivityScore):
        return 1
    elif (df['EmployeeLowerBound'] >= min_EmployeeLowerBound):
        return 1
    #check scalar by in
    elif (df['LegalFormID'] in list_LegalFormIDs):
        return 1
    else:
        return 0

A vectorized solution that works on whole Series is:
m1 = df['ActivityScore'] >= avg_ActivityScore
m2 = df['EmployeeLowerBound'] >= min_EmployeeLowerBound
m3 = df['LegalFormID'].isin(list_LegalFormIDs)
df['Flag'] = (m1 | m2 | m3).astype(int)

https://stackoverflow.com/questions/70755998
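To check that the row-wise fix and the vectorized masks from the answer agree, here is a minimal sketch on made-up data. The column names follow the question, but the six sample rows, the shortened ID list, and the use of df instead of company for the median are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data: column names come from the question,
# the values are invented for this illustration.
df = pd.DataFrame({
    "ActivityScore": [0.9, 0.8, 0.7, 0.1, 0.2, 0.3],
    "EmployeeLowerBound": [1, 1, 1, 50, 1, 1],
    "LegalFormID": [999, 999, 999, 999, 112, 999],
})

avg_ActivityScore = df["ActivityScore"].median()  # 0.5 for this sample
min_EmployeeLowerBound = 10
list_LegalFormIDs = [112, 121, 301]  # shortened list, for illustration


def flag_company(row):
    # With axis=1, `row` is a Series holding one row, so each
    # row[...] lookup yields a single value rather than a Series.
    if row["ActivityScore"] >= avg_ActivityScore:
        return 1
    elif row["EmployeeLowerBound"] >= min_EmployeeLowerBound:
        return 1
    elif row["LegalFormID"] in list_LegalFormIDs:
        return 1
    return 0


# Row-wise version: calls flag_company once per row.
flag_apply = df.apply(flag_company, axis=1)

# Vectorized version: three boolean masks combined with element-wise OR.
m1 = df["ActivityScore"] >= avg_ActivityScore
m2 = df["EmployeeLowerBound"] >= min_EmployeeLowerBound
m3 = df["LegalFormID"].isin(list_LegalFormIDs)
flag_vectorized = (m1 | m2 | m3).astype(int)

print(flag_apply.tolist())
print(flag_vectorized.tolist())
```

Both versions should flag the same rows. The vectorized form avoids calling a Python function per row, which usually makes it the faster choice on large frames.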