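I'm using Pandas to analyze a dataset that includes a column named "Age on Intake" (float values). I've been trying to use a function I wrote to group the data into a smaller number of age buckets. However, I keep getting the error "'<=' not supported between instances of 'str' and 'int'". How do I fix this?
My function: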
def convert_age(num):
    if num <= 7:
        return "0-7 days"
    elif num <= 21:
        return "1-3 weeks"
    elif num <= 42:
        return "3-6 weeks"
    elif num <= 84:
        return "7-12 weeks"
    elif num <= 168:
        return "12 weeks - 6 months"
    elif num <= 365:
        return "6-12 months"
    elif num <= 730:
        return "1-2 years"
    elif num <= 1095:
        return "2-3 years"
    else:
        return "3+ years"
df['Age on Intake'] = df['Age on Intake'].apply(convert_age)
The df 'Age on Intake' column contains floats:
0 95.0
1 1096.0
2 111.0
3 111.0
4 397.0
...
21474 NaN
21475 NaN
21476 365.0
21477 699.0
21478 61.0
Name: Age on Intake, Length: 21479, dtype: float64
The error message I get:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-31-ca12621d6b19> in <module>
22 return "3+ years"
23
---> 24 df['Age on Intake'] = df['Age on Intake'].apply(convert_age)
25
26
/opt/anaconda3/lib/python3.8/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
4198 else:
4199 values = self.astype(object)._values
-> 4200 mapped = lib.map_infer(values, f, convert=convert_dtype)
4201
4202 if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-31-ca12621d6b19> in convert_age(num)
3 def convert_age(num):
4
----> 5 if num <=7:
6 return "0-7 days"
7 elif num <= 21:
TypeError: '<=' not supported between instances of 'str' and 'int'
Answered 2021-07-23 14:12:01
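It looks like the num argument of convert_age is receiving a string rather than an int value. Try converting it to int, or ignore incoming values that are not ints. Consider adding something like this: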
try:
    converted_num = int(num)  # Try to convert to int.
    # your code goes here
except ValueError:
    # We got an error while converting.
    pass
Answered 2021-07-23 14:20:37
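As a fuller sketch of the try/except suggestion from the earlier answer: the version below converts with float() rather than int(), because int("95.0") raises ValueError while float("95.0") does not. Returning None for missing or non-numeric values is an assumption made here, not something the post specifies (the original function would label NaN as "3+ years"). The `buckets` list is just a compact rewrite of the if/elif chain from the question.

```python
import math


def convert_age(num):
    """Bucket an age in days; return None when the value is not a usable number."""
    try:
        # float() accepts 95.0, "95.0" and "95"; int("95.0") would raise ValueError
        days = float(num)
    except (TypeError, ValueError):
        # e.g. None, or a string that is already a label such as "3+ years"
        return None
    if math.isnan(days):
        return None  # assumption: treat missing ages as "no bucket"

    # (upper bound in days, label) pairs, checked in ascending order
    buckets = [
        (7, "0-7 days"),
        (21, "1-3 weeks"),
        (42, "3-6 weeks"),
        (84, "7-12 weeks"),
        (168, "12 weeks - 6 months"),
        (365, "6-12 months"),
        (730, "1-2 years"),
        (1095, "2-3 years"),
    ]
    for upper, label in buckets:
        if days <= upper:
            return label
    return "3+ years"
```

It could then be applied as df['Age bucket'] = df['Age on Intake'].apply(convert_age), where 'Age bucket' is a hypothetical new column name. Writing to a new column keeps the numeric data intact, so rerunning the cell does not pass the string labels back into the function; whether that is what happened in the question is not confirmed by the post.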
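This doesn't directly answer the question (which really is a type problem that can be solved with astype), but you could advantageously replace the unwieldy function with pandas.cut: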
import numpy as np
import pandas as pd
# Lower edge of each bin (in days) mapped to its label
ages = {0: '0-7 days',
        7: '1-3 weeks',
        21: '3-6 weeks',
        42: '7-12 weeks',
        84: '12 weeks - 6 months',
        168: '6-12 months',
        365: '1-2 years',
        730: '2-3 years',
        1095: '3+ years'}
pd.cut(df['Age on Intake'].astype(float),
       bins=list(ages.keys())+[np.inf],
       labels=list(ages.values()),
       )
https://stackoverflow.com/questions/68500403
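A self-contained sketch of the pandas.cut approach from the answer above, assuming the column arrived as strings. The sample Series is hypothetical and stands in for df['Age on Intake']. pd.to_numeric with errors="coerce" is used here in place of astype(float) so that unparseable entries become NaN instead of raising; that swap, and include_lowest=True, are choices made for this sketch rather than part of the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for df['Age on Intake'], stored as strings
ages_in_days = pd.Series(["95.0", "1096.0", "7", None, "365.0"])

# errors="coerce" turns anything that cannot be parsed into NaN instead of raising
numeric = pd.to_numeric(ages_in_days, errors="coerce")

edges = [0, 7, 21, 42, 84, 168, 365, 730, 1095, np.inf]
labels = ["0-7 days", "1-3 weeks", "3-6 weeks", "7-12 weeks",
          "12 weeks - 6 months", "6-12 months", "1-2 years",
          "2-3 years", "3+ years"]

# Bins are right-closed by default: (0, 7], (7, 21], ... which matches the
# `<=` tests in convert_age. include_lowest=True also puts 0 in the first bin.
buckets = pd.cut(numeric, bins=edges, labels=labels, include_lowest=True)
print(buckets.tolist())
```

Missing ages stay NaN in the result rather than falling through to "3+ years", which is what the else branch of the original function would have produced for them.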