I have a DataFrame containing sales records by zone, and I need to cluster the rows based on average sales.
Zone Consumption
North 1
South 3
East 10
North 8
North2 0
South 5

I used the code below:
def Clustering(row):
    if row['Consumption']<.5*np.mean(['Consumption']):
        val='E'
    elif row['Consumption']<.75*np.mean(['Consumption']):
        val='D'
    elif row['Consumption']<1*np.mean(['Consumption']):
        val='C'
    elif row['Consumption']<1.5*np.mean(['Consumption']):
        val='B'
    elif row['Consumption']<2.5*np.mean(['Consumption']):
        val='A'
    else:
        val='Z'
    return val

Traceback:
<ipython-input-21-f08d8263edc0> in Clustering(row)
1 def Clustering(row):
----> 2 if row['Consumption']<.5*np.mean(['Consumption']):
3 val='E'
4 elif row['Consumption']<.75*np.mean(['Consumption']):
5 val='D'
<__array_function__ internals> in mean(*args, **kwargs)
~\anaconda3\lib\site-packages\numpy\core\fromnumeric.py in mean(a, axis, dtype, out, keepdims)
3333
3334 return _methods._mean(a, axis=axis, dtype=dtype,
-> 3335 out=out, **kwargs)
3336
3337
~\anaconda3\lib\site-packages\numpy\core\_methods.py in _mean(a, axis, dtype, out, keepdims)
149 is_float16_result = True
150
--> 151 ret = umr_sum(arr, axis, dtype, out, keepdims)
152 if isinstance(ret, mu.ndarray):
153 ret = um.true_divide(
TypeError: cannot perform reduce with flexible type

My assumption was that the error comes from the Sales column containing some string values, but that is not the case. How should I fix this?
Answered on 2020-08-19 23:03:35
Have you tried pd.cut? This assumes df['Consumption'].mean() > 0, so that the bin edges are strictly increasing.
import numpy as np
import pandas as pd

# Define the bin edges as multiples of the mean, bounded by -inf and +inf
# (-np.inf is used because np.NINF was removed in NumPy 2.0)
bins = np.array([.5, .75, 1, 1.5, 2.5]) * df['Consumption'].mean()
bins = np.hstack((-np.inf, bins, np.inf))
df['Cluster'] = pd.cut(df['Consumption'], bins, labels=list('EDCBAZ')).astype('str')

https://stackoverflow.com/questions/63488826
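For completeness, the traceback points at np.mean(['Consumption']). That expression takes the mean of a Python list holding the string 'Consumption', not of the DataFrame column, so the TypeError is raised regardless of what the column contains. Below is a minimal sketch of a row-wise fix, next to a pd.cut version that reproduces the same strict "<" boundaries by passing right=False. The sample values and the names clustering, mean_consumption, Cluster_apply and Cluster_cut are illustrative assumptions, not part of the original post.

```python
import numpy as np
import pandas as pd

# Illustrative sample data modeled on the question (values are assumptions).
df = pd.DataFrame({
    "Zone": ["North", "South", "East", "North", "South"],
    "Consumption": [1, 3, 10, 8, 5],
})

# Compute the mean once, from the actual column rather than from a list of strings.
mean_consumption = df["Consumption"].mean()

def clustering(row, mean):
    """Label a row by comparing its Consumption with multiples of the column mean."""
    value = row["Consumption"]
    if value < 0.5 * mean:
        return "E"
    elif value < 0.75 * mean:
        return "D"
    elif value < 1.0 * mean:
        return "C"
    elif value < 1.5 * mean:
        return "B"
    elif value < 2.5 * mean:
        return "A"
    return "Z"

# Row-wise version: the precomputed mean is passed in through args.
df["Cluster_apply"] = df.apply(clustering, axis=1, args=(mean_consumption,))

# Vectorized version: right=False makes each bin [low, high),
# which matches the strict "<" comparisons in the function above.
edges = np.array([0.5, 0.75, 1.0, 1.5, 2.5]) * mean_consumption
bins = np.concatenate(([-np.inf], edges, [np.inf]))
df["Cluster_cut"] = pd.cut(
    df["Consumption"], bins, labels=list("EDCBAZ"), right=False
).astype(str)

print(df)
```

With the default right=True, pd.cut uses bins of the form (low, high], so a value exactly equal to a threshold would fall into the lower cluster instead. The two versions differ only at those exact boundaries.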