I have a DataFrame with 70 columns. I am trying to compute quantiles along axis=1 using df.quantile().
> print(df.head(4))
WS_653 WS_654 WS_655 WS_658 \
ts
2020-11-01 01:00:00 12.3708 11.7133 12.2125 12.3325
2020-11-01 01:10:00 12.6442 12.1883 12.5625 12.3233
2020-11-01 01:20:00 12.8042 11.7109 11.8765 12.1134
2020-11-01 01:30:00 12.3176 10.6824 11.8361 11.5672
WS_656 WS_657 WS_664 WS_659 \
ts
2020-11-01 01:00:00 12.0217 11.6233 12.6108 12.2458
2020-11-01 01:10:00 13.0342 12.5917 12.5225 11.7658
2020-11-01 01:20:00 11.6042 10.6496 11.8874 12.3613
2020-11-01 01:30:00 11.3118 9.98403 10.6 10.5992
WS_663 WS_666 ... WS_715 \
ts ...
2020-11-01 01:00:00 15.3058 15.1433 ... 12.9008
2020-11-01 01:10:00 15.3283 15.0625 ... 12.6042
2020-11-01 01:20:00 15.3765 15.058 ... 11.7462
2020-11-01 01:30:00 14.7689 14.4992 ... 11.0294
[4 rows x 70 columns]
> q10 = df.quantile(0.1, axis = 1)
> print(q10)
ts
2020-11-01 01:00:00 NaN
2020-11-01 01:10:00 NaN
2020-11-01 01:20:00 NaN
2020-11-01 01:30:00 NaN
2020-11-01 01:40:00 NaN
..
2020-12-01 00:00:00 NaN
2020-12-01 00:10:00 NaN
2020-12-01 00:20:00 NaN
2020-12-01 00:30:00 NaN
2020-12-01 00:40:00 NaN
Name: 0.1, Length: 4319, dtype: float64
However, if I loop over the rows as follows:
> q10 = list()
> for k in range(len(df)):
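>     q10.append(df.iloc[k,:].quantile(0.1))
> print(q10)
it prints a list of length len(df) containing the correct quantile for each row. So I would like to understand why the operation works row by row on the same df but not on the whole DataFrame.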
Posted on 2020-12-26 04:04:52
Your columns are not of float dtype.
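One way to confirm this is to inspect the column dtypes. The following is a minimal sketch, assuming a small hypothetical frame in which the numbers are stored as strings; the values are illustrative and are not the asker's data.

```python
import pandas as pd

# Hypothetical frame: the numbers are stored as strings,
# so pandas does not treat these columns as float64.
df = pd.DataFrame(
    {"WS_653": ["12.3708", "12.6442"], "WS_654": ["11.7133", "12.1883"]},
    index=pd.to_datetime(["2020-11-01 01:00:00", "2020-11-01 01:10:00"]),
)

# Show the dtype of every column.
print(df.dtypes)

# List the columns that are not numeric.
non_numeric = df.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)
```

If the list of non-numeric columns is not empty, those columns are the likely reason the DataFrame-level quantile does not return the expected values.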
You can select only the columns whose dtype is 'float64':
cols = [col for col in df.columns if df[col].dtype == 'float64']
df[cols].astype(float).quantile(0.1, axis = 1)
Sample output (for the second set of 4 rows in the question):
ts
2020-11-01 01:00:00 11.74282
2020-11-01 01:10:00 11.99281
2020-11-01 01:20:00 10.93598
2020-11-01 01:30:00 10.168581
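Name: 0.1, dtype: float64
Alternatively, you can convert the object columns (dtype 'O') to float with pd.to_numeric(). This gives different results, because every column is forced to float and any value that cannot be parsed as a number becomes NaN: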
cols = [col for col in df.columns if df[col].dtype == 'O']
for col in cols:
    df[col] = pd.to_numeric(df[col], errors='coerce')
df.quantile(0.1, axis = 1)
https://stackoverflow.com/questions/65453348
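The conversion approach above can be checked end to end on a small example. This is only a sketch under stated assumptions: the frame and its "n/a" entry are hypothetical, and df.apply(pd.to_numeric, errors="coerce") is used here as a shorter alternative to the explicit loop over columns, not as the answer's original code.

```python
import numpy as np
import pandas as pd

# Hypothetical data: numeric strings with one unparseable entry ("n/a").
df = pd.DataFrame(
    {
        "WS_653": ["12.0", "13.0"],
        "WS_654": ["11.0", "n/a"],
        "WS_655": ["10.0", "12.0"],
    }
)

# Convert every column; values that cannot be parsed become NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Quantile across the columns of each row.
q10 = numeric.quantile(0.1, axis=1)

# Row-by-row computation, mirroring the loop in the question.
q10_loop = [numeric.iloc[k, :].quantile(0.1) for k in range(len(numeric))]

print(q10)
print(q10_loop)
```

After the conversion, the DataFrame-level call and the row-by-row loop should agree, with the coerced NaN skipped in both cases.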