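I'm trying to create a bar chart in Python from the output of Pandas value_counts. For background, the data are temperature measurements from a forest fires dataset. Below is the code I use to get bin_counts (a Series object):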

import pandas as pd

def discretizeData(cur_dataset, col_name, num_bins, bin_opts):
    '''
    cur_dataset: dataset containing values to be discretized
    col_name: column name for discretization
    num_bins: number of bins
    bin_opts: type of discretization (either 'equal-width' or 'equal-frequency')
    '''
    # Select specific data column for binning
    data_to_bin = cur_dataset[col_name]
    # Choose between equal width or equal frequency
    if bin_opts == 'equal-frequency':
        # If "equal-frequency", we want to use the pandas qcut option for binning
        binned_data = pd.qcut(data_to_bin, q=num_bins)
    else:
        # Use equal-width instead
        binned_data = pd.cut(data_to_bin, bins=num_bins)
    # Get the bin counts to return; this will be more useful
    bin_counts = binned_data.value_counts().sort_index()
    return bin_counts

Below is an example of running the function with the equal-frequency binning option:

fires = dataset_dict['forestfires']
col_name = 'temp'
num_bins = 5
bin_opts = 'equal-frequency'

The output is a Pandas object whose index consists of Interval objects, with the count for each interval as the values. It looks like this:

Equal Frequency Binning Example:
(2.1990000000000003, 14.42] 104
(14.42, 17.9] 104
(17.9, 20.6] 106
(20.6, 23.58] 99
(23.58, 33.3] 104
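Name: temp, dtype: int64

I also tried converting this to a Pandas DataFrame with bin_width as one column and count as a second column, but I can't find any plotting library that can handle Interval objects. That is what I have tried so far. Any suggestions?
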
Posted on 2022-09-22 15:58:37
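You need to draw the bars manually, starting each bar at its interval's left edge and giving it a width equal to the interval's length:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'value': np.random.normal(10, size=1000)})
counts = discretizeData(df, 'value', 10, 'equal-frequency')
plt.bar([x.left for x in counts.index], counts, width=[x.right - x.left for x in counts.index], align='edge')

Output: [figure: bar chart with one bar per interval]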
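
To make the manual approach easier to reuse, here is a minimal sketch that separates the bar geometry from the drawing. The helper names interval_bar_geometry and plot_interval_counts are hypothetical (they come from neither the question nor any library), and the data is a synthetic stand-in rather than the forest fires dataset. The plotting helper assumes the caller passes in a matplotlib Axes.

```python
import numpy as np
import pandas as pd


def interval_bar_geometry(counts):
    """Return (left edges, widths, labels) for a Series indexed by pd.Interval."""
    intervals = list(counts.index)  # iterating the index yields pd.Interval objects
    lefts = [iv.left for iv in intervals]
    widths = [iv.right - iv.left for iv in intervals]
    labels = [str(iv) for iv in intervals]
    return lefts, widths, labels


def plot_interval_counts(counts, ax):
    """Draw one bar per interval on a matplotlib Axes supplied by the caller."""
    lefts, widths, _ = interval_bar_geometry(counts)
    # align="edge" makes each bar start at its interval's left edge
    # instead of being centered on it
    ax.bar(lefts, counts.to_numpy(), width=widths, align="edge", edgecolor="black")
    return ax


# Synthetic stand-in data (an assumption, not the forest fires dataset)
rng = np.random.default_rng(0)
temps = pd.Series(rng.normal(20, 5, size=500), name="temp")
counts = pd.qcut(temps, q=5).value_counts().sort_index()
lefts, widths, labels = interval_bar_geometry(counts)
```

With matplotlib installed, this could be used as fig, ax = plt.subplots(); plot_interval_counts(counts, ax); plt.show(). If equally wide bars labelled with the interval text are preferred, passing labels as the x values to ax.bar should also work, since matplotlib treats strings as categories.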

https://stackoverflow.com/questions/73817401