I want to plot a histogram with Matplotlib, but I want the bin values to represent the percentage of total observations. An MWE would look like this:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import matplotlib.ticker as tck
import seaborn as sns
import numpy
sns.set(style='dark')
imagen2 = plt.figure(1, figsize=(5, 2))
imagen2.suptitle('StackOverflow Matplotlib histogram demo')
luminance = numpy.random.randn(1000, 1000)
# "Luminance" should range from 0.0...1.0 so we normalize it
luminance = (luminance - luminance.min())/(luminance.max() - luminance.min())
top_left = plt.subplot(121)
top_left.imshow(luminance)
bottom_left = plt.subplot(122)
sns.distplot(luminance.flatten(), kde_kws={"cumulative": True})
# plt.savefig("stackoverflow.pdf", dpi=300)
plt.tight_layout(rect=(0, 0, 1, 0.95))
plt.show()
The CDF here is fine (range: 0 to 1), but the resulting histogram is not what I expected:

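Why do the histogram values range from 0 to 4? Is there a way to fix this?
Answered 2018-04-11 20:06:13
tel's answer is great! I just want to offer an alternative that gives you the histogram you want in fewer lines. The key idea is to use the weights argument of the matplotlib hist function to normalize the counts. You can replace sns.distplot(luminance.flatten(), kde_kws={"cumulative": True}) with the following three lines: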
lf = luminance.flatten()
sns.kdeplot(lf, cumulative=True)
sns.distplot(lf, kde=False,
hist_kws={'weights': numpy.full(len(lf), 1/len(lf))})

If you want to see the histogram on a second y-axis (which looks better), add ax=bottom_left.twinx() to the sns.distplot call.
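The weights idea above can be checked without any plotting. The sketch below is an illustration rather than part of the original answer: it uses np.histogram, which accepts the same kind of weights array, and the names rng, fractions and counts are made up for the example. It assumes a smaller random sample than the 1000x1000 image in the question.

```python
import numpy as np

# Illustrative data: normalized to 0..1 like "luminance" in the question.
rng = np.random.default_rng(0)
lf = rng.standard_normal(10_000)
lf = (lf - lf.min()) / (lf.max() - lf.min())

# One weight of 1/N per observation, so each bin height becomes the
# fraction of all observations that fall into that bin.
weights = np.full(lf.size, 1 / lf.size)
fractions, edges = np.histogram(lf, bins=30, weights=weights)

# The same fractions, obtained by dividing the raw counts by N.
counts, _ = np.histogram(lf, bins=30)

print(np.allclose(fractions, counts / lf.size))
print(fractions.sum())
```

Multiplying fractions by 100 would give percentages of the total number of observations, which is what the question asks for.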

Answered 2018-04-11 18:45:28
What you think you want
Here's how to plot a histogram such that the bins sum to 1:
import matplotlib.pyplot as plt
import matplotlib.ticker as tck
import seaborn as sns
import numpy as np
sns.set(style='dark')
imagen2 = plt.figure(1, figsize=(5, 2))
imagen2.suptitle('StackOverflow Matplotlib histogram demo')
luminance = np.random.randn(1000, 1000)
# "Luminance" should range from 0.0...1.0 so we normalize it
luminance = (luminance - luminance.min())/(luminance.max() - luminance.min())
# get the histogram values
heights,edges = np.histogram(luminance.flat, bins=30)
binCenters = (edges[:-1] + edges[1:])/2
# norm the heights
heights = heights/heights.sum()
# get the cdf
cdf = heights.cumsum()
left = plt.subplot(121)
left.imshow(luminance)
right = plt.subplot(122)
right.plot(binCenters, cdf, binCenters, heights)
# plt.savefig("stackoverflow.pdf", dpi=300)
plt.tight_layout(rect=(0, 0, 1, 0.95))
plt.show()
# confirm that the hist vals sum to 1
print('heights sum: %.2f' % heights.sum())
Output:

heights sum: 1.00
The actual answer
This one is actually super easy. Just do:
sns.distplot(luminance.flatten(), kde_kws={"cumulative": True}, norm_hist=True)
Here's what I get when I run your script with the above modification:

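Surprise twist!
So it turns out that, going by the formal definition of a density, your histogram was normalized all along.

In plain(er) English, the general practice is to normalize continuous-valued histograms (i.e. those whose observations can be represented as floating-point numbers) by their density. So in this case the sum of the bin widths times the bin heights will be 1.0, as you can see by running this simplified version of your script: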
import matplotlib.pyplot as plt
import matplotlib.ticker as tck
import numpy as np
imagen2 = plt.figure(1, figsize=(4,3))
imagen2.suptitle('StackOverflow Matplotlib histogram demo')
luminance = np.random.randn(1000, 1000)
luminance = (luminance - luminance.min())/(luminance.max() - luminance.min())
heights,edges,patches = plt.hist(luminance.ravel(), density=True, bins=30)
widths = edges[1:] - edges[:-1]
totalWeight = (heights*widths).sum()
# plt.savefig("stackoverflow.pdf", dpi=300)
plt.tight_layout(rect=(0, 0, 1, 0.95))
plt.show()
print(totalWeight)
And totalWeight is indeed equal to 1.0, give or take a bit of rounding error.
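To connect this back to the question, here is a minimal NumPy-only sketch (the names rng, x, fractions and percent are illustrative, not from the original posts) of why density heights can rise well above 1 on data confined to 0..1, and how to turn them into percentages. With equal-width bins covering a range of length 1, the heights must average 1, so they exceed 1 wherever the data are concentrated.

```python
import numpy as np

# Illustrative data: normalized to 0..1 like "luminance" in the question.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
x = (x - x.min()) / (x.max() - x.min())

# Density-normalized histogram: the area (height * width), not the
# sum of the heights, is what equals 1.
density, edges = np.histogram(x, bins=30, density=True)
widths = np.diff(edges)
area = (density * widths).sum()

# Multiplying each height by its bin width recovers the fraction of
# observations in that bin; times 100 gives percentages.
fractions = density * widths
percent = 100 * fractions

counts, _ = np.histogram(x, bins=30)
print(area)
print(density.max() > 1)
print(np.allclose(fractions, counts / counts.sum()))
```

The same conversion should apply to the heights returned by plt.hist(..., density=True), since they follow the same density convention.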
https://stackoverflow.com/questions/49781927