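I am trying to make a bar plot in Python using name, prop, and total. My idea is to have the names on the x-axis and, if possible, show both the total and the proportion of males for each name.
I have the following sample data: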
NAME prop_male total
GGD 0.254147 727240
CCG 0.216658 323510
PPT 0.265414 251023
MMMA 0.185105 210416
JKK 0.434557 201594
BBD 0.279319 198998
KNL. 0.277761 190246
TSK 0.277653 171030
LIS 0.218444 165168
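BRK 0.44755 161124
I tried the following, but somehow I am missing the trick:
import pandas as pd
import seaborn as sns
x, y, hue = "name", "proportion", "total"
(df[x]
 .groupby(df[hue])
 .value_counts(normalize=True)
 .rename(y)
 .reset_index()
 .pipe((sns.barplot, "data"), x=x, y=y, hue=hue))
Can anyone suggest or help me draw a meaningful plot that shows all three pieces of information at once?
Thanks in advance.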
Answered on 2021-01-11 22:47:50
There are various ways to achieve this. One is to calculate the number of males and draw those bars over the totals:
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
df = pd.DataFrame({"name": list("ABC"), "proportion": [0.2, 0.7, 0.1], "total": [123, 321, 213]})
df["male"] = df.proportion * df.total
ax = sns.barplot(data=df, x="name", y="total", color="lightblue")
sns.barplot(data=df, x="name", y="male", color="blue", ax=ax)
ax.set_ylabel("male/total")
plt.show()
Sample output:

The hue parameter in seaborn usually refers to a categorical variable in long-form data. To illustrate this, here is some sample code:
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
df = pd.DataFrame({"name": list("ABC"), "proportion": [0.2, 0.7, 0.1], "total": [123, 321, 213]})
df["male"] = df.proportion * df.total
#transform the data from wide to long form
df_plot = df.melt(id_vars=["name"], value_vars=["male", "total"])
#use the former column names as categories in a barplot
sns.barplot(data=df_plot, x="name", y="value", hue="variable")
plt.show()
Output:

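To make the wide-to-long step more concrete, the sketch below prints the same toy frame before and after `melt`. It uses pandas only, and the column names `variable` and `value` are simply the pandas defaults rather than anything chosen in the answer.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({"name": list("ABC"), "proportion": [0.2, 0.7, 0.1], "total": [123, 321, 213]})
df["male"] = df.proportion * df.total

# Wide form: one row per name, with "male" and "total" as separate columns.
print(df[["name", "male", "total"]])

# Long form: one row per (name, variable) pair. The former column names
# end up in "variable" and their numbers in "value" (pandas defaults).
df_plot = df.melt(id_vars=["name"], value_vars=["male", "total"])
print(df_plot)
```

With the data in this shape, `hue="variable"` tells seaborn to draw one bar per former column for each name.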
You could also decide to display the proportion and the total separately:
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
df = pd.DataFrame({"name": list("ABC"), "proportion": [0.2, 0.7, 0.1], "total": [123, 321, 213]})
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
sns.barplot(data=df, x="name", y="total", color="lightblue", ax=ax1)
sns.lineplot(data=df, x="name", y="proportion", color="black", lw=3, ls="--", ax=ax2)
plt.show()
Sample output:

Did I mention that there is more than one way?
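Coming back to the data posted in the question, the first approach might look roughly like the sketch below. It assumes the table has been loaded into a DataFrame with lowercase column names `name`, `prop_male` and `total` (the question's header reads `NAME prop_male total`), and it uses only the first five rows to stay short. The `male` column name is likewise an assumption.

```python
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

# First five rows of the sample data from the question.
df = pd.DataFrame({
    "name": ["GGD", "CCG", "PPT", "MMMA", "JKK"],
    "prop_male": [0.254147, 0.216658, 0.265414, 0.185105, 0.434557],
    "total": [727240, 323510, 251023, 210416, 201594],
})

# Absolute number of males per name (hypothetical column name "male").
df["male"] = df["prop_male"] * df["total"]

# Draw the totals first, then overlay the male counts on the same axes.
ax = sns.barplot(data=df, x="name", y="total", color="lightblue")
sns.barplot(data=df, x="name", y="male", color="blue", ax=ax)
ax.set_ylabel("male/total")
plt.show()
```

Because the data is already aggregated (one row per name), no `groupby` or `value_counts` step is needed before plotting.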
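Answered on 2021-01-11 23:11:09
There are countless ways to plot this information, but if you want to summarize it in a bar plot (with visible bars), keep in mind that the scales of the columns differ greatly.
The best approach is probably Mr. T's suggestion, and the plot looks really nice (though I would add a legend explaining that the dark blue bars are the male counts and the light blue bars are the totals).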
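A legend of that kind could be added along the lines of the sketch below. Note that it draws the bars with plain matplotlib instead of seaborn, which is a substitution made here to keep the labelling explicit; the legend labels "total" and "male" and the axis label "count" are assumptions, and the numbers are the toy values from the first answer.

```python
import matplotlib.pyplot as plt

# Toy data in the style of the first answer (not the asker's real data).
names = ["A", "B", "C"]
total = [123, 321, 213]
proportion = [0.2, 0.7, 0.1]
male = [p * t for p, t in zip(proportion, total)]

fig, ax = plt.subplots()
# Light blue bars show the totals; the dark blue bars drawn on top
# show the male counts.
ax.bar(names, total, color="lightblue", label="total")
ax.bar(names, male, color="blue", label="male")
ax.set_ylabel("count")
ax.legend()
plt.show()
```

Each `label` passed to `ax.bar` becomes one legend entry, so the reader no longer has to guess which color means what.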
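For completeness, I will show two other options, both of which give results that are harder to interpret:
You can rescale the "total" column so that both quantities are visible side by side, or you can draw a scatter plot.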
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
Name = ['GGD', 'CCG', 'PPT', 'MMMA', 'JKK', 'BBD', 'KNL']
prop_male = [0.254147, 0.216658, 0.265414, 0.185105, 0.434557, 0.279319,
0.277761]
total = [727240, 323510, 251023, 210416, 201594, 198998, 190246]
#Plot as bar
x = np.arange(len(Name)) # the label locations
width = 0.35 # the width of the bars
fig, ax = plt.subplots(1,2, figsize=(20,8))
rects1 = ax[0].bar(x - width/2, [float(i)/1e6 for i in total], width,
label=r'Total $\times$ 1e-6 ')
rects2 = ax[0].bar(x + width/2, prop_male, width, label='Prop_male')
ax[0].set_xticks(x)
ax[0].set_xticklabels(Name, size=15)
ax[0].legend()
ax[0].set_ylabel("Counts [a.u.]", size=15)
#plot as scatter
norm = matplotlib.colors.Normalize(vmin=0,vmax=len(Name))
mapper = matplotlib.cm.ScalarMappable(norm=norm, cmap='viridis')
colors = np.array([(mapper.to_rgba(v)) for v in range(len(Name))])
for x, y, c in zip(prop_male, total, colors):
    ax[1].plot(x, y, 'o', color=c, markersize=10, alpha=0.8)
cmap = plt.get_cmap('viridis',len(Name))
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])
cbar = plt.colorbar(sm, ax=ax[1], ticks=np.linspace(0,len(Name),len(Name)))
cbar.ax.set_yticklabels(Name)
cbar.set_label('Name', size=15)
ax[1].set_xlabel("prop_male", size=15)
ax[1].set_ylabel("total", size=15)
plt.show()
The plot should look like this:

https://stackoverflow.com/questions/65668572