Q: How do I use skewnorm to generate a distribution with a specified skew?

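Stack Overflow user

Asked 2018-04-12 23:57:58

Answers: 2  Views: 3.3K  Followers: 0  Votes: 3

I'm trying to generate a random distribution in which I control the mean, SD, skewness, and kurtosis.

Once the distribution is generated, I can fix the mean and SD with some simple math.

I've set kurtosis aside for now because it seems too hard.

Skewness is today's problem.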

import numpy as np
from scipy import stats

def convert_to_alpha(s):
    # Invert the skew normal skewness formula (from Wikipedia):
    # d is the intermediate delta, a is the shape parameter alpha.
    d = (np.pi/2*((abs(s)**(2/3))/(abs(s)**(2/3)+((4-np.pi)/2)**(2/3))))**0.5
    a = d/((1-d**2)**.5)
    return a

for skewness_expected in (.5, .9, 1.3):
    alpha = convert_to_alpha(skewness_expected)
    r = stats.skewnorm.rvs(alpha, size=10000)
    print('Skewness expected:', skewness_expected)
    print('Skewness obtained:', stats.skew(r))
    print()

Skewness expected: 0.5
Skewness obtained: 0.47851348006629035

Skewness expected: 0.9
Skewness obtained: 0.8917020428586827

Skewness expected: 1.3
Skewness obtained: (1.2794406116842627+0.01780402125888404j)

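I know the computed skewness generally won't match the expected skewness exactly; this is a random sample, after all. But I'm confused about how to get a distribution with skewness > 1 without ending up in complex-number territory. The rvs method doesn't seem able to handle it, because as soon as the skewness is > 1 the parameter alpha comes out as a complex number.

How can I fix this so that I can generate distributions with skewness > 1 without complex numbers creeping in?

[Thanks to Warren Weckesser for pointing me to Wikipedia for writing the convert_to_alpha function.]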
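
The complex value can be traced to the intermediate quantity d inside convert_to_alpha. The following is a minimal diagnostic sketch, assuming the same expression as in the question; the helper name delta_from_skew is hypothetical and is not part of SciPy. For the requested skewness of 1.3, d comes out larger than 1, so 1 - d**2 is negative, and raising a negative float to the power 0.5 produces a complex number in Python 3.

```python
import math

def delta_from_skew(s):
    # Same expression as `d` in convert_to_alpha (hypothetical helper name).
    t = abs(s) ** (2 / 3)
    c = ((4 - math.pi) / 2) ** (2 / 3)
    return math.sqrt(math.pi / 2 * t / (t + c))

for s in (0.5, 0.9, 1.3):
    d = delta_from_skew(s)
    # A negative value in the last column means alpha would be complex.
    print(s, d, 1 - d ** 2)
```

Only the 1.3 row shows a negative 1 - d**2, which matches the one complex result in the output above.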

scipy

statistics

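2 Answers

Stack Overflow user

Answered 2019-09-26 15:43:08

I know this post is a year and a half old, but I recently ran into this problem too, and it seems it was never really answered here. A further problem with converting between the stats.skewnorm shape parameter and the skewness statistic is that doing so also shifts the distribution's measures of central tendency, which was a problem for my needs.

I developed this based on the F distribution (https://en.wikipedia.org/wiki/F-distribution). The end result of a lot of work is the function below, which takes the desired mean, SD, and skewness along with the desired sample size. If anyone wants, I can share the work behind it. At extreme settings the output SD and skew become a bit rough, presumably because the F distribution naturally sits around 1. It is also very problematic for skew values close to zero, but in that case this function isn't needed anyway.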

from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

def createSkewDist(mean, sd, skew, size):

    # calculate the degrees of freedom 1 required to obtain the specific skewness statistic, derived from simulations
    loglog_slope=-2.211897875506251 
    loglog_intercept=1.002555437670879 
    df2=500
    df1 = 10**(loglog_slope*np.log10(abs(skew)) + loglog_intercept)

    # sample from F distribution
    fsample = np.sort(stats.f(df1, df2).rvs(size=size))

    # adjust the variance by scaling the distance from each point to the distribution mean by a constant, derived from simulations
    k1_slope = 0.5670830069364579
    k1_intercept = -0.09239985798819927
    k2_slope = 0.5823114978219056
    k2_intercept = -0.11748300123471256

    scaling_slope = abs(skew)*k1_slope + k1_intercept
    scaling_intercept = abs(skew)*k2_slope + k2_intercept

    scale_factor = (sd - scaling_intercept)/scaling_slope    
    new_dist = (fsample - np.mean(fsample))*scale_factor + fsample

    # flip the distribution if specified skew is negative
    if skew < 0:
        new_dist = np.mean(new_dist) - new_dist

    # adjust the distribution mean to the specified value
    final_dist = new_dist + (mean - np.mean(new_dist))

    return final_dist




# EXAMPLE
desired_mean = 497.68
desired_skew = -1.75
desired_sd = 77.24

final_dist = createSkewDist(mean=desired_mean, sd=desired_sd, skew=desired_skew, size=1000000)

# inspect the plots & moments, try random sample
# (sns.distplot is deprecated since seaborn 0.11; histplot with kde=True is the closest replacement)
fig, ax = plt.subplots(figsize=(12,7))
sns.histplot(final_dist, kde=True, stat='density', ax=ax, color='green', label='generated distribution')
sns.histplot(np.random.choice(final_dist, size=100), kde=True, stat='density', ax=ax, color='red', alpha=.2, label='sample n=100')
ax.legend()

print('Input mean: ', desired_mean)
print('Result mean: ', np.mean(final_dist),'\n')

print('Input SD: ', desired_sd)
print('Result SD: ', np.std(final_dist),'\n')

print('Input skew: ', desired_skew)
print('Result skew: ', stats.skew(final_dist))

Input mean:  497.68
Result mean:  497.6799999999999

Input SD:  77.24
Result SD:  71.69030764848961

Input skew:  -1.75
Result skew:  -1.6724486459469905
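
The result SD in this output misses the requested value by several units. Skewness is unchanged by shifting and by scaling with a positive constant, so one possible follow-up step (not part of the function above) is to standardize the sample and rescale it afterward. The sketch below assumes NumPy only; rescale_to and sample_skew are hypothetical helper names, and the gamma sample merely stands in for any skewed input.

```python
import numpy as np

def rescale_to(sample, mean, sd):
    # Hypothetical helper: shift and scale so the sample mean and
    # population SD (np.std default, ddof=0) match the targets exactly.
    z = (sample - np.mean(sample)) / np.std(sample)
    return z * sd + mean

def sample_skew(x):
    # Biased Fisher-Pearson skewness, the default definition of scipy.stats.skew.
    d = x - np.mean(x)
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

rng = np.random.default_rng(0)
raw = rng.gamma(shape=2.0, size=100_000)  # stand-in for any right-skewed sample
fixed = rescale_to(raw, mean=497.68, sd=77.24)

print(np.mean(fixed), np.std(fixed))         # match the targets up to rounding
print(sample_skew(raw), sample_skew(fixed))  # skewness is preserved
```

Negating z before rescaling flips the sign of the skewness while leaving the mean and SD targets intact.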

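4 votes

Stack Overflow user

Answered 2018-04-13 00:24:28

The shape parameter of the skew normal distribution is not the skewness of the distribution. Take a look at the wikipedia page for the skew normal distribution. The formulas in the table on the right of that page give expressions for the mean, variance, skewness, etc., in terms of the parameters. You can get these values from the skewnorm object using the stats() method.

For example, here's the skewness of the distribution with shape parameter 2: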

In [46]: from scipy.stats import skewnorm, skew

In [47]: skewnorm.stats(2, moments='s')
Out[47]: array(0.45382556395938217)

Generate a couple of samples and find the sample skewness:

In [48]: r = skewnorm.rvs(2, size=10000000)

In [49]: skew(r)
Out[49]: 0.4533209955299838

In [50]: r = skewnorm.rvs(2, size=10000000)

In [51]: skew(r)
Out[51]: 0.4536583726840712
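
The value returned by skewnorm.stats can also be reproduced from the closed-form skewness expression for the skew normal distribution. The sketch below uses only the standard library; skewnorm_skewness and MAX_SKEW are hypothetical names introduced for illustration. It also evaluates the limit of that expression as the shape parameter grows, which is roughly 0.9953, so no real shape parameter yields a skewness of 1 or more.

```python
import math

def skewnorm_skewness(alpha):
    # Closed-form skewness of the skew normal distribution for shape alpha.
    delta = alpha / math.sqrt(1 + alpha ** 2)
    m = delta * math.sqrt(2 / math.pi)
    return (4 - math.pi) / 2 * m ** 3 / (1 - m ** 2) ** 1.5

# Limit of the expression above as alpha -> infinity (delta -> 1).
MAX_SKEW = (4 - math.pi) / 2 * (2 / (math.pi - 2)) ** 1.5

for alpha in (2, 10, 100, 1000):
    print(alpha, skewnorm_skewness(alpha))
print("limit:", MAX_SKEW)
```

For alpha = 2 this agrees with the Out[47] value shown earlier, and larger shape parameters only approach the limit without reaching it.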

1 vote

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/49801071
