Q: Function to replace outliers with lower and upper limits in Python

Stack Overflow user

Asked 2018-09-14 02:00:47

Answers: 1 · Views: 2.1K · Followers: 0 · Votes: 0

from sklearn import datasets
import pandas as pd
import numpy as np

dt = datasets.load_diabetes()
data = pd.DataFrame(
    data=np.c_[dt['data'], dt['target']],
    columns=dt['feature_names'] + ['target'],
)
data = data.drop('sex', axis = 1)

# mean +- 2sigma
# function to calculate outlier of a variable
def out1(x):
    mu = np.average(x)
    sigma = np.std(x)
    LL = mu - 2*sigma # Lower limit 
    UL = mu + 2*sigma # Upper limit
    # flag each value: 1 if it lies on or beyond a limit, else 0
    out = [1 if (a >= UL) or (a <= LL) else 0 for a in x]
    return out

# count the outliers in each variable
print(data.apply(out1).apply(sum))


# Function to Replace outlier with LL / UL

def out_impute(x):
    mu = np.average(x)
    sigma = np.std(x)
    LL = mu - 2*sigma # Lower limit 
    UL = mu + 2*sigma # Upper limit
    xnew = "Enter Code Here"
    return(xnew)

data1 = data.apply(out_impute) # Create new data with imputed values

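Could someone please help me replace the outliers with the lower and upper limits?

I define an outlier as a value >= mu + 2*sigma or <= mu - 2*sigma. I have defined a function out_impute in the code above, but I am stuck on the replacement part.

Thanks in advance!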

statistics

python

pandas

machine-learning

1 Answer

Stack Overflow user

Accepted answer

Answered 2018-09-14 02:04:25

Use df.clip:

LL = mu - 2*sigma # Lower limit 
UL = mu + 2*sigma # Upper limit
df['data'].clip(LL, UL)
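
To connect this back to the question, here is a minimal sketch of how out_impute might be completed with Series.clip. The toy DataFrame `demo` and its column names are hypothetical and only there to make the example runnable. The standard deviation is taken on the underlying NumPy array, so it is the population value (ddof=0), which is an assumption about what the question intends by np.std. Values exactly on a limit are left unchanged, which is consistent with the >= / <= definition because replacing them with the limit would be a no-op.

```python
import numpy as np
import pandas as pd


def out_impute(x: pd.Series) -> pd.Series:
    """Replace values outside mean +/- 2*sigma with the nearest limit."""
    values = x.to_numpy(dtype=float)
    mu = values.mean()
    sigma = values.std()   # population standard deviation (ddof=0)
    LL = mu - 2 * sigma    # lower limit
    UL = mu + 2 * sigma    # upper limit
    # clip leaves in-range values untouched and caps the rest at LL / UL
    return x.clip(lower=LL, upper=UL)


# Hypothetical toy data, used only for illustration
demo = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 100.0, 2.5, 3.5, 1.5, 2.0, 3.0],
    "b": [10.0, 11.0, 9.0, 10.5, 9.5, -50.0, 10.0, 11.5, 9.0, 10.0],
})

# Column by column, as in the question
demo_clipped = demo.apply(out_impute)

# The same idea without apply: per-column limits passed as Series,
# aligned with the columns via axis=1
mu = demo.mean()
sigma = demo.std(ddof=0)
demo_clipped_alt = demo.clip(lower=mu - 2 * sigma, upper=mu + 2 * sigma, axis=1)
```

In the question's own code, `data1 = data.apply(out_impute)` would then work unchanged once out_impute returns the clipped Series.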

Votes: 4

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/52324026
