Q: How to compare with each factor (combinations of 1, 2, 3 factors)

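Stack Overflow user

Asked 2020-10-02 08:51:46

1 answer, 28 views, 0 followers, 0 votes

Could you help me automate calculating the share of our clients within combinations of 1, 2, and 3 factors?

I have a dataset of clients and features. Every client has a label:

1 - "ours"
0 - "not_ours"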

clients ours car        house         boat     plane         bike
client1 1     1         0             1         1             1
client2 0     0         0             1         1             1
client3 1     0         0             0         1             1
client4 1     1         0             1         1             1
client5 0     0         0             1         1             1
client6 1     0         0             0         1             1
clientN 0     0         1             0         1             1

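I want to run three experiments.

First, find our share of clients within each value of a single factor. Desired result:

factor_value   1               1                   0               0
               count of ours   share of ours (%)   count of ours   share of ours (%)
car            2               100%                2               40%
house          0               0%                  4               67%
boat           2               50%                 2               67%
plane          4               67%                 0               0%
bike           4               67%                 0               0%

Here, share of ours = the fraction of clients with a given factor value who are ours. For example, for car = 0 our share is 40%, because 5 clients have no car and 2 of them are ours.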
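The single-factor calculation described above can be sketched with a pandas groupby. This is only an illustration: the DataFrame is typed in by hand from the seven rows shown in the question, and the function name `one_factor_shares` and the output column names are assumptions, not part of the original post.

```python
import pandas as pd

# Sample data copied from the seven rows shown in the question
df = pd.DataFrame(
    {
        "ours":  [1, 0, 1, 1, 0, 1, 0],
        "car":   [1, 0, 0, 1, 0, 0, 0],
        "house": [0, 0, 0, 0, 0, 0, 1],
        "boat":  [1, 1, 0, 1, 1, 0, 0],
        "plane": [1, 1, 1, 1, 1, 1, 1],
        "bike":  [1, 1, 1, 1, 1, 1, 1],
    },
    index=["client1", "client2", "client3", "client4",
           "client5", "client6", "clientN"],
)
factors = ["car", "house", "boat", "plane", "bike"]


def one_factor_shares(df, factors, label="ours"):
    """For each factor and each observed value: clients, ours, share of ours."""
    rows = []
    for factor in factors:
        # count = clients with this value, sum = how many of them are ours
        grouped = df.groupby(factor)[label].agg(clients="count", ours="sum")
        for value, row in grouped.iterrows():
            rows.append({
                "factor": factor,
                "value": int(value),
                "clients": int(row["clients"]),
                "ours": int(row["ours"]),
                "share_of_ours": row["ours"] / row["clients"],
            })
    return pd.DataFrame(rows)


shares = one_factor_shares(df, factors)
print(shares)
```

A factor value that never occurs in the data (for instance plane = 0 in these seven rows) simply produces no row, so it would need to be filled in separately if a 0% entry is wanted.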

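Second, the same calculation for every combination of two factors:

car+house, car+boat, car+plane, car+bike, house+boat, house+plane, house+bike, boat+plane, boat+bike, plane+bike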
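One way to enumerate the two-factor combinations is `itertools.combinations`, followed by a groupby on both columns so that every observed value pattern (0/0, 0/1, 1/0, 1/1) gets its own share. The sketch below again uses hand-typed sample data from the question; `pair_shares` and the output column names are hypothetical.

```python
from itertools import combinations

import pandas as pd

# Sample data copied from the seven rows shown in the question
df = pd.DataFrame({
    "ours":  [1, 0, 1, 1, 0, 1, 0],
    "car":   [1, 0, 0, 1, 0, 0, 0],
    "house": [0, 0, 0, 0, 0, 0, 1],
    "boat":  [1, 1, 0, 1, 1, 0, 0],
    "plane": [1, 1, 1, 1, 1, 1, 1],
    "bike":  [1, 1, 1, 1, 1, 1, 1],
})
factors = ["car", "house", "boat", "plane", "bike"]

# Unordered pairs without repeats: (car, house), (car, boat), ...
pairs = list(combinations(factors, 2))


def pair_shares(df, pairs, label="ours"):
    """Share of ours for each observed value pattern of each factor pair."""
    frames = []
    for f1, f2 in pairs:
        g = (df.groupby([f1, f2])[label]
               .agg(clients="count", ours="sum")
               .reset_index())
        # Replace the factor-specific column names with generic ones
        g.columns = ["value1", "value2", "clients", "ours"]
        g.insert(0, "factor2", f2)
        g.insert(0, "factor1", f1)
        g["share_of_ours"] = g["ours"] / g["clients"]
        frames.append(g)
    return pd.concat(frames, ignore_index=True)


out = pair_shares(df, pairs)
print(out)
```

Grouping by both columns is a design choice: it also reports the mixed patterns (one factor present, the other absent), which a plain "both equal 1" filter would skip.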

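Third, the same calculation for all possible combinations of three factors:

car+house+boat, car+house+plane, car+house+bike, car+boat+plane, car+boat+bike, car+plane+bike, house+boat+plane, house+boat+bike, house+plane+bike, boat+plane+bike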
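The three experiments differ only in the combination size, so they could share one helper parameterized by `k`. The sketch below assumes one possible reading of "share within a combination": clients for whom all factors in the combination equal 1, and separately clients for whom all equal 0. The function `combination_shares` and its column names are illustrative, and the data is typed in from the question.

```python
from itertools import combinations

import pandas as pd

# Sample data copied from the seven rows shown in the question
df = pd.DataFrame({
    "ours":  [1, 0, 1, 1, 0, 1, 0],
    "car":   [1, 0, 0, 1, 0, 0, 0],
    "house": [0, 0, 0, 0, 0, 0, 1],
    "boat":  [1, 1, 0, 1, 1, 0, 0],
    "plane": [1, 1, 1, 1, 1, 1, 1],
    "bike":  [1, 1, 1, 1, 1, 1, 1],
})
factors = ["car", "house", "boat", "plane", "bike"]


def combination_shares(df, factors, k, label="ours"):
    """Counts and share of ours for every k-factor combination."""
    rows = []
    for combo in combinations(factors, k):
        cols = list(combo)
        all_one = (df[cols] == 1).all(axis=1)    # every factor present
        all_zero = (df[cols] == 0).all(axis=1)   # every factor absent
        row = {f"factor{i + 1}": name for i, name in enumerate(combo)}
        for tag, mask in (("all_1", all_one), ("all_0", all_zero)):
            clients = int(mask.sum())
            ours = int(df.loc[mask, label].sum())
            row[f"clients_{tag}"] = clients
            row[f"ours_{tag}"] = ours
            # No clients match: the share is undefined, so store NaN
            row[f"share_{tag}"] = ours / clients if clients else float("nan")
        rows.append(row)
    return pd.DataFrame(rows)


triples = combination_shares(df, factors, k=3)
print(triples)
```

Calling the same function with `k=1` or `k=2` gives the one-factor and two-factor variants under the same all-ones/all-zeros reading.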

python

python-3.x

1 Answer

Stack Overflow user

Answered 2020-10-06 07:24:22

Here are the steps (shown for the three-factor analysis):

import pandas as pd

# Assumed to exist already: `df` (client data; in this answer the 0/1 label
# column is named 'stl', presumably the "ours" flag from the question, and the
# factor columns start at position 5) and `res3` (the table of two-factor
# combinations from the previous step, with columns 'factor1' and 'factor2').
factor_cols = df.iloc[:, 5:].columns.values

# Build the three-factor combinations by adding a third factor to each pair.
# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.
rows3 = []
for a, b in zip(res3['factor1'], res3['factor2']):
    for j in factor_cols:
        # Sort so that permutations (e.g. a,c,b and a,b,c) become duplicates
        f1, f2, f3 = sorted([a, b, j])
        rows3.append({'factor1': f1, 'factor2': f2, 'factor3': f3})

xgb3 = pd.DataFrame(rows3, columns=['factor1', 'factor2', 'factor3']).drop_duplicates()
# Also drop rows that repeat a factor (e.g. a,a,b): all items in a row must be unique
xgb3 = xgb3[(xgb3['factor1'] != xgb3['factor3']) &
            (xgb3['factor2'] != xgb3['factor3']) &
            (xgb3['factor1'] != xgb3['factor2'])].reset_index(drop=True)

# Compute counts and shares for every three-factor combination
rows = []
for f, s, x in zip(xgb3['factor1'], xgb3['factor2'], xgb3['factor3']):
    inside = (df[f] == 1) & (df[s] == 1) & (df[x] == 1)   # all three factors present
    outside = (df[f] == 0) & (df[s] == 0) & (df[x] == 0)  # all three factors absent
    m = df['stl'][inside].count()
    n = df['stl'][inside].sum()
    o = df['stl'][outside].count()
    p = df['stl'][outside].sum()

    rows.append({'factor1': f,
                 'factor2': s,
                 'factor3': x,
                 'quantity_inside_combination': m,
                 'quantity_of_ours_inside_combination': n,
                 # NaN when no client matches, instead of dividing by zero
                 '%_of_ours_inside_combination': n / m if m else float('nan'),
                 'quantity_outside_combination': o,
                 'quantity_of_ours_outside_combination': p,
                 '%_of_ours_outside_combination': p / o if o else float('nan')})

result3 = pd.DataFrame(rows)

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/64168718
