Q: Creating multiple pandas DataFrames from a single DataFrame by iteration

Stack Overflow user

Asked 2018-07-03 06:01:27

Answers: 1 · Views: 518 · Followers: 0 · Votes: 1

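From a single DataFrame (tr), I am trying to create multiple DataFrames based on a set of columns (cat_col). Each new DataFrame must be named tr_'colname'. Can someone help me with the code below?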

for col in cat_col:
    tr_ = tr[[col,'TARGET']].groupby([col,'TARGET']).size().reset_index(name='Counts')
    tr_ = pd.pivot_table(tr_,values='Counts',index=[col],columns=['TARGET'])
    print(tr_.shape)

Output: (3, 2) (7, 2) (8, 2) (5, 2) (6, 2) (6, 2) (18, 2) (7, 2) (58, 2) (4, 2) (3, 2) (7, 2)

tr[['col1','TARGET']].head(10)

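              col1  TARGET
0    Unaccompanied       1
1           Family       0
2    Unaccompanied       0
3    Unaccompanied       0
4    Unaccompanied       0
5  Spouse, partner       0
6    Unaccompanied       0
7    Unaccompanied       0
8         Children       0
9    Unaccompanied       0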

tr_col1.head(3)

TARGET                0      1
col1
Family            37140   3009
Spouse, partner   10475    895
Unaccompanied    228189  20337
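The desired tr_col1 table is a cross-tabulation of col1 against TARGET. As a minimal sketch (not the asker's code), `pd.crosstab` can produce that shape directly. The frame below is built only from the ten preview rows shown above, with category values translated, so the counts will not match the full-data figures.

```python
import pandas as pd

# Toy frame: only the ten rows from the head(10) preview, not the full data.
tr = pd.DataFrame({
    "col1": ["Unaccompanied", "Family", "Unaccompanied", "Unaccompanied",
             "Unaccompanied", "Spouse, partner", "Unaccompanied",
             "Unaccompanied", "Children", "Unaccompanied"],
    "TARGET": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
})

# crosstab counts every (col1, TARGET) pair; pairs that never occur
# are reported as 0 rather than NaN.
tr_col1 = pd.crosstab(tr["col1"], tr["TARGET"])
print(tr_col1)
```

The result has one row per category in col1 and one column per TARGET value, the same layout as the groupby/pivot_table pipeline in the question.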

python-2.7

pandas

iteration

1 Answer

Stack Overflow user

Accepted answer

Answered 2018-07-03 06:40:46

I think you need groupby with size and unstack, storing each result in a dictionary keyed by column name:

import pandas as pd

tr = pd.DataFrame({'A':list('abcdefabcd'),
                   'B':list('abcdeabffe'),
                   'TARGET':[1,1,0,0,1,0,1,1,0,1]})

print (tr)
   A  B  TARGET
0  a  a       1
1  b  b       1
2  c  c       0
3  d  d       0
4  e  e       1
5  f  a       0
6  a  b       1
7  b  f       1
8  c  f       0
9  d  e       1

cat_col = ['A','B']

d = {}
for col in cat_col:
    tr_ = (tr[[col,'TARGET']].groupby([col,'TARGET'])
                            .size()
                            .unstack()
                            .reset_index()
                            .rename_axis(None, axis=1))
    # any further processing, if necessary

    # check that the output is a DataFrame
    print (type(tr_))

    print (tr_)
    # store in the dict, keyed by column name
    d[col] = tr_

# select a DataFrame from the dict
print (d['A'])
   A    0    1
0  a  NaN  2.0
1  b  NaN  2.0
2  c  2.0  NaN
3  d  1.0  1.0
4  e  NaN  1.0
5  f  1.0  NaN
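The NaN entries above mark (category, TARGET) pairs that never occur, which also turns the counts into floats. As a hedged variation on the answer's loop, assuming zeros are preferred over NaN, `unstack` accepts `fill_value=0`. The dictionary name `frames` and the `'tr_' + col` keys are illustrative choices that follow the naming asked for in the question.

```python
import pandas as pd

tr = pd.DataFrame({'A': list('abcdefabcd'),
                   'B': list('abcdeabffe'),
                   'TARGET': [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]})
cat_col = ['A', 'B']

# One count table per categorical column, keyed as tr_<colname>.
frames = {
    'tr_' + col: (tr.groupby([col, 'TARGET'])
                    .size()
                    .unstack(fill_value=0)   # 0 instead of NaN for absent pairs
                    .reset_index()
                    .rename_axis(None, axis=1))
    for col in cat_col
}

print(frames['tr_A'])
```

Looking up `frames['tr_B']` returns the table for column B. A dictionary keeps the frames addressable by name without creating variables dynamically.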

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/51147556
