
How to find column sums by limiting the row size in a dataframe?

Stack Overflow user

Asked 2021-04-10 10:06:43

2 answers, 93 views, 0 followers, 0 votes

I have a dataframe df1:

         DP 1     DP 2    DP 3   DP 4     DP 5    DP 6    DP 7   DP 8    DP 9    DP 10
OP 1    357848  1124788 1735330 2218270 2745596 3319994 3466336 3606286 3833515 3901463
OP 2    352118  1236139 2170033 3353322 3799067 4120063 4647867 4914039 5339085 
OP 3    290507  1292306 2218525 3235179 3985995 4132918 4628910 4909315     
OP 4    310608  1418858 2195047 3757447 4029929 4381982 4588268         
OP 5    443160  1136350 2128333 2897821 3402672 3873311             
OP 6    396132  1333217 2180715 2985752 3691712                 
OP 7    440832  1288463 2419861 3483130                     
OP 8    359480  1421128 2864498                         
OP 9    376686  1363294                             
OP 10   344014

I want to calculate the sum of each column while limiting the number of rows.

To calculate the sum of the first column, Sum(DP1), the row size should be 10-1

To calculate the sum of the second column, Sum(DP2), the row size should be 10-2

To calculate the sum of the third column, Sum(DP3), the row size should be 10-3

And so on.

The expected output is:

    3327371  10251249  15047844  18447791  17963259  15954957  12743113  8520325  3833515

I tried using a for loop:

dataframe_len = len(df1.columns)
print(dataframe_len)  # prints 10
for i in range(0, 10):
    # Here I need to find the sum of each column
    # sum('col') (row size is 10-i)
    pass

This is not only about DP1 to DP10 (10 columns); the real data has many more columns.

Thanks for your time :)

python

pandas

dataframe

triangle

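2 Answers

Stack Overflow user

Answered 2021-04-10 11:22:38

Assuming you want what your expected output shows (rather than what your description says), call sum() on each column after dropping the NA values and then skipping the last remaining value: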

df.apply(lambda col: col.dropna()[:-1].sum())

Output:

DP 1      3327371.0
DP 2     10251249.0
DP 3     15047844.0
DP 4     18447791.0
DP 5     17963259.0
DP 6     15954957.0
DP 7     12743113.0
DP 8      8520325.0
DP 9      3833515.0
DP 10           0.0

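Note: your expected sums are not over the row ranges 10-1, 10-2, 10-3, and so on, but over the ranges 9-1, 8-1, 7-1 (rows 1 to 9, 1 to 8, 1 to 7). In other words, the last non-NA value of each column is skipped, not the topmost rows.

For example, df['DP 1'].sum() is 3671385, but df['DP 1'][:-1].sum(), which skips the last row, is 3327371, and that matches the expected output. For DP 2: df['DP 2'].sum() is 11614543, df['DP 2'].dropna()[:-1].sum() is 10251249 (the value you expect), and df['DP 2'][2:10].sum() is 9253616.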
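The figures quoted in this answer can be reproduced end to end. The following is a minimal sketch, not the answerer's exact code: it rebuilds the triangle from the question (the names `rows`, `n`, and `sums` are my own), and it uses `.iloc[:-1]` instead of `[:-1]` so that the slice is explicitly positional.

```python
import numpy as np
import pandas as pd

# Triangle values copied from the question; shorter rows are padded with NaN.
rows = [
    [357848, 1124788, 1735330, 2218270, 2745596, 3319994, 3466336, 3606286, 3833515, 3901463],
    [352118, 1236139, 2170033, 3353322, 3799067, 4120063, 4647867, 4914039, 5339085],
    [290507, 1292306, 2218525, 3235179, 3985995, 4132918, 4628910, 4909315],
    [310608, 1418858, 2195047, 3757447, 4029929, 4381982, 4588268],
    [443160, 1136350, 2128333, 2897821, 3402672, 3873311],
    [396132, 1333217, 2180715, 2985752, 3691712],
    [440832, 1288463, 2419861, 3483130],
    [359480, 1421128, 2864498],
    [376686, 1363294],
    [344014],
]
n = len(rows)
df = pd.DataFrame(
    [r + [np.nan] * (n - len(r)) for r in rows],
    index=[f"OP {i}" for i in range(1, n + 1)],
    columns=[f"DP {j}" for j in range(1, n + 1)],
)

# Per column: drop NaN cells, drop the last remaining value, sum the rest.
# An empty selection (as for "DP 10") sums to 0.
sums = df.apply(lambda col: col.dropna().iloc[:-1].sum())
print(sums)
```

Because the rule is "drop the last non-NA value", this version does not depend on the column names or on the frame being a perfect triangle.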

Votes: 1

Stack Overflow user

Answered 2021-04-10 10:24:31

I think you can use the information in the column name when using apply().

def sum_row(col):
    t = int(col.name.split(' ')[-1])
    return col.iloc[:-t].sum()

df_ = df.apply(sum_row)

# print(df_)

DP 1      3327371.0
DP 2     10251249.0
DP 3     15047844.0
DP 4     18447791.0
DP 5     17963259.0
DP 6     15954957.0
DP 7     12743113.0
DP 8      8520325.0
DP 9      3833515.0
DP 10           0.0
dtype: float64
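The function above reads the number of rows to drop from the column name, so it relies on names that end in a number such as "DP 3". When the names carry no such number, a position-based variant is one option. The sketch below is an assumption-laden illustration: the helper name `sum_without_last_rows` and the 4x4 toy data are hypothetical, and it assumes the columns are ordered left to right as in the triangle.

```python
import numpy as np
import pandas as pd

# Toy 4x4 triangle with made-up values (not the data from the question).
df = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, np.nan],
        [8.0, 9.0, np.nan, np.nan],
        [10.0, np.nan, np.nan, np.nan],
    ],
    columns=["a", "b", "c", "d"],  # names deliberately contain no number
)


def sum_without_last_rows(frame):
    """Sum column at position pos after dropping its last (pos + 1) rows."""
    sums = [
        frame.iloc[: -(pos + 1), pos].sum()  # NaN cells are skipped by sum()
        for pos in range(frame.shape[1])
    ]
    return pd.Series(sums, index=frame.columns)


result = sum_without_last_rows(df)
print(result)
```

Like the name-based version, this only matches the expected output when the frame is a full triangle; for ragged data, dropping the last non-NA value per column is the safer rule.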

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/67033094
