
Pandas: length of the last consecutive run in each group

Stack Overflow user
Asked 2022-09-10 17:59:55
Answers: 2 · Views: 26 · Followers: 0 · Votes: 2

I have the following pandas DataFrame:

import pandas as pd

# One row per (animal, month); "attribute" is the value whose runs we want to measure
df = pd.DataFrame(
    [
        ("bird", "2022-01", "Falconiformes"),
        ("bird", "2022-02", "Falconiformes"),
        ("bird", "2022-03", "Falconiformes"),
        ("bird", "2022-04", "Falconiformes"),
        ("bird", "2022-05", "Falconiformes"),
        ("bird", "2022-06", "Falconiformes"),
        ("bird", "2022-07", "Falconiformes"),
        ("bird", "2022-08", "Falconiformes"),
        ("bird", "2022-09", "Psittaciformes"),
        ("bird", "2022-10", "Psittaciformes"),
        ("bird", "2022-11", "Psittaciformes"),
        ("bird", "2022-12", "Psittaciformes"),
        ("mammal", "2022-01", "Falconiformes"),
        ("mammal", "2022-02", "Falconiformes"),
        ("mammal", "2022-03", "Falconiformes"),
        ("mammal", "2022-04", "Falconiformes"),
        ("mammal", "2022-05", "Falconiformes"),
        ("mammal", "2022-06", "Psittaciformes"),
        ("mammal", "2022-07", "Falconiformes"),
        ("mammal", "2022-08", "Falconiformes"),
        ("mammal", "2022-09", "Falconiformes"),
        ("mammal", "2022-10", "Falconiformes"),
        ("mammal", "2022-11", "Falconiformes"),
        ("mammal", "2022-12", "Falconiformes"),
    ],
    columns=("animal", "date", "attribute"),
)

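Now it gets more complicated. For each animal, I want the length of the most recent run of consecutive identical attribute values in that group.

The result should be: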

result = pd.DataFrame(
    [   ("bird", 'Psittaciformes' ,4),
        ("mammal", 'Falconiformes' ,6),
    ],
    columns=("animal", "attribute", "count"),
)
result

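I think this could be programmed by iterating over the groups or something similar. What I'm looking for, though, is a unicorn. This should be possible, right?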


2 Answers

Stack Overflow user

Accepted answer

Posted 2022-09-10 18:16:23

You can use groupby.agg with a custom function to compute count:

(df.groupby('animal', as_index=False)
   .agg(attribute=('attribute', 'last'),
        count=('attribute', lambda s: s.eq(s.iloc[-1])[::-1].cummin().sum())
       )
)

Output:

   animal       attribute  count
0    bird  Psittaciformes      4
1  mammal   Falconiformes      6

How the function works:

s.eq(s.iloc[-1])   # mark the values equal to the last one
[::-1]             # reverse the Series (newest row first)
.cummin()          # keep True until the first False, then False for the rest
.sum()             # count the remaining True values
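
To make the role of cummin concrete, here is a minimal sketch that runs the same steps on the attribute values of the "mammal" group from the question. The intermediate variable names (is_last, reversed_mask, tail_mask) are illustrative only and do not appear in the answer.

```python
import pandas as pd

# Attribute values of the "mammal" group from the question, in date order:
# 5 x Falconiformes, 1 x Psittaciformes, 6 x Falconiformes
s = pd.Series(["Falconiformes"] * 5 + ["Psittaciformes"] + ["Falconiformes"] * 6)

is_last = s.eq(s.iloc[-1])          # True wherever the value equals the last value
reversed_mask = is_last[::-1]       # newest row first
tail_mask = reversed_mask.cummin()  # True only for the unbroken run at the end
count = int(tail_mask.sum())        # length of that trailing run

print(int(is_last.sum()))  # 11: every Falconiformes row, including the earlier run
print(count)               # 6: only the most recent run
```

Without cummin, the sum would also include the five earlier Falconiformes rows, which is why a plain count of matching values is not enough here.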
Votes: 2

Stack Overflow user

Posted 2022-09-10 19:32:17

Another solution:

df_out = df.groupby("animal", as_index=False).apply(
    lambda x: x.groupby((x.attribute != x.attribute.shift()).cumsum())
    .agg(
        animal=("animal", "first"),
        attribute=("attribute", "first"),
        count=("animal", "count"),
    )
    .iloc[-1]
)

print(df_out)

Prints:

   animal       attribute  count
0    bird  Psittaciformes      4
1  mammal   Falconiformes      6
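
The inner groupby key, (x.attribute != x.attribute.shift()).cumsum(), assigns one id to each run of consecutive equal values. A small sketch of that idea on hypothetical toy data (the Series s and the names starts, run_id, and runs are made up for illustration):

```python
import pandas as pd

# Hypothetical toy data: three runs ("a","a"), ("b",), ("a","a","a")
s = pd.Series(["a", "a", "b", "a", "a", "a"])

# A new run starts wherever a value differs from the one before it
starts = s != s.shift()
# The running total of run starts gives each run its own id
run_id = starts.cumsum()

# One row per run: its value and its length
runs = s.groupby(run_id).agg(["first", "count"])
last_value, last_len = runs.iloc[-1]

print(run_id.tolist())             # [1, 1, 2, 3, 3, 3]
print(last_value, int(last_len))   # a 3
```

Taking .iloc[-1] of the per-run table, as the answer does inside each animal group, selects the most recent run.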
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/73674096
