Question: In Python, how do I identify which ID's values in another column increase over time?

Stack Overflow user

Asked on 2020-08-13 19:30:02

Answers 2 · Views 124 · Followers 0 · Votes 0

Suppose I have a DataFrame with 3 columns:

| id | value |    date   |
+====+=======+===========+
|  1 |   50  |  1-Feb-19 |
+----+-------+-----------+
|  1 |  100  |  5-Feb-19 |
+----+-------+-----------+
|  1 |  200  |  6-Jun-19 |
+----+-------+-----------+
|  1 |  500  |  1-Dec-19 |
+----+-------+-----------+
|  2 |   10  |  6-Jul-19 |
+----+-------+-----------+
|  3 |  500  |  1-Mar-19 |
+----+-------+-----------+
|  3 |  200  |  5-Apr-19 |
+----+-------+-----------+
|  3 |  100  | 30-Jun-19 |
+----+-------+-----------+
|  3 |   10  | 25-Dec-19 |
+----+-------+-----------+

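The id column contains the ID of a particular person. The value column contains the value of their transaction. The date column contains the date of that transaction.

Is there a way in Python to identify ID 1 as the ID whose transaction values increase over time?

I am looking for a way to extract ID 1 as the ID I want, because the value of its transactions increases; to filter out ID 2, because it does not have enough transactions to analyze a trend; and to filter out ID 3, because its transaction values decrease over time.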

python

pandas

2 Answers

Stack Overflow user

Accepted answer

Posted on 2020-08-13 20:28:46

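import numpy as np

# Label each row by comparing it with the previous row of the same id
df['new'] = df.groupby(['id'])['value'].transform(lambda x : \
                      np.where(x.diff()>0,'increase',
                      np.where(x.diff()<0,'decrease','--')))

# Keep only the label of the last row for each id
df = df.groupby('id').new.agg(['last'])
df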

Output:

      last
id  
1   increase
2   --
3   decrease

IDs that only increase:

increasingList = df[(df['last']=='increase')].index.values
print(increasingList)

Result:

[1]

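This assumes that a sequence like the following does not occur, since only the last step of each id is inspected: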

1  50
1  100
1  50

If it can occur, then:

df['new'] = df.groupby(['id'])['value'].transform(lambda x : \
                      np.where(x.diff()>0,'increase',
                      np.where(x.diff()<0,'decrease','--')))
df

Output:

    value   new
id      
1   50  --
1   100 increase
1   200 increase
2   10  --
3   500 --
3   300 decrease
3   100 decrease

Concatenate the labels into one string per id:

df = df.groupby(['id'])['new'].apply(lambda x: ','.join(x)).reset_index()
df

Intermediate result:

    id  new
0   1   --,increase,increase
1   2   --
2   3   --,decrease,decrease

Check for rows that contain a decrease or consist only of "--", and drop them:

df = df.drop(df[df['new'].str.contains("dec")].index.values)
df = df.drop(df[(df['new']=='--')].index.values)
df

Result:

    id  new
0   1   --,increase,increase
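
The steps above (label each step with diff(), then discard any id that contains a decrease or has a single row) can also be sketched in one pass. This is not the answerer's code: the helper name `strictly_increasing_ids` and its `min_rows` parameter are hypothetical, and the sketch assumes the date strings follow the day-month-year pattern of the question's table so that they can be parsed and used to order the rows.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    "id":    [1, 1, 1, 1, 2, 3, 3, 3, 3],
    "value": [50, 100, 200, 500, 10, 500, 200, 100, 10],
    "date":  ["1-Feb-19", "5-Feb-19", "6-Jun-19", "1-Dec-19", "6-Jul-19",
              "1-Mar-19", "5-Apr-19", "30-Jun-19", "25-Dec-19"],
})
# Assumption: dates are day-abbreviated month-two digit year
df["date"] = pd.to_datetime(df["date"], format="%d-%b-%y")


def strictly_increasing_ids(frame, min_rows=2):
    """Return ids whose 'value' rises at every step when ordered by 'date'."""
    ordered = frame.sort_values(["id", "date"])
    # Step size within each id; the first row of every id is NaN
    steps = ordered.groupby("id")["value"].diff()
    # True for ids with at least one flat or falling step
    has_drop = (steps <= 0).groupby(ordered["id"]).any()
    # Ids with too few rows cannot show a trend
    sizes = ordered.groupby("id").size()
    keep = ~has_drop & (sizes >= min_rows)
    return keep[keep].index.tolist()


print(strictly_increasing_ids(df))  # [1]
```

Sorting by date inside the helper removes the reliance on the rows already being in chronological order, which the original snippets take for granted.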

Votes 1

Stack Overflow user

Posted on 2020-08-13 20:31:40

You could group by id and check whether the values end up in the same order whether the rows are sorted by value or by date:

>>> df.groupby('id').apply( lambda x:
...    (
...        x.sort_values('value', ignore_index=True)['value'] == x.sort_values('date', ignore_index=True)['value']
...    ).all()
... )
id
1     True
2     True
3    False
dtype: bool

Edit:

To make id=2 come out as False, we can do this:

>>> df.groupby('id').apply( lambda x:
...    (
...        (x.sort_values('value', ignore_index=True)['value'] == x.sort_values('date', ignore_index=True)['value'])
...        & (len(x) > 1)
...    ).all()
... )
id
1     True
2    False
3    False
dtype: bool
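
A related sketch, offered as a variation rather than as this answer's method, sorts each group by date and asks pandas whether the values are monotonic. It assumes the `date` column is first parsed to datetimes: left as strings in the table's format, the dates would sort alphabetically ("1-Dec-19" before "1-Feb-19") and the comparison would give the wrong order. The function name `rises_over_time` is hypothetical. Note that `is_monotonic_increasing` accepts ties, so repeated equal values still count as increasing here.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    "id":    [1, 1, 1, 1, 2, 3, 3, 3, 3],
    "value": [50, 100, 200, 500, 10, 500, 200, 100, 10],
    "date":  ["1-Feb-19", "5-Feb-19", "6-Jun-19", "1-Dec-19", "6-Jul-19",
              "1-Mar-19", "5-Apr-19", "30-Jun-19", "25-Dec-19"],
})
# Assumption: dates are day-abbreviated month-two digit year
df["date"] = pd.to_datetime(df["date"], format="%d-%b-%y")


def rises_over_time(values):
    """True when a group has more than one row and never decreases."""
    return len(values) > 1 and values.is_monotonic_increasing


# Sort chronologically first; groupby keeps that row order inside each group
result = df.sort_values("date").groupby("id")["value"].apply(rises_over_time)
print(result)
```

With the question's data this marks id 1 as True and ids 2 and 3 as False.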

Votes 2

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/63402045
