Pandas DataFrame: .groupby()

Stack Overflow user

Asked 2022-05-31 13:49:07

Answers: 1, Views: 26, Followers: 0, Votes: 0

I am looking at NFT returns. I have a dataset that contains repeated transactions for specific asset IDs:

df_new = RSR.reset_index(drop=True)
print(df_new.head())

Here is the output:
 
   Asset ID     Collection        Date  Transaction price (USD)
0  10302582           axie  29/01/2020                   3.1159
1  10302582           axie  29/01/2020                   2.4535
2  10406110  cryptokitties  07/01/2020                   1.4192
3  10406110  cryptokitties  22/01/2020                   0.8415
4  10424431           axie  02/01/2020                   1.5289
...

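The number of transactions for a single ID ranges from 2 to n.

I am trying to get the output shown in the attached picture: link to the ideal output.

Basically, I want one row for each completed transaction, so that I can calculate the returns.

When there are only two transactions, I manage to get a very similar output.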

c = df_new["Asset ID"]
RSR_clean = df_new.set_index([c, df_new.groupby(c).cumcount() + 1]).unstack().sort_index(axis=1, level=1)

The output is:

        Asset ID             Collection        Date Transaction price (USD)  \
                1                      1           1                       1    
Asset ID                                                                        
10302582  10302582                   axie  29/01/2020                  3.1159   
10406110  10406110          cryptokitties  07/01/2020                  1.4192   
10424431  10424431                   axie  02/01/2020                  1.5289   
1060112,  1060112,          cryptokitties  02/01/2020                 15.6885   
1092364,  1092364,                   axie  14/01/2020                165.9554   
...            ...                    ...         ...                     ...   
919066,   919066,           cryptokitties  10/01/2020                  1.3781   
9533256,  9533256,  cryptovoxel-wearables  21/01/2020                  0.8485   
971380,   971380,           cryptokitties  09/01/2020                 20.8469   
987084,   987084,           cryptokitties  03/01/2020                 16.1089   
992882,   992882,           cryptokitties  02/01/2020                 15.0981   

          Asset ID             Collection        Date Transaction price (USD)  \
                2                      2           2                       2    
Asset ID                                                                        
10302582  10302582                   axie  29/01/2020                  2.4535   
10406110  10406110          cryptokitties  22/01/2020                  0.8415   
10424431  10424431                   axie  14/01/2020                  3.1532   
1060112,  1060112,          cryptokitties  07/01/2020                 27.5083   
1092364,  1092364,                   axie  14/01/2020                165.9554

Note: in effect, this is a set of columns for each asset ID.
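A minimal sketch of what the `cumcount()` call above contributes, assuming the five sample rows printed earlier in the question (the `Asset ID` values are assumed to be integers here). It is an illustration only, not part of the original code:

```python
import pandas as pd

# Sample rows copied from the printed head() output in the question.
df_new = pd.DataFrame(
    {
        "Asset ID": [10302582, 10302582, 10406110, 10406110, 10424431],
        "Collection": ["axie", "axie", "cryptokitties", "cryptokitties", "axie"],
        "Date": ["29/01/2020", "29/01/2020", "07/01/2020", "22/01/2020", "02/01/2020"],
        "Transaction price (USD)": [3.1159, 2.4535, 1.4192, 0.8415, 1.5289],
    }
)

# cumcount() numbers the rows of each group 0, 1, 2, ... in their current order;
# adding 1 gives the 1-based transaction number used as the second index level.
transaction_no = df_new.groupby("Asset ID").cumcount() + 1
print(transaction_no.tolist())  # [1, 2, 1, 2, 1]
```

Because `unstack()` turns each distinct transaction number into its own block of columns, a third or fourth transaction widens the table instead of adding rows.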

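However, I cannot find a way to do this when an asset has more than two transactions. With my current code, the extra transactions are simply added as new columns.

What I want to achieve is for every third, fourth, fifth, etc. transaction to become a new row. In these new rows, the third and fourth columns should hold the information from the preceding transaction.

Do you know how I could achieve this layout? Many thanks!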

pandas

dataframe

1 Answer

Stack Overflow user

Accepted answer

Posted 2022-05-31 15:52:03

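If I understand the question, you have a DataFrame of transactions in which each asset (the "Asset ID" column) has two or more rows of data. You want to join each row to the chronologically next row for the same asset in order to find the return on each transaction.

I think you need to create a "Transaction ID" column that numbers each asset's transactions from 0 to n-1 in date order, and then self-join on that column. Using your code, something like this should work.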

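import pandas as pd

df = RSR.reset_index(drop=True)

# Parse the dates so that the sort is chronological rather than alphabetical
# (assumes "Date" holds DD/MM/YYYY strings, as the printed output suggests)
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Sort the data ready for processing
df.sort_values(["Asset ID", "Date"], inplace=True)

# Add a Transaction ID column for the purchase (0, 1, 2, ... within each asset)
df["Transaction ID"] = df.groupby("Asset ID").cumcount()

# Create a copy of the DataFrame for joining
df_sell = df.copy()

# Shift the sale's Transaction ID down by one so that it matches the preceding purchase
df_sell["Transaction ID"] = df_sell["Transaction ID"] - 1

# Join the two DataFrames: each purchase is paired with the next transaction of the same asset
df = df.merge(df_sell, on=["Asset ID", "Transaction ID"], suffixes=(" Buy", " Sell"))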
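A self-contained sketch of the same self-join idea, in case it helps to see it run end to end. The sample frame is hypothetical: it reuses four rows from the question and adds an assumed third transaction for asset 10406110 (03/02/2020 at 1.2000 USD) so that the more-than-two case is exercised. It also assumes that `Date` holds DD/MM/YYYY strings.

```python
import pandas as pd

# Hypothetical sample: four rows from the question plus one invented
# third transaction for asset 10406110 (illustration only).
df = pd.DataFrame(
    {
        "Asset ID": [10302582, 10302582, 10406110, 10406110, 10406110],
        "Date": ["29/01/2020", "29/01/2020", "07/01/2020", "22/01/2020", "03/02/2020"],
        "Transaction price (USD)": [3.1159, 2.4535, 1.4192, 0.8415, 1.2000],
    }
)

# Parse the dates so that sorting is chronological.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
df = df.sort_values(["Asset ID", "Date"]).reset_index(drop=True)

# Number the transactions of each asset 0, 1, 2, ...
df["Transaction ID"] = df.groupby("Asset ID").cumcount()

# Copy for the sale side, shifted down by one so that sale k+1 meets purchase k.
sell = df.copy()
sell["Transaction ID"] -= 1

# Inner join: the last transaction of each asset has no later sale and drops out
# of the purchase side.
pairs = df.merge(sell, on=["Asset ID", "Transaction ID"], suffixes=(" Buy", " Sell"))
print(pairs)
```

With this sample, the asset with three transactions yields two rows (first to second, second to third), which is the one-row-per-completed-transaction layout the question asks for.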

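This should output a table similar to the one you expect, except that instead of the 1 and 2 labels, every column other than the join keys will have a " Buy" or " Sell" suffix.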
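For the return calculation itself, one possible alternative is to skip the merge and use `groupby().shift(-1)` to pull the next transaction of each asset onto the current row. This is a different technique from the self-join, shown here only as a sketch. The sample rows are hypothetical (the third transaction for asset 10406110 is invented), the column names `Sell date`, `Sell price (USD)` and `Return` are made up for illustration, and the simple return `sell / buy - 1` is an assumed definition.

```python
import pandas as pd

# Hypothetical sample: four rows from the question plus one invented
# third transaction for asset 10406110 (illustration only).
df = pd.DataFrame(
    {
        "Asset ID": [10302582, 10302582, 10406110, 10406110, 10406110],
        "Date": ["29/01/2020", "29/01/2020", "07/01/2020", "22/01/2020", "03/02/2020"],
        "Transaction price (USD)": [3.1159, 2.4535, 1.4192, 0.8415, 1.2000],
    }
)
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
df = df.sort_values(["Asset ID", "Date"])

price = "Transaction price (USD)"

# Within each asset, shift(-1) brings the following transaction onto this row.
df["Sell date"] = df.groupby("Asset ID")["Date"].shift(-1)
df["Sell price (USD)"] = df.groupby("Asset ID")[price].shift(-1)

# The last transaction of each asset has no later sale, so drop it.
trades = df.dropna(subset=["Sell price (USD)"]).copy()

# Assumed definition: simple return between consecutive transactions.
trades["Return"] = trades["Sell price (USD)"] / trades[price] - 1
print(trades)
```

Each remaining row describes one holding period, with the original columns as the purchase side and the shifted columns as the sale side.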

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/72449027
