Q: Dividing multiple rows by a single group's values in a pandas DataFrame

Stack Overflow user

Asked 2020-05-18 02:57:55

Answers 2 · Views 82 · Followers 0 · Votes 0

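I have a DataFrame similar to the one below (but with hundreds of stocks rather than just A and B). I also don't know in advance how many stocks the DataFrame will contain. I am trying to divide every stock's rows by the INDEX rows, matched on the Date column (stock A on 5/15/2020 divided by INDEX on 5/15/2020, stock A on 5/16/2020 divided by INDEX on 5/16/2020, and so on, then stock B on 5/15/2020 divided by INDEX on 5/15/2020, etc.). I have put the result I want in the DESIRED column, but I don't know how to compute it.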

import pandas as pd

d = {'Stock': pd.Series(['A', 'A', 'A', 'B', 'B', 'B', 'INDEX', 'INDEX', 'INDEX']),
     'Date': pd.Series(['5/15/2020', '5/16/2020', '5/17/2020', '5/15/2020',
                        '5/16/2020', '5/17/2020', '5/15/2020', '5/16/2020', '5/17/2020']),
     'Price': pd.Series([10, 20, 30, 20, 40, 60, 2, 5, 10]),
     'DESIRED': pd.Series([5, 4, 3, 10, 8, 6, 1, 1, 1])}
df = pd.DataFrame(d)
df

python

pandas

2 Answers

Stack Overflow user

Accepted answer

Posted 2020-05-18 03:35:34

import pandas as pd

d = {'Stock' : pd.Series(['A', 'A', 'A','B', 'B', 'B', 'INDEX', 'INDEX', 'INDEX']), 
 'Date' : pd.Series(['5/15/2020', '5/16/2020', '5/17/2020','5/15/2020', \
                     '5/16/2020', '5/17/2020','5/15/2020','5/16/2020','5/17/2020']),
 'Price' : pd.Series([10,20,30,20,40,60,2,5,10]),
'DESIRED' : pd.Series([5,4,3,10,8,6,1,1,1])}
df = pd.DataFrame(d)

Here is one possible solution:

#First we build a dataframe containing only index rows
df_index = df[df.Stock == 'INDEX']

#and we get rid of those rows from the original dataframe
df = df[df.Stock != 'INDEX']

#now we merge them
df = df.merge(df_index[['Date', 'Price']], on='Date', suffixes = ['', '_index'])

#and we simply create the new column
df['hooray!'] = df.Price/df.Price_index

#If you want you can delete the column
#del df['Price_index']

Output:

  Stock       Date  Price  DESIRED  Price_index  hooray!
0     A  5/15/2020     10        5            2      5.0
1     B  5/15/2020     20       10            2     10.0
2     A  5/16/2020     20        4            5      4.0
3     B  5/16/2020     40        8            5      8.0
4     A  5/17/2020     30        3           10      3.0
5     B  5/17/2020     60        6           10      6.0
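The merge above drops the INDEX rows, whereas the DESIRED column in the question keeps them with a value of 1. As a possible variation (a sketch, not part of the original answer), the INDEX prices can be turned into a Series keyed by date and looked up with `Series.map`, which leaves every row in its original position. The column name `Ratio` is made up for illustration, and the sketch assumes each date has at most one INDEX row.

```python
import pandas as pd

df = pd.DataFrame({
    "Stock": ["A", "A", "A", "B", "B", "B", "INDEX", "INDEX", "INDEX"],
    "Date": ["5/15/2020", "5/16/2020", "5/17/2020"] * 3,
    "Price": [10, 20, 30, 20, 40, 60, 2, 5, 10],
})

# Lookup table: one INDEX price per date.
# Assumption: each date appears at most once among the INDEX rows;
# map() needs a uniquely indexed Series.
index_price = df.loc[df["Stock"] == "INDEX"].set_index("Date")["Price"]

# Look up the INDEX price for each row's date, then divide.
# A date with no INDEX row maps to NaN, so its ratio is NaN too.
# "Ratio" is a hypothetical column name chosen for this sketch.
df["Ratio"] = df["Price"] / df["Date"].map(index_price)

print(df)
```

Because the INDEX rows are divided by themselves, they come out as 1.0 without any special handling.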

Votes 0

Stack Overflow user

Posted 2020-05-18 03:29:13

This should do the trick:

import pandas as pd

#data (NOTE: i've removed the desired column)
d = {'Stock' : pd.Series(['A', 'A', 'A','B', 'B', 'B', 'INDEX', 'INDEX', 'INDEX']), 
'Date' : pd.Series(['5/15/2020', '5/16/2020', '5/17/2020','5/15/2020', \
                 '5/16/2020', '5/17/2020','5/15/2020','5/16/2020','5/17/2020']),
'Price' : pd.Series([10,20,30,20,40,60,2,5,10])}

#create dataframe
df = pd.DataFrame(d)
#create empty desired column
df['DESIRED'] = ''

#create sub dataframes for stocks and indices
stocksDf = df.loc[df['Stock'] != 'INDEX'].reset_index(drop=True)
indexDf = df.loc[df['Stock'] == 'INDEX'].reset_index(drop=True)
#loop over stocks dataframe
for i, row in stocksDf.iterrows():
    #define needed values
    stocks = stocksDf.at[i, 'Stock']
    price = stocksDf.at[i, 'Price']
    date = stocksDf.at[i, 'Date']
    #get index matching date of stock
    matchingIndex = indexDf.loc[indexDf['Date'] == date].reset_index(drop=True)
    #select the rows of the original dataframe that correspond to this stock row
    mask = (df['Stock'] == stocks) & (df['Price'] == price) & (df['Date'] == date)
    #if no index row exists for this date, record that there is no matching index
    if len(matchingIndex)==0:
        df.loc[mask, 'DESIRED'] = 'No Matching Index'
    else:
        #if it exists, calculate DESIRED as price of stock / price of index
        indexPrice = matchingIndex.at[0,'Price']
        df.loc[mask, 'DESIRED'] = price / indexPrice

#for indices just set DESIRED to 1
#(df.loc[rows, column] avoids chained assignment, which may not write back to df)
df.loc[df['Stock'] == 'INDEX', 'DESIRED'] = 1

print(df)

   Stock       Date  Price DESIRED
0      A  5/15/2020     10       5
1      A  5/16/2020     20       4
2      A  5/17/2020     30       3
3      B  5/15/2020     20      10
4      B  5/16/2020     40       8
5      B  5/17/2020     60       6
6  INDEX  5/15/2020      2       1
7  INDEX  5/16/2020      5       1
8  INDEX  5/17/2020     10       1
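With hundreds of stocks, a row-by-row loop like the one above can get slow. One vectorized alternative (a sketch under stated assumptions, not the answerer's method) is to pivot the data so that each stock is a column and each date is a row, then divide every column by the INDEX column with `DataFrame.div`. The extra row for stock A on 5/18/2020 is invented here to show what happens when a date has no INDEX price, and the sketch assumes each (Date, Stock) pair occurs only once, which `pivot` requires.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row
# (stock A on 5/18/2020) for which no INDEX price exists.
df = pd.DataFrame({
    "Stock": ["A", "A", "A", "B", "B", "B", "INDEX", "INDEX", "INDEX", "A"],
    "Date": ["5/15/2020", "5/16/2020", "5/17/2020"] * 3 + ["5/18/2020"],
    "Price": [10, 20, 30, 20, 40, 60, 2, 5, 10, 40],
})

# Reshape to one row per date and one column per stock.
# Assumption: each (Date, Stock) pair is unique, otherwise pivot raises.
wide = df.pivot(index="Date", columns="Stock", values="Price")

# Divide every stock column by the INDEX column, aligned on the Date index.
# Where INDEX is missing for a date, the result is NaN instead of a message.
ratios = wide.div(wide["INDEX"], axis=0)

print(ratios)
```

Rows where the ratio is NaN play the role of the 'No Matching Index' marker in the loop version. If the long layout is needed again, `ratios.stack()` is one way to get back to one value per (Date, Stock) pair.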

Votes 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/61857176
