我有一个与下面类似的数据框架(但它有数百只股票,而不是A和B)。我也不知道数据框中会有多少股票。我正在尝试通过按日期列匹配的所有股票对指数行进行分红(2020年5月15日的股票A按2020年5月15日的指数分红,2020年5月16日的股票A除以2020年5月16日的指数,等等,然后是2020年5月15日的股票B按2020年5月15日的指数分红,依此类推)。我在所需的列中添加了我想要的答案,但不知道如何获得它。
d = {'Stock' : pd.Series(['A', 'A', 'A','B', 'B', 'B', 'INDEX', 'INDEX', 'INDEX']),
'Date' : pd.Series(['5/15/2020', '5/16/2020', '5/17/2020','5/15/2020', \
'5/16/2020', '5/17/2020','5/15/2020','5/16/2020','5/17/2020']),
'Price' : pd.Series([10,20,30,20,40,60,2,5,10]),
'DESIRED' : pd.Series([5,4,3,10,8,6,1,1,1])}
df = pd.DataFrame(d)
df 发布于 2020-05-18 03:35:34
import pandas as pd
d = {'Stock' : pd.Series(['A', 'A', 'A','B', 'B', 'B', 'INDEX', 'INDEX', 'INDEX']),
'Date' : pd.Series(['5/15/2020', '5/16/2020', '5/17/2020','5/15/2020', \
'5/16/2020', '5/17/2020','5/15/2020','5/16/2020','5/17/2020']),
'Price' : pd.Series([10,20,30,20,40,60,2,5,10]),
'DESIRED' : pd.Series([5,4,3,10,8,6,1,1,1])}
df = pd.DataFrame(d) 这里有一个可能的解决方案:
#First we build a dataframe containing only index rows
df_index = df[df.Stock == 'INDEX']
#and we get rid of those rows from the original dataframe
df = df[df.Stock != 'INDEX']
#now we merge them
df = df.merge(df_index[['Date', 'Price']], on='Date', suffixes = ['', '_index'])
#and we simply create the new column
df['hooray!'] = df.Price/df.Price_index
#If you want you can delete the column
#del df['Price_index']输出:
Stock Date Price DESIRED Price_index hooray!
0 A 5/15/2020 10 5 2 5.0
1 B 5/15/2020 20 10 2 10.0
2 A 5/16/2020 20 4 5 4.0
3 B 5/16/2020 40 8 5 8.0
4 A 5/17/2020 30 3 10 3.0
5 B 5/17/2020 60 6 10 6.0发布于 2020-05-18 03:29:13
这应该能起到作用:
import pandas as pd
#data (NOTE: i've removed the desired column)
d = {'Stock' : pd.Series(['A', 'A', 'A','B', 'B', 'B', 'INDEX', 'INDEX', 'INDEX']),
'Date' : pd.Series(['5/15/2020', '5/16/2020', '5/17/2020','5/15/2020', \
'5/16/2020', '5/17/2020','5/15/2020','5/16/2020','5/17/2020']),
'Price' : pd.Series([10,20,30,20,40,60,2,5,10])}
#create dataframe
df = pd.DataFrame(d)
#create emoty desired column
df['DESIRED'] = ''
#create sub dataframes for stocks and indices
stocksDf = df.loc[df['Stock'] != 'INDEX'].reset_index(drop=True)
indexDf = df.loc[df['Stock'] == 'INDEX'].reset_index(drop=True)
#loop over stocks dataframe
for i, row in stocksDf.iterrows():
#define needed values
stocks = stocksDf.at[i, 'Stock']
price = stocksDf.at[i, 'Price']
date = stocksDf.at[i, 'Date']
#get index matching date of stock
matchingIndex = indexDf.loc[indexDf['Date'] == date].reset_index(drop=True)
#if doesn't exists just print no matching index
if len(matchingIndex)==0:
df['DESIRED'].loc[(df['Stock'] == stocks) & (df['Price'] == price) & (df['Date'] == date)] = 'No Matching Index'
else:
#if exists calculate Desired as Price of stock / price of index
indexPrice = matchingIndex.at[0,'Price']
df['DESIRED'].loc[(df['Stock'] == stocks) & (df['Price'] == price) & (df['Date'] == date)] = df['Price'] / indexPrice
#for indices just set desired as 1
df['DESIRED'].loc[df['Stock'] == 'INDEX'] = 1
print(df)
Stock Date Price DESIRED
0 A 5/15/2020 10 5
1 A 5/16/2020 20 4
2 A 5/17/2020 30 3
3 B 5/15/2020 20 10
4 B 5/16/2020 40 8
5 B 5/17/2020 60 6
6 INDEX 5/15/2020 2 1
7 INDEX 5/16/2020 5 1
8 INDEX 5/17/2020 10 1https://stackoverflow.com/questions/61857176
复制相似问题