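I have monthly returns for the market and for several industries. I want to compute each industry's excess return by subtracting the market return from the industry return, and then compute the correlations between the excess returns of all industries.
My data has 9 series: one (monthly) date series and 8 monthly return series. I want to append 7 new series, each the difference between one of series 3 to 9 and series 2 (i.e. series10 is series3 - series2, series11 is series4 - series2, and so on). The new series should carry the same labels as the originals, prefixed with "Excess". I can do this one series at a time, i.e. df["series10"] = df["series3"] - df["series2"], but how do I do it with a "for" statement and numeric references to the series? Also, how do I compute the correlation between two series in the data? Thanks in advance.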
Posted 2015-12-17 17:18:10
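Here is an alternative that avoids the loop.
See the use of diff() on axis=1, which takes each column minus the column to its left. To subtract one fixed column from all the others instead, see the delta2 variant after the output.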
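import numpy as np
import pandas as pd

# 12 month-end dates; 'M' is the month-end alias (pandas 2.2+ spells it 'ME')
rng = pd.date_range('2015-1-1', periods=12, freq='M')
# 8 columns of random numbers standing in for monthly returns
data = pd.DataFrame(np.random.rand(12, 8), index=rng)
data.index.name = 'month'
# each column minus the column to its left; drop the all-NaN first column
delta = data.diff(axis=1).iloc[:, 1:]
delta.columns = ['Excess_' + str(col) for col in delta.columns]
data.join(delta)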
0 1 2 3 4 5 6 7 Excess_1 Excess_2 Excess_3 Excess_4 Excess_5 Excess_6 Excess_7
month
2015-01-31 0.995529 0.528600 0.165824 0.903643 0.392386 0.997586 0.532741 0.465801 -0.466929 -0.362776 0.737819 -0.511257 0.605200 -0.464845 -0.066939
2015-02-28 0.105747 0.507735 0.264120 0.911261 0.961350 0.139388 0.756352 0.241203 0.401989 -0.243615 0.647140 0.050090 -0.821962 0.616964 -0.515149
2015-03-31 0.239546 0.537783 0.710753 0.317866 0.194260 0.774347 0.026830 0.652135 0.298237 0.172970 -0.392887 -0.123606 0.580087 -0.747517 0.625305
2015-04-30 0.453483 0.470196 0.340318 0.570760 0.163147 0.125921 0.074989 0.082275 0.016714 -0.129878 0.230442 -0.407613 -0.037226 -0.050933 0.007287
2015-05-31 0.099153 0.182511 0.676164 0.036362 0.026314 0.274792 0.961327 0.162986 0.083357 0.493653 -0.639801 -0.010049 0.248479 0.686534 -0.798341
2015-06-30 0.929498 0.401576 0.682311 0.831759 0.338765 0.147514 0.208116 0.358427 -0.527922 0.280735 0.149448 -0.492994 -0.191251 0.060603 0.150311
2015-07-31 0.030018 0.320987 0.031405 0.248800 0.988799 0.202371 0.882598 0.384514 0.290969 -0.289582 0.217395 0.739999 -0.786428 0.680226 -0.498083
2015-08-31 0.147542 0.672995 0.318547 0.279269 0.489103 0.808526 0.225413 0.004063 0.525453 -0.354447 -0.039278 0.209834 0.319423 -0.583114 -0.221349
2015-09-30 0.663309 0.784415 0.460139 0.792484 0.114094 0.731929 0.810777 0.381041 0.121106 -0.324276 0.332345 -0.678390 0.617835 0.078848 -0.429736
2015-10-31 0.638421 0.705389 0.022883 0.147137 0.876246 0.868816 0.902057 0.030144 0.066968 -0.682506 0.124254 0.729109 -0.007430 0.033241 -0.871913
2015-11-30 0.468480 0.888482 0.061717 0.352941 0.508728 0.905883 0.267931 0.680066 0.420003 -0.826766 0.291225 0.155786 0.397155 -0.637952 0.412135
2015-12-31 0.373209 0.891520 0.915866 0.979559 0.718712 0.421039 0.182262 0.460243 0.518311 0.024345 0.063693 -0.260847 -0.297673 -0.238777 0.277982
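To make concrete what the snippet above computes, and how it differs from the delta2 variant that follows, here is a minimal sketch on made-up numbers. The frame `toy` and its values are assumptions for illustration, not part of the answer. It contrasts `diff(axis=1)`, which works on neighbouring columns, with `sub(..., axis=0)`, which subtracts one fixed column from the rest.

```python
import pandas as pd

# Made-up numbers: column 0 plays the market, columns 1 and 2 two industries.
toy = pd.DataFrame({0: [1.0, 2.0], 1: [3.0, 5.0], 2: [6.0, 9.0]})

# diff(axis=1): each column minus its left-hand neighbour (first column is NaN).
neighbour_diff = toy.diff(axis=1).iloc[:, 1:]

# sub(..., axis=0): every remaining column minus column 0, aligned on the row index.
vs_first = toy.iloc[:, 1:].sub(toy[0], axis=0)

print(neighbour_diff)
print(vs_first)
```

The two results agree only for column 1, whose left-hand neighbour happens to be column 0. For excess returns over the market, the fixed-column subtraction is the one that matches the question.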
# if you want to subtract column 0, from column 1 to 7
# we will call that delta2
# I like to use the methods: add(), sub(), mul() etc.
# The key thing is that data[0] becomes a series and broadcasts across the frame, but the index labels on the row axis connect up.
#
delta2 = data.iloc[:,1:].sub(data[0],axis=0)
delta2.columns = ['Excess_' + str(col) for col in delta2.columns]
data.join(delta2)

Posted 2015-12-17 16:36:14
You can use a simple concat to put all the series into one DataFrame, then use a for loop to perform the operation and create new columns as needed.
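Applied to the layout described in the question (a date column, a market column, then the industry columns), the loop might look like the sketch below. The column names and numbers are invented for illustration; only the positions (second column as the market, third column onward as industries) come from the question.

```python
import pandas as pd

# Hypothetical layout: date, market, then industry return columns.
df = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-31", "2015-02-28", "2015-03-31"]),
    "market": [0.010, 0.020, -0.010],
    "tech": [0.030, 0.010, 0.000],
    "energy": [0.000, 0.050, -0.020],
})

# Second column, referenced by position rather than by name.
market = df.iloc[:, 1]

# Snapshot the industry labels first so the loop does not walk over the columns it adds.
for col in list(df.columns[2:]):
    df["Excess_" + col] = df[col] - market

print(df.columns.tolist())
```

Each new column keeps the original label with an `Excess_` prefix, which is the naming the question asks for.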
In [1]: import pandas as pd
# Create dummy data for illustration
In [2]: s_names = pd.Series(['a','b','c','d','f'], name = 'name')
In [3]: s1 = pd.Series([1,2,3,4,5], name = 's1')
In [4]: s2 = pd.Series([10,20,30,40,50], name='s2')
In [5]: s3 = pd.Series([100,200,300,400,500], name='s3')
# Use concat to create a new DataFrame consisting of all the series
In [6]: data = pd.concat([s_names, s1, s2, s3], axis=1)
In [7]: data_columns = data.columns
In [8]: from itertools import combinations
# Generate the combinations of columns on which to perform the operation.
# You can also supply a custom list here.
In [9]: comb = list(combinations(data_columns[1:], 2))
In [10]: for c in comb: data[c[1] + "_" + c[0]] = data[c[1]] - data[c[0]]
In [11]: data
Out[11]:
name s1 s2 s3 s2_s1 s3_s1 s3_s2
0 a 1 10 100 9 99 90
1 b 2 20 200 18 198 180
2 c 3 30 300 27 297 270
3 d 4 40 400 36 396 360
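4 f 5 50 500 45 495 450

For the correlation, you can use pandas.DataFrame.corr(). In pandas 2.0 and later, pass numeric_only=True so that the non-numeric name column is skipped.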
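For the question's use case, the correlation could be restricted to the excess-return columns. The following sketch assumes those columns carry an `Excess_` prefix; the column names and values are made up for illustration.

```python
import pandas as pd

# Made-up data: a market column plus three hypothetical excess-return columns.
df = pd.DataFrame({
    "market": [0.01, 0.02, -0.01, 0.00],
    "Excess_tech": [0.02, -0.01, 0.01, 0.03],
    "Excess_energy": [-0.01, 0.03, -0.01, 0.00],
    "Excess_banks": [0.01, 0.00, 0.02, 0.01],
})

# Keep only the columns whose label contains the prefix.
excess = df.filter(like="Excess_")

# Pairwise Pearson correlation matrix across the excess-return columns.
corr_matrix = excess.corr()

# Correlation between just two series.
pair = df["Excess_tech"].corr(df["Excess_energy"])

print(corr_matrix.round(3))
print(round(pair, 3))
```

`DataFrame.corr()` returns the full matrix, while `Series.corr()` gives a single number for one pair of series.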
In [12]: data.corr()
Out[12]:
s1 s2 s3 s2_s1 s3_s1 s3_s2
s1 1 1 1 1 1 1
s2 1 1 1 1 1 1
s3 1 1 1 1 1 1
s2_s1 1 1 1 1 1 1
s3_s1 1 1 1 1 1 1
s3_s2 1 1 1 1 1 1

https://stackoverflow.com/questions/34339111