我有两个数据,一个有参考数据,一个有“实验”数据。我想通过减去参考值来计算与实验值相关的误差。然而,实验的DataFrame是长形式的,并且包含相同索引的多个变量。我只想匹配索引,以便有时在减法中使用相同的参考值。这两个数据中的指数都是“反应”。
具体来说,我想在实验数据中创建两个名为"BSE“和"BSE_CP”的新列。这些值应按下列伪代码所示进行计算
experimental['BSE'] = experimental['Delta_E'] - reference['Delta_E']
experimental['BSE_CP'] = experimental['Delta_E_CP'] - reference['Delta_E']当然,我已经尝试了上面的代码,但是它返回一个ValueError:
ValueError: cannot reindex from a duplicate axis我可以做一些手工操作,循环各种基本集合并计算错误,然后将这些错误存储在临时列表中,最后将级联的数据作为一个新变量分配。下面的代码工作,,但我的(有限的)熊猫直觉告诉我,有一个更简单的方法。
bses = []
bse_cps = []
for basis in exp.Basis_set.unique():
exp_sub = exp.loc[exp.Basis_set == basis]
bse = exp_sub['Delta_E'] - ref['Delta_E']
bse_cp = exp_sub['Delta_E_CP'] - ref['Delta_E']
bses.append(bse)
bse_cps.append(bse_cp)
exp['BSE'] = pd.concat(bses, axis=0)
exp['BSE_CP'] = pd.concat(bse_cps, axis=0)来自实验数据的样本:
Basis_set Functional Delta_E Delta_E_CP BSSE
Reaction
Cr-Alkene-1 pc-3 PBE -24.950271 -24.922770 0.027485
Cr-Alkene-2 pc-3 PBE -20.674572 -20.633017 0.041541
Cr-Alkene-3 pc-3 PBE -9.621059 -9.560187 0.060868
Cr-Alkene-4 pc-3 PBE -15.913920 -15.821342 0.092578
Cr-Alkene-5 pc-3 PBE -9.925094 -9.836789 0.088305
Cr-Alkene-6 pc-3 PBE -16.365306 -16.266877 0.098429
Cr-CO pc-3 PBE -43.738982 -43.698595 0.040412
Cr-H2 pc-3 PBE -19.050313 -19.054649 -0.004336
Cr-MeCN pc-3 PBE -29.415768 -29.384396 0.031375
Cr-MeOH pc-3 PBE -18.165318 -18.120964 0.044365
Cr-THF pc-3 PBE -19.518354 -19.486973 0.031375
Cr-Water pc-3 PBE -16.643343 -16.582746 0.060617
Fe-MeOH pc-3 PBE -14.514893 -14.432698 0.082196
Ni-Alkene-1 pc-3 PBE -16.365802 -16.323111 0.042671
Ni-Alkene-2 pc-3 PBE -12.029692 -11.976059 0.053652
Ni-Alkene-3 pc-3 PBE -6.764403 -6.670935 0.093468
Ni-Alkene-4 pc-3 PBE -9.027397 -8.934491 0.092907
Ni-Alkene-5 pc-3 PBE -6.373132 -6.259096 0.114035
Ni-Alkene-6 pc-3 PBE -9.282549 -9.182826 0.099723
Ni-CO pc-3 PBE -29.330640 -29.271458 0.059174
Ni-MeCN pc-3 PBE -16.075560 -16.034989 0.040600
Ni-MeOH pc-3 PBE -7.261460 -7.210546 0.050891
Ni-NHC-1 pc-3 PBE -36.680622 -36.615234 0.065388
Ni-NHC-2 pc-3 PBE -36.232223 -36.115631 0.116592
Ni-THF pc-3 PBE -8.198476 -8.157920 0.040537
Ni-Water pc-3 PBE -6.052186 -5.988283 0.063902
Cr-Alkene-1 6-311++G(2df,2pd) PBE -25.843776 -24.979298 0.864478
Cr-Alkene-2 6-311++G(2df,2pd) PBE -22.012592 -20.741707 1.270885
Cr-Alkene-3 6-311++G(2df,2pd) PBE -11.692782 -9.797260 1.895522
Cr-Alkene-4 6-311++G(2df,2pd) PBE -17.853916 -15.858710 1.995206
Cr-Alkene-5 6-311++G(2df,2pd) PBE -12.365642 -10.000622 2.365020
Cr-Alkene-6 6-311++G(2df,2pd) PBE -18.460674 -16.333490 2.127184
Cr-CO 6-311++G(2df,2pd) PBE -44.629594 -43.514245 1.115349
Cr-H2 6-311++G(2df,2pd) PBE -19.422439 -19.074368 0.348071
Cr-MeCN 6-311++G(2df,2pd) PBE -30.350801 -29.453226 0.897575
Cr-MeOH 6-311++G(2df,2pd) PBE -19.176455 -18.105223 1.071232
Cr-THF 6-311++G(2df,2pd) PBE -20.776291 -19.514903 1.261388
Cr-Water 6-311++G(2df,2pd) PBE -17.581627 -16.548968 1.032659
Fe-MeOH 6-311++G(2df,2pd) PBE -15.773194 -14.295214 1.477980
Ni-Alkene-1 6-311++G(2df,2pd) PBE -17.343889 -16.455705 0.888184
Ni-Alkene-2 6-311++G(2df,2pd) PBE -13.206122 -12.113601 1.092521
Ni-Alkene-3 6-311++G(2df,2pd) PBE -8.452805 -6.882294 1.570512
Ni-Alkene-4 6-311++G(2df,2pd) PBE -10.640900 -9.033991 1.606909
Ni-Alkene-5 6-311++G(2df,2pd) PBE -8.377379 -6.450635 1.926744
Ni-Alkene-6 6-311++G(2df,2pd) PBE -11.016182 -9.303782 1.712400
Ni-CO 6-311++G(2df,2pd) PBE -30.283896 -29.304677 0.979219
Ni-MeCN 6-311++G(2df,2pd) PBE -16.837946 -16.065847 0.772099
Ni-MeOH 6-311++G(2df,2pd) PBE -8.014220 -7.085813 0.928407
Ni-NHC-1 6-311++G(2df,2pd) PBE -38.170826 -36.886904 1.283922
Ni-NHC-2 6-311++G(2df,2pd) PBE -38.598700 -36.387356 2.211343
Ni-THF 6-311++G(2df,2pd) PBE -9.091911 -8.048848 1.043063
Ni-Water 6-311++G(2df,2pd) PBE -6.754093 -5.892542 0.861551参考数据:
Delta_E
Reaction
Cr-Alkene-1 -24.984980
Cr-Alkene-2 -20.698715
Cr-Alkene-3 -9.620706
Cr-Alkene-4 -15.898494
Cr-Alkene-5 -9.984087
Cr-Alkene-6 -16.350411
Cr-Water -16.612333
Cr-MeOH -18.159461
Cr-THF -19.541941
Cr-MeCN -29.429611
Cr-CO -43.758283
Cr-H2 -19.092310
Ni-Alkene-1 -16.326735
Ni-Alkene-2 -11.955749
Ni-Alkene-3 -6.644702
Ni-Alkene-4 -8.922958
Ni-Alkene-5 -6.323173
Ni-Alkene-6 -9.171335
Ni-Water -5.925627
Ni-MeOH -7.149769
Ni-THF -8.095105
Ni-MeCN -15.941426
Ni-CO -29.236219
Ni-NHC-1 -36.582247
Ni-NHC-2 -36.093587
Fe-MeOH -14.469599期望输出
Basis_set Functional Delta_E Delta_E_CP BSSE BSE BSE_CP
Reaction
Cr-Alkene-1 6-31+G(d) PBE -28.366635 -26.271858 2.094777 -3.381654 -1.286877
Cr-Alkene-2 6-31+G(d) PBE -24.810519 -21.984532 2.825986 -4.111804 -1.285817
Cr-Alkene-3 6-31+G(d) PBE -14.328868 -10.097466 4.231402 -4.708163 -0.476760
Cr-Alkene-4 6-31+G(d) PBE -21.041370 -16.296561 4.744809 -5.142876 -0.398067
Cr-Alkene-5 6-31+G(d) PBE -15.232350 -9.631952 5.600398 -5.248263 0.352135
...发布于 2021-02-13 09:31:33
我建议您将实验数据框架与反应Id上的参考数据框架合并。
import pandas as pd
import numpy as np
mergedData= pd.merge(ref,exp_sub, how='left' ,on='Reaction', suffixes=('_ref', '_exp'),indicator ='Exists')由于在两个数据帧中都有同名的列Delta_E,所以可以在合并时为列名指定后缀。这意味着合并的结果将有两列Delta_E_ref和Delta_E_exp。最后,该指示符存在,当反应id位于两个数据帧中时,它的值将为“二者”,这是您要减除的地方:
mergedData['bse']=np.nan
mergedData['bse_cp']=np.nan
mergedData['bse'] = np.where(mergedData['Exists']=='both',mergedData['Delta_E_ref'] - mergedData['Delta_E_exp'] , np.nan)
mergedData['bse_cp '] = np.where(mergedData['Exists']=='both',mergedData['Delta_E_CP'] - mergedData['Delta_E_exp'] , np.nan)
mergedData.drop('Exists',axis=1, inplace=True) ## droping the Exists column 如果您想知道更多的话,这是来自于合并函数的熊猫库的链接:https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
https://stackoverflow.com/questions/66183366
复制相似问题