我正在使用excel电子表格来填充字典。然后,我使用这些值将另一个数据帧的值乘以引用,但当我尝试时,它会给我带来错误。为了避免出错,我决定把excel电子表格写在字典里,但我没有成功。我这么做是因为字典最终会变长,而且编辑键和它的值太乏味了。我正在使用Python2.7
import pandas as pd
#READ EXCEL FILE
df = pd.read_excel("C:/Users/Pedro/Desktop/dataframe.xls")
#Store the keys with its value in a dictionary. This will become df2
d = {"M1-4":0.60,"M1-5/R10":0.85,"C5-3":0.85,"M1-5/R7-3":0.85,"M1-4/R7A":0.85,"R7A":0.85,"M1-4/R6A":0.85,"M1-4/R6B":0.85,"R6A":0.85,"PARK":0.20,"M1-6/R10":0.85,"R6B":0.85,"R9":0.85,"M1-5/R9":0.85}
#Convert the dictionary to an Excel spreadsheet
df5 = pd.DataFrame.from_dict(d, orient='index')
df5.to_excel('bob_dict.xlsx')
#populatethe dictionary from the excel spreadsheet
df2 = pd.read_excel("C:/Users/Pedro/Desktop/bob_dict.xlsx")
#Convert dtframe back to a dictionary
dictionary = df2.to_dict(orient='dict')
#Pass the dictionary as reference
b = df.filter(like ='Value').values
c = df.filter(like ='ZONE').replace(dictionary).astype(float).values
df['pro_cum'] = ((c * b).sum(axis =1))当运行此命令时,我得到了ValueError:无法将R6B字符串转换为浮动。
c = df.filter(like ='ZONE').replace(d).astype(float).values但是,如果我用原始字典替换区域值,它将运行,不会出错。
投入: df
HP ZONE Value ZONE1 Value1
3 R7A 0.7009 M1-4/R6B 0.00128
2 R6A 0.5842 M1-4/R7A 0.00009
7 M1-6/R10 0.1909 M1-4/R6A 0.73576
9 R6B 0.6919 PARK 0.03459
6 PARK 1.0400 M1-4/R6A 0.33002
9.3 M1-4/R6A 0.7878 PARK 0.59700
10.6 M1-4/R6B 0.0291 R6A 0.29621
11.9 R9 0.0084 M1-4 0.00058
13.2 M1-5/R10 0.0049 M1-4 0.65568
14.5 M1-4/R7A 0.0050 C5-3 0.00096
15.8 M1-5/R7-3 0.0189 C5-3 1.59327
17.1 M1-5/R9 0.3296 M1-4/R6B 0.43918
18.4 C5-3 0.5126 R6B 0.20835
19.7 M1-4 0.5126 PARK 0.22404发布于 2018-07-01 00:38:33
我解决了我的问题。当我将字典转换为数据帧时,键就变成索引,所以当我将数据帧转换回字典时,我最终会得到一本字典。所以我不得不在替换方法中说明这一点。
{0: {'M1-4': 0.6, 'M1-5/R10': 0.85, 'C5-3': 0.85,
'M1-5/R7-3': 0.85, 'M1-4/R7A': 0.85, 'R7A': 0.85,
'M1-4/R6A': 0.85, 'M1-4/R6B': 0.85, 'R6A': 0.85,
'PARK': 0.2, 'M1-6/R10': 0.85, 'R6B': 0.85,
'R9': 0.85, 'M1-5/R9': 0.85
}
}所以我编辑了这行代码并添加了
c = df.filter(like='ZONE').replace(dictionary[0]).astype(float).values发布于 2018-06-30 17:24:13
字典d之外的一些值存在问题(错误例如R6B,但可能有更多的值),因此不可能转换浮点数。
您可以找到这个值:
#create Series from all Zone columns
vals = df.filter(like ='ZONE').replace(d).stack()
#for non numeric return NaNs, so filtering return problematic values
out = vals[pd.to_numeric(vals, errors= 'coerce').isnull()].unique()
print (out)然后添加到字典d以避免此错误。
示例:
print (df)
HP ZONE Value ZONE1 Value1
0 3.0 R7A 0.7009 M1-4/R6B 0.00128
1 2.0 R6A 0.5842 M1-4/R7A 0.00009
2 7.0 M1-6/R10 0.1909 M1-4/R6A 0.73576
3 9.0 R6B 0.6919 PARK 0.03459
4 6.0 PARK 1.0400 M1-4/R6A 0.33002
5 9.3 M1-4/R6A 0.7878 PARK 0.59700
6 10.6 M1-4/R6B 0.0291 R6A 0.29621
7 11.9 R9 0.0084 M1-4 0.00058
8 13.2 M1-5/R10 0.0049 M1-4 0.65568
9 14.5 M1-4/R7A 0.0050 C5-3 0.00096
10 15.8 M1-5/R7-3 0.0189 C5-3 1.59327
11 17.1 M1-5/R9 0.3296 M1-4/R6B 0.43918
12 18.4 C5-3 0.5126 R6B 0.20835
13 19.7 M1-4 0.5126 PARK1 0.22404 <- added PARK1 for testing
d = {"M1-4":0.60,"M1-5/R10":0.85,"C5-3":0.85,"M1-5/R7-3":0.85,"M1-4/R7A":0.85,"R7A":0.85,"M1-4/R6A":0.85,"M1-4/R6B":0.85,"R6A":0.85,"PARK":0.20,"M1-6/R10":0.85,"R6B":0.85,"R9":0.85,"M1-5/R9":0.85}
vals = df.filter(like ='ZONE').replace(d).stack()
out = vals[pd.to_numeric(vals, errors= 'coerce').isnull()].unique()
print (out)
['PARK1']https://stackoverflow.com/questions/51116957
复制相似问题