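I am loading a local CSV file that contains data. I am trying to find the smallest float in a column that holds a mix of NaN and numbers.
I tried the numpy function np.nanmin, but it throws:
"TypeError: '<=' not supported between instances of 'str' and 'float'"
import numpy as np
import pandas as pd
database = pd.read_csv('database.csv', quotechar='"', skipinitialspace=True, delimiter=',')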
coun_weight = database[['Country of Operator/Owner', 'Launch Mass (Kilograms)']]
print(coun_weight)
lightest = np.nanmin(coun_weight['Launch Mass (Kilograms)'])
Any suggestions as to why nanmin is not working?
Link to the full CSV file: http://www.sharecsv.com/s/5aea6381d1debf75723a45aacd40abf8/database.csv
Here is a sample of my coun_weight:
      Country of Operator/Owner  Launch Mass (Kilograms)
1390                      China                      NaN
1391                      China                     1040
1392                      China                     1040
1393                      China                     2700
1394                      China                     2700
1395                      China                     1800
1396                      China                     2700
1397                      China                      NaN
1398                      China                      NaN
1399                      China                      NaN
1400                      China                      NaN
1401                      India                       92
1402                     Russia                       45
1403               South Africa                        1
1404                      China                      NaN
1405                      China                        4
1406                      China                        4
1407                      China                       12
Posted on 2017-03-21 11:32:07
I tested it; all the problematic values are:
coun_weight = pd.read_csv('database.csv')
print (coun_weight.loc[pd.to_numeric(coun_weight['Launch Mass (Kilograms)'], errors='coerce').isnull(), 'Launch Mass (Kilograms)'].dropna())
1091 5,000+
1092 5,000+
1093 5,000+
1094 5,000+
1096 5,000+
Name: Launch Mass (Kilograms), dtype: object
The solution is:
# Map the one malformed string to a number, then cast the column to float
coun_weight['Launch Mass (Kilograms)'] = (
    coun_weight['Launch Mass (Kilograms)'].replace('5,000+', 5000).astype(float)
)
print (coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])
1091 5000.0
1092 5000.0
1093 5000.0
1094 5000.0
1095 NaN
1096 5000.0
1097 6500.0
Name: Launch Mass (Kilograms), dtype: float64
To find the minimum of a column that contains NaNs, use Series.min, which skips NaNs:
print (coun_weight['Launch Mass (Kilograms)'].min())
0.0
Check whether the column really contains a 0:
a = coun_weight['Launch Mass (Kilograms)']
print (a[a == 0])
912 0.0
Name: Launch Mass (Kilograms), dtype: float64
Another possible solution is to replace these values with NaN:
# Anything that cannot be parsed as a number becomes NaN
coun_weight['Launch Mass (Kilograms)'] = pd.to_numeric(
    coun_weight['Launch Mass (Kilograms)'], errors='coerce'
)
print (coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])
1091 NaN
1092 NaN
1093 NaN
1094 NaN
1095 NaN
1096 NaN
1097 6500.0
Name: Launch Mass (Kilograms), dtype: float64
Posted on 2017-03-21 11:28:16
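The two approaches above either hard-code the one bad string or discard it. As a middle ground, here is a minimal sketch that strips the thousands separator and the trailing plus sign before converting. It uses a small made-up Series in place of the real column, and it assumes that "5,000+" should be read as 5000 and that every non-missing value is a string; neither assumption comes from the original thread.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the 'Launch Mass (Kilograms)' column.
mass = pd.Series(
    [np.nan, "1040", "5,000+", "6500", "0", np.nan],
    dtype=object,
    name="Launch Mass (Kilograms)",
)

# Assumption: "5,000+" means 5000, so remove ',' and '+' as literal characters.
cleaned = mass.str.replace(",", "", regex=False).str.replace("+", "", regex=False)

# Convert to float64; anything still unparseable would become NaN.
mass_numeric = pd.to_numeric(cleaned, errors="coerce")

lightest = mass_numeric.min()                      # Series.min skips NaN by default
lightest_np = np.nanmin(mass_numeric.to_numpy())   # nanmin works once dtype is float64

print(mass_numeric)
print(lightest, lightest_np)
```

Once the column is float64, np.nanmin and Series.min should agree, so the original call from the question can stay as it was.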
Trying to convert the column to float makes the problem explicit: "5,000+" cannot be converted to float64.
coun_weight['Launch Mass (Kilograms)'].astype('float64')
Result:
ValueError: invalid literal for float(): 5,000+
https://stackoverflow.com/questions/42917344
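To try this check without the full CSV, the following is a rough sketch on a made-up column. The helper name conversion_error is hypothetical, and the exact wording of the error message may differ between Python and pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column: NaN, numeric strings and one malformed value.
col = pd.Series([np.nan, "1040", "5,000+", "6500"], dtype=object)

def conversion_error(series):
    """Return the ValueError message raised by astype, or None if it converts."""
    try:
        series.astype("float64")
    except ValueError as exc:
        return str(exc)
    return None

# An object dtype is the first hint that strings are mixed into the column.
print(col.dtype)

message = conversion_error(col)
print(message)
```

If the helper returns None, the column converts cleanly and np.nanmin can be applied to the converted values.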