import numpy as np
import pandas as pd

cols = [2, 4, 6, 8, 10, 12, 14, 16, 18]  # the columns I want to work with
df = pd.read_csv('mywork.csv')
df1 = df.iloc[:, cols]
b = np.array(df1)
b
Result:
array([['WV5 6NY', 'RE4 9VU', 'BU4 N90', 'TU3 5RE', 'NE5 4F'],
       ['SA8 7TA', 'BA31 0PO', 'DE3 2FP', 'LR98 4TS', nan],
       ['MN0 4NU', 'RF5 5FG', 'WA3 0MN', 'EA15 8RE', 'BE1 4RE'],
       ['SB7 0ET', 'SA7 0SB', 'BT7 6NS', 'TA9 0LP', nan]], dtype=object)
a = np.concatenate(b)  # concatenated to get a single array; this worked well
print(np.sort(a))  # to sort alphabetically
It gave me the error **AxisError: axis -1 is out of bounds for array of dimension 0**
I also tried a.sort(), which gives **TypeError: '<' not supported between instances of 'float' and 'str'**

The above comes from a CSV file containing lists of postcodes for different people, who travel from one postcode to another for different jobs; a person may travel to up to 5 postcodes a day. Using a NumPy array, I got the lists of postcodes.
I then concatenated them into one big list of postcodes, which I want to sort alphabetically, but it keeps giving me errors.
Please, can someone help?
Posted on 2021-11-05 00:00:15
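As mentioned in the comments, this error is caused by comparing nan (a float) with a string. The NumPy array cannot be sorted as it stands; to fix this, pass its values through a list so that the nan entries can be dropped before sorting.
Convert the array's values to a list, remove the nan entries, then sort: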
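The failing comparison can be reproduced in isolation. This is only a minimal sketch with a made-up three-element array (`values` is not from the question); it shows that a single float nan among strings is enough to make the sort raise:

```python
import numpy as np

# Hypothetical miniature of the flattened postcode array: two strings and one float nan
values = np.array(['WV5 6NY', np.nan, 'BA31 0PO'], dtype=object)

try:
    values.sort()  # object arrays are sorted with Python's '<' on the elements
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

With `dtype=object`, NumPy defers to Python's own comparison, and Python refuses to order a `float` against a `str`, hence the TypeError.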
import numpy as np
from numpy import nan

# Get the data (in your scenario, this would be achieved by reading from your file)
b = np.array([['WV5 6NY', 'RE4 9VU', 'BU4 N90', 'TU3 5RE', 'NE5 4F'],
              ['SA8 7TA', 'BA31 0PO', 'DE3 2FP', 'LR98 4TS', nan],
              ['MN0 4NU', 'RF5 5FG', 'WA3 0MN', 'EA15 8RE', 'BE1 4RE'],
              ['SB7 0ET', 'SA7 0SB', 'BT7 6NS', 'TA9 0LP', nan]], dtype=object)
# Flatten
a = np.concatenate(b)
# Remove nan values - with dtype=object they are still float nan (not the
# string 'nan') after concatenation, so keep only the string entries
a = np.array([x for x in a if isinstance(x, str)])
# Finally, sort
a.sort()

https://stackoverflow.com/questions/69846411
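The same filtering can probably be done without the list comprehension by masking with `pd.isna`. The following is a rough sketch, not the answerer's method: `df1` here is a stand-in built from the sample values shown in the question rather than the real CSV, and `flat` and `postcodes` are names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for df.iloc[:, cols]; values copied from the question's output
df1 = pd.DataFrame([
    ['WV5 6NY', 'RE4 9VU', 'BU4 N90', 'TU3 5RE', 'NE5 4F'],
    ['SA8 7TA', 'BA31 0PO', 'DE3 2FP', 'LR98 4TS', np.nan],
    ['MN0 4NU', 'RF5 5FG', 'WA3 0MN', 'EA15 8RE', 'BE1 4RE'],
    ['SB7 0ET', 'SA7 0SB', 'BT7 6NS', 'TA9 0LP', np.nan],
])

# Flatten the 2-D table into a single 1-D array
flat = df1.to_numpy().ravel()

# Keep only non-missing entries, convert to a string dtype, then sort
postcodes = np.sort(flat[~pd.isna(flat)].astype(str))

print(postcodes)
```

Because the missing values are dropped before `np.sort` runs, only strings are ever compared with each other.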