I have a Python 2D list, as shown below:
[['Xzavier Kaska', 1.04], ['Brent Barnaby', 1.13], ['Alena Holoien', 1.37],
['Sam Surey', 1.37], ['Kash Nocella', 1.55], ['Ezequiel Gerraughty', 1.57],
['Myah Linsley', 1.74], ['Jaelynn Dzur', 1.79], ['Alfredo Andrew', 1.83],
['Skylar Movius', 1.95], ['Raphael Nocella', 2.14], ['Alondra Wallace', 2.2],
['Clark Loomis', 2.3], ['Skylar Cvek', 2.36], ['Carson Racugno', 2.52],
['Kathy Viveros-aguilera', 2.62], ['Heaven Barnaby', 2.75],
['Rebekah\tSpartichino', 3.24], ['Semaj Abernathy', 3.35], ['Rylee Dalton', 3.38],
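['Sterling Grove', 3.46], ['Rebekah Ghosh', 3.85]]
Index 0 holds the student's name and index 1 holds that student's GPA.
I want to group the pairs in the data above by GPA in fixed increments. For example, grouping by increments of 1.0 puts the students with a GPA of 0.0-1.0 together. The groupings I have in mind are: [0.0-0.1), [0.1-0.2), ... [3.9-4.0]; [0.0-0.5), [0.5-1.0), ... [3.5-4]; [0-1), [1,2), [2,3), [3,4).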
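The interval notation above can be read as a floor-division rule: with step s, a GPA g lands in bin number floor(g / s), which covers [k*s, (k+1)*s). Below is a minimal sketch of that rule, not a definitive implementation. The names `group_by_step` and `max_gpa` are hypothetical, the sample list is a subset of the data above, and the sketch assumes the step divides the top of the scale (4.0) evenly so that a GPA of exactly 4.0 can be folded into the last bin.

```python
from collections import defaultdict
from decimal import Decimal

# A few rows taken from the question's data.
students = [['Xzavier Kaska', 1.04], ['Kash Nocella', 1.55],
            ['Raphael Nocella', 2.14], ['Rebekah Ghosh', 3.85]]

def group_by_step(pairs, step, max_gpa=4.0):
    """Group [name, gpa] pairs into bins [k*step, (k+1)*step).

    Hypothetical helper: max_gpa is an assumed top of the scale, and a GPA
    equal to it is placed in the last bin so that bin is closed on the right.
    """
    step_d = Decimal(str(step))
    last_bin = int(Decimal(str(max_gpa)) // step_d) - 1
    groups = defaultdict(list)
    for name, gpa in pairs:
        # Decimal avoids float surprises such as 0.3 // 0.1 == 2.0
        k = min(int(Decimal(str(gpa)) // step_d), last_bin)
        groups[k].append([name, gpa])
    return dict(groups)

for k, members in sorted(group_by_step(students, 0.5).items()):
    low = Decimal('0.5') * k
    print(f"[{low}, {low + Decimal('0.5')}): {[name for name, _ in members]}")
```

Calling `group_by_step(students, 1.0)` or `group_by_step(students, 0.1)` should give the other two groupings listed above; the dictionary keys are bin numbers, so bin k starts at k times the step.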
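Answered 2020-03-28 16:03:44
You can create a dictionary whose keys identify GPA ranges: for example, key 1 for the range 0-1, key 5 for the range 4-5, and so on. We can then write a function that groups the students with a fixed pace: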
import math
l = [['Xzavier Kaska', 1.04], ['Brent Barnaby', 1.13], ['Alena Holoien', 1.37],
['Sam Surey', 1.37], ['Kash Nocella', 1.55], ['Ezequiel Gerraughty', 1.57],
['Myah Linsley', 1.74], ['Jaelynn Dzur', 1.79], ['Alfredo Andrew', 1.83],
['Skylar Movius', 1.95], ['Raphael Nocella', 2.14], ['Alondra Wallace', 2.2],
['Clark Loomis', 2.3], ['Skylar Cvek', 2.36], ['Carson Racugno', 2.52],
['Kathy Viveros-aguilera', 2.62], ['Heaven Barnaby', 2.75],
['Rebekah\tSpartichino', 3.24], ['Semaj Abernathy', 3.35], ['Rylee Dalton', 3.38],
['Sterling Grove', 3.46], ['Rebekah Ghosh', 3.85]]
def group(l: list):
    """
    Group students by their GPA with a pace of 1.
    :param l: input list of [student, gpa] pairs
    :return: dictionary where each key identifies the range (key-1, key]
    ex. to get the students with 1 < gpa <= 2, use d[2]
    """
    d = {}
    for student, gpa in l:
        # ceil maps a GPA in (key-1, key] to key; a GPA of 0 goes into the first group
        key = max(math.ceil(gpa), 1)
        d.setdefault(key, []).append([student, gpa])
    return d

d = group(l)
# checking:
for i in d:
    print(f'range : {i-1} to {i}, list : {d[i]}')
# output will be:
# range : 1 to 2, list : [['Xzavier Kaska', 1.04], ['Brent Barnaby', 1.13], ['Alena Holoien', 1.37], ['Sam Surey', 1.37], ['Kash Nocella', 1.55], ['Ezequiel Gerraughty', 1.57], ['Myah Linsley', 1.74], ['Jaelynn Dzur', 1.79], ['Alfredo Andrew', 1.83], ['Skylar Movius', 1.95]]
# range : 2 to 3, list : [['Raphael Nocella', 2.14], ['Alondra Wallace', 2.2], ['Clark Loomis', 2.3], ['Skylar Cvek', 2.36], ['Carson Racugno', 2.52], ['Kathy Viveros-aguilera', 2.62], ['Heaven Barnaby', 2.75]]
# range : 3 to 4, list : [['Rebekah\tSpartichino', 3.24], ['Semaj Abernathy', 3.35], ['Rylee Dalton', 3.38], ['Sterling Grove', 3.46], ['Rebekah Ghosh', 3.85]]
If the group is not empty, you can use the dict to check the number of students in a given group:
if 4 in d:
    print(len(d[4]))
else:
    print('No students in such a group')
Answered 2020-03-28 16:25:07
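You can use pandas, a library widely used by data scientists in Python. There is plenty of online support for it, so you should rarely get stuck.
For your specific problem: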
import pandas as pd
l = [['Xzavier Kaska', 1.04], ['Brent Barnaby', 1.13], ['Alena Holoien', 1.37],
['Sam Surey', 1.37], ['Kash Nocella', 1.55], ['Ezequiel Gerraughty', 1.57],
['Myah Linsley', 1.74], ['Jaelynn Dzur', 1.79], ['Alfredo Andrew', 1.83],
['Skylar Movius', 1.95], ['Raphael Nocella', 2.14], ['Alondra Wallace', 2.2],
['Clark Loomis', 2.3], ['Skylar Cvek', 2.36], ['Carson Racugno', 2.52],
['Kathy Viveros-aguilera', 2.62], ['Heaven Barnaby', 2.75],
['Rebekah\tSpartichino', 3.24], ['Semaj Abernathy', 3.35], ['Rylee Dalton', 3.38],
['Sterling Grove', 3.46], ['Rebekah Ghosh', 3.85]]
# Create a dataframe with your data.
df = pd.DataFrame(l, columns=['Name','GPA'])
# select the portion of the dataframe in which 1 <= GPA < 2
# (you can set your own parameters here)
df2 = df.loc[(df['GPA'] >= 1) & (df['GPA'] < 2)]
print(df2)
Output:
                  Name   GPA
0        Xzavier Kaska  1.04
1        Brent Barnaby  1.13
2        Alena Holoien  1.37
3            Sam Surey  1.37
4         Kash Nocella  1.55
5  Ezequiel Gerraughty  1.57
6         Myah Linsley  1.74
7         Jaelynn Dzur  1.79
8       Alfredo Andrew  1.83
9        Skylar Movius  1.95
If you want to get back a list similar to the one you have:
list_1to2 = list(df2['Name'])
list_1to2.append('1-2')
print(list_1to2)
# repeat for each group...
Output:
['Xzavier Kaska', 'Brent Barnaby', 'Alena Holoien', 'Sam Surey', 'Kash Nocella', 'Ezequiel Gerraughty', 'Myah Linsley', 'Jaelynn Dzur', 'Alfredo Andrew', 'Skylar Movius', '1-2']
Answered 2020-03-28 16:12:43
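I suggest using pandas to group and manipulate the data:
import pandas as pd
step = 0.5
# gpas is the 2D list of [name, gpa] pairs from the question
df = pd.DataFrame(gpas, columns=['name', 'gpa'])
df['group'] = df['gpa'].apply(lambda x: int(float(x) // step))
After that, you can create your group labels:
df['group_label'] = df['group'].apply(lambda x: '{}-{}'.format(x*step, (x+1)*step))
The result looks like this: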
df.head()
            name   gpa  group group_label
0  Xzavier Kaska  1.04      2     1.0-1.5
1  Brent Barnaby  1.13      2     1.0-1.5
2  Alena Holoien  1.37      2     1.0-1.5
3      Sam Surey  1.37      2     1.0-1.5
4   Kash Nocella  1.55      3     1.5-2.0
https://stackoverflow.com/questions/60898364
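As a complement to the pandas answers in this thread, `pd.cut` can assign each GPA to a half-open bin in one call instead of filtering or floor-dividing by hand. This is only a sketch under stated assumptions: the scale is taken to run from 0.0 to 4.0, the step is assumed to divide 4.0 evenly, and the four-row `gpas` list is a subset of the question's data.

```python
import numpy as np
import pandas as pd

# A few rows taken from the question's data.
gpas = [['Xzavier Kaska', 1.04], ['Kash Nocella', 1.55],
        ['Raphael Nocella', 2.14], ['Rebekah Ghosh', 3.85]]
df = pd.DataFrame(gpas, columns=['name', 'gpa'])

step = 0.5
# Bin edges 0.0, 0.5, ..., 4.0 (assumed scale); linspace avoids accumulated float error.
edges = np.linspace(0.0, 4.0, int(round(4.0 / step)) + 1)

# right=False makes every bin closed on the left and open on the right:
# [0.0, 0.5), [0.5, 1.0), ..., [3.5, 4.0)
df['bin'] = pd.cut(df['gpa'], bins=edges, right=False)

# observed=True skips bins that contain no students.
for interval, members in df.groupby('bin', observed=True):
    print(interval, list(members['name']))
```

With `right=False` the last bin is [3.5, 4.0), so a GPA of exactly 4.0 is not covered by any bin and would need separate handling if the top bin should be closed.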