Suppose we have two lists, Purchase and Product:
Purchase = [
    ['James', 'Shoes', 1],
    ['James', 'T-shirt', 3],
    ['James', 'Pants', 2],
    ['James', 'Jacket', 1],
    ['James', 'Bag', 1],
    ['Neil', 'Shoes', 2],
    ['Neil', 'Bag', 1],
    ['Neil', 'Jacket', 1],
    ['Neil', 'Pants', 1],
    ['Chris', 'Hats', 1],
    ['Chris', 'T-shirt', 2],
    ['Chris', 'Shoes', 1],
    ['Chris', 'Pants', 2],
]
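Product = [
    ['T-shirt', 110],
    ['Pants', 150],
    ['Shoes', 200],
    ['Hats', 150],
    ['Jacket', 250],
    ['Bag', 230],
]
In Purchase, the first element of each entry is the buyer's name, the second is the product they bought, and the last is how many they bought. Product holds each product's name and price.
What I want to do is build a new list that, for each buyer, ranks the products by the amount spent on them (price times quantity purchased), sorted from highest to lowest, keeping only the top 3. If a product was not purchased, its price is multiplied by zero. To make this easier to understand, here is the calculation: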
For 'James':
T-shirt -> 110*3 = 330
Pants -> 150*2 = 300
Shoes -> 200*1 = 200
Hats -> 150*0 = 0
Jacket -> 250*1 = 250
Bag -> 230*1 = 230
From expensive to cheap: ['T-shirt', 'Pants', 'Jacket', 'Bag', 'Shoes', 'Hats']
For 'Neil':
T-shirt -> 110*0 = 0
Pants -> 150*1 = 150
Shoes -> 200*2 = 400
Hats -> 150*0 = 0
Jacket -> 250*1 = 250
Bag -> 230*1 = 230
From expensive to cheap: ['Shoes', 'Jacket', 'Bag', 'Pants', 'T-shirt', 'Hats']
For 'Chris':
T-shirt -> 110*2 = 220
Pants -> 150*2 = 300
Shoes -> 200*1 = 200
Hats -> 150*1 = 150
Jacket -> 250*0 = 0
Bag -> 230*0 = 0
From expensive to cheap: ['Pants', 'T-shirt', 'Shoes', 'Hats', 'Jacket', 'Bag']
Finally, this is what I expect:
Result = [
    ['James', 'T-shirt', 'Pants', 'Jacket'],
    ['Neil', 'Shoes', 'Jacket', 'Bag'],
    ['Chris', 'Pants', 'T-shirt', 'Shoes']]
Any help is appreciated.
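The arithmetic described above can be sketched in plain Python. This is only an illustration of the calculation (price times quantity, with zero for products a buyer never purchased), not a definitive solution. The helper name spend_table is hypothetical, and the sketch assumes Python 3.7+ so that dictionaries keep the buyers in order of first appearance.

```python
Purchase = [
    ['James', 'Shoes', 1], ['James', 'T-shirt', 3], ['James', 'Pants', 2],
    ['James', 'Jacket', 1], ['James', 'Bag', 1],
    ['Neil', 'Shoes', 2], ['Neil', 'Bag', 1], ['Neil', 'Jacket', 1],
    ['Neil', 'Pants', 1],
    ['Chris', 'Hats', 1], ['Chris', 'T-shirt', 2], ['Chris', 'Shoes', 1],
    ['Chris', 'Pants', 2],
]
Product = [
    ['T-shirt', 110], ['Pants', 150], ['Shoes', 200],
    ['Hats', 150], ['Jacket', 250], ['Bag', 230],
]


def spend_table(purchases, products):
    """Return {buyer: {product: price * quantity}} (hypothetical helper)."""
    prices = dict(products)
    table = {}
    for buyer, product, count in purchases:
        # Every buyer starts with a zero entry for each known product,
        # which is the "multiplied by zero" case from the question.
        row = table.setdefault(buyer, {name: 0 for name in prices})
        row[product] += prices[product] * count
    return table


table = spend_table(Purchase, Product)

# Rank each buyer's products by amount spent and keep the top 3.
top3 = [
    [buyer] + sorted(row, key=row.get, reverse=True)[:3]
    for buyer, row in table.items()
]
print(top3)
```

With the sample data, table['James'] holds 330 for 'T-shirt' and 0 for 'Hats', and top3 matches the expected Result shown above.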
Posted on 2018-10-01 02:40:23
You can use the following list comprehension with itertools.groupby:
from itertools import groupby
from operator import itemgetter
Result = [
    [k, *map(itemgetter(1), sorted((-p[i] * c, i) for _, i, c in g)[:3])]
    for p in (dict(Product),)
    for k, g in groupby(Purchase, key=itemgetter(0))
]
With the sample input, Result becomes:
[['James', 'T-shirt', 'Pants', 'Jacket'], ['Neil', 'Shoes', 'Jacket', 'Bag'], ['Chris', 'Pants', 'T-shirt', 'Shoes']]
The list comprehension above is just a more concise version of the following equivalent code:
# convert the product pricing into a product-to-price dict for efficient lookup
pricing = dict(Product)
Result = []
# extract the groupings in Purchase based on the first item, the customer's name
# (groupby only groups consecutive rows, which holds for the sample Purchase list)
for name, purchases in groupby(Purchase, key=itemgetter(0)):
    costs = []
    # for each of a customer's purchases, we calculate the cost by multiplying
    # the product's pricing by the number purchased, and put the calculated cost
    # and product name in a tuple so that it can be sorted by the cost first and
    # then the product name second; the cost is negated so that it sorts
    # in descending order
    for _, product, count in purchases:
        costs.append((-pricing[product] * count, product))
    costs.sort()
    # initialize the sub-list in the output, which starts with the customer's name
    top_products = [name]
    # followed by the top 3 products from the second item in the sorted costs list
    for _, product in costs[:3]:
        top_products.append(product)
    # we've got a finished sub-list to output for the current customer
    Result.append(top_products)
Posted on 2018-10-01 02:41:50
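There are many ways to do this, but here is the first that comes to mind. I think a plainer approach is easier to understand and maintain than a long list comprehension (although the other answer currently posted is clever and short).
First, you seem to want to preserve the order of the names. I think a dictionary is a natural way to handle this kind of association, so to keep the ordering I would personally use an ordered dictionary. In addition, Product is easier to work with when you can look things up by key in a key-value mapping. So we do the following: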
from collections import OrderedDict
Product_kv = dict(Product)
From there, we iterate over all the purchases and keep a mapping of how much was spent on each item.
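d = OrderedDict()
for person, item, n in Purchase:
    if person not in d:
        d[person] = {}
    if item not in d[person]:
        d[person][item] = 0
    d[person][item] += n*Product_kv[item]
If you have negative counts or prices, this isn't necessarily the right solution. As requested, we can account for the multiplication by 0 without much fanfare: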
for person in d:
    for item in Product_kv:
        if item not in d[person]:
            d[person][item] = 0
All that remains is to use the precomputed spending totals to extract the sorted data you want.
[[name] + sorted(d[name], key=lambda s: d[name][s], reverse=True)[:3] for name in d]
Posted on 2018-10-01 02:58:04
A pure Python approach will involve dictionaries and explicit iteration. If you are happy to use a third-party library, you can use Pandas:
import pandas as pd
# construct dataframe and series mapping
purchases = pd.DataFrame(Purchase)
products = pd.DataFrame(Product).set_index(0)[1]
# calculate value and sort
df = purchases.assign(value=purchases[2]*purchases[1].map(products))\
              .sort_values('value', ascending=False)
# create dictionary or list result
res1 = {k: v[1].iloc[:3].tolist() for k, v in df.groupby(0, sort=False)}
res2 = [[k] + v[1].iloc[:3].tolist() for k, v in df.groupby(0, sort=False)]
Result:
print(res1)
{'Neil': ['Shoes', 'Jacket', 'Bag'],
 'James': ['T-shirt', 'Pants', 'Jacket'],
 'Chris': ['Pants', 'T-shirt', 'Shoes']}
print(res2)
[['Neil', 'Shoes', 'Jacket', 'Bag'],
 ['James', 'T-shirt', 'Pants', 'Jacket'],
 ['Chris', 'Pants', 'T-shirt', 'Shoes']]
https://stackoverflow.com/questions/52583876