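Compare a list of data with a CSV file and sort the matches

Stack Overflow user
Asked on 2018-04-13 06:36:24
Answers: 1 · Views: 266 · Followers: 0 · Votes: 1

I have a list of product names and a list of brands. I need to find how many branded products are in my list.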

**Brands sample :** ['HM International', 'Sara', 'Wildcraft', 'Nike']
**Product name sample :** [Attache backpack11Green Waterproof Backpack
Simba BTSPOKEMON POKÈMON POKÈ BALLS 18 BP Waterproof S...
HM International HMHTPB 24304MK Waterproof Multipurpos...
Chris & Kate CKB_122SS Waterproof School Bag
Simba BTSPRINCESS FOLLOW YOUR DREAMS 16 BP Waterproof ...
Kuber Industries School Bag, Backpack Waterproof School...
Minnie Trio School Bag Waterproof School Bag
Thomas School Bag Waterproof School Bag
Sara Green 002 Shoulder Bag
Disney Frozen Anna & Elsa Pink Sequins 16' ' Backpack
Disney Princess Pink Flap 18' ' Backpack
My Baby Excel Peppa Side Sling Bag Sling Bag
Ranger Black School Bag with laptop compartment Waterpr...
HM International HMHTPB 73279AV Waterproof Multipurpos...
Peppa Peppa Pig Pink Plush Toy Wallet Round Shape Plush...
Disney Frozen Anna & Elsa Pink Sequins 14' ' Backpack
Disney Frozen Magic Blue 16' ' School Bag
Good Friends stylish Waterproof School Bag
ZEVORA Pink 3D Design Children Travel & School Bag, 1 L...
Gleam A103 School Bag
SARA BAGS TG15 Waterproof Backpack
Despicable Me Favourite Subject School Bag 16 inches Tr...
AARIP LTB037 Waterproof School Bag
Simba BTSSMURFS FOOTBALL 18 BP Waterproof School Bag
Gleam JB0402C Waterproof School Bag
Simba BTSSMURFS SMURFETTE SINGING STAR 18 BP Waterproo... ]

1 Answer

Stack Overflow user

Accepted answer

Answered on 2018-04-13 07:03:35

I suggest using str.findall with a word-boundary regex to search for multiple values, then flattening the nested lists and counting the matches with Counter:

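import re
from collections import Counter

Brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']
# Group the alternation so \b applies to every brand, and escape regex metacharacters.
pat = r'\b(?:{})\b'.format('|'.join(map(re.escape, Brands)))

# df is the DataFrame built by the setup code at the end of this answer.
# findall returns one list of matches per row; flatten them and count each brand.
d = Counter([y for x in df['Product'].str.findall(pat) for y in x])
print(d)

Counter({'HM International': 2, 'Sara': 1})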
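The pattern construction is the part most likely to go wrong, so here is a minimal standard-library sketch of it, without pandas. The helper name `build_pattern`, the shortened `products` list, and the `'Saraswati Bags'` test string are illustrative assumptions, not part of the original answer. It shows why the alternation is wrapped in a non-capturing group, and one possible way to also count upper-case spellings such as `SARA`.

```python
import re
from collections import Counter

brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']

# A few names taken from the question's sample (shortened list for illustration).
products = [
    'HM International HMHTPB 24304MK Waterproof Multipurpos...',
    'Sara Green 002 Shoulder Bag',
    'SARA BAGS TG15 Waterproof Backpack',
    'HM International HMHTPB 73279AV Waterproof Multipurpos...',
    'Gleam A103 School Bag',
]

def build_pattern(names):
    """Hypothetical helper: escape each name and group the alternation,
    so that both word boundaries apply to every alternative."""
    return r'\b(?:{})\b'.format('|'.join(re.escape(n) for n in names))

pat = build_pattern(brands)

# Without the group, \b binds only to the first and last alternative,
# so 'Sara' would also match inside a longer word.
naive = r'\b{}\b'.format('|'.join(brands))
naive_hits = re.findall(naive, 'Saraswati Bags')    # finds 'Sara'
grouped_hits = re.findall(pat, 'Saraswati Bags')    # finds nothing

# Case-sensitive count, as in the answer.
counts = Counter(m for p in products for m in re.findall(pat, p))

# Case-insensitive variant: map every match back to the canonical brand spelling.
canonical = {b.lower(): b for b in brands}
counts_ci = Counter(
    canonical[m.lower()]
    for p in products
    for m in re.findall(pat, p, flags=re.IGNORECASE)
)

print(counts)
print(counts_ci)
```

Whether `SARA BAGS` should count as the brand `Sara` depends on the data, so the case-insensitive variant is an option rather than a recommendation.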

Or, if you want the output as a Series, use Series.value_counts:

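import numpy as np
import pandas as pd

# Flatten the per-row match lists into one array, then count each brand.
# value_counts returns the counts sorted in descending order.
s = pd.Series(np.concatenate(df['Product'].str.findall(pat))).value_counts()
print(s)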
HM International    2
Sara                1
dtype: int64
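If pandas is not needed for the final report, a similar sorted result can be produced from the Counter directly. The sketch below starts from the counts shown above and also lists brands with zero matches, which neither Counter nor value_counts reports on its own; the tie-breaking rule (alphabetical) and the variable names are assumptions made for illustration.

```python
from collections import Counter

brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']

# Counts as produced by the Counter example in this answer.
counts = Counter({'HM International': 2, 'Sara': 1})

# most_common() sorts by count, highest first, similar to value_counts().
ranked = counts.most_common()

# Full report: include brands that never matched, sorted by count
# descending and then by name (the tie-break is an assumed choice).
report = sorted(
    ((b, counts[b]) for b in brands),   # a Counter returns 0 for missing keys
    key=lambda item: (-item[1], item[0]),
)

for brand, n in report:
    print('{:<20}{}'.format(brand, n))
```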

Setup

import pandas as pd
d = {'Product': ['Attache backpack11Green Waterproof Backpack', 'Simba BTSPOKEMON POKÈMON POKÈ BALLS 18 BP Waterproof S...', 'HM International HMHTPB 24304MK Waterproof Multipurpos...', 'Chris & Kate CKB_122SS Waterproof School Bag', 'Simba BTSPRINCESS FOLLOW YOUR DREAMS 16 BP Waterproof ...', 'Kuber Industries School Bag, Backpack Waterproof School...', 'Minnie Trio School Bag Waterproof School Bag', 'Thomas School Bag Waterproof School Bag', 'Sara Green 002 Shoulder Bag', "Disney Frozen Anna & Elsa Pink Sequins 16' ' Backpack", "Disney Princess Pink Flap 18' ' Backpack", 'My Baby Excel Peppa Side Sling Bag Sling Bag', 'Ranger Black School Bag with laptop compartment Waterpr...', 'HM International HMHTPB 73279AV Waterproof Multipurpos...', 'Peppa Peppa Pig Pink Plush Toy Wallet Round Shape Plush...', "Disney Frozen Anna & Elsa Pink Sequins 14' ' Backpack", "Disney Frozen Magic Blue 16' ' School Bag", 'Good Friends stylish Waterproof School Bag', 'ZEVORA Pink 3D Design Children Travel & School Bag, 1 L...', 'Gleam A103 School Bag', 'SARA BAGS TG15 Waterproof Backpack', 'Despicable Me Favourite Subject School Bag 16 inches Tr...', 'AARIP LTB037 Waterproof School Bag', 'Simba BTSSMURFS FOOTBALL 18 BP Waterproof School Bag', 'Gleam JB0402C Waterproof School Bag', 'Simba BTSSMURFS SMURFETTE SINGING STAR 18 BP Waterproo']}
df = pd.DataFrame(d)
print (df.head())
                                             Product
0        Attache backpack11Green Waterproof Backpack
1  Simba BTSPOKEMON POKÈMON POKÈ BALLS 18 BP Wate...
2  HM International HMHTPB 24304MK Waterproof Mul...
3       Chris & Kate CKB_122SS Waterproof School Bag
4  Simba BTSPRINCESS FOLLOW YOUR DREAMS 16 BP Wat...
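The question title mentions a CSV file, while the setup above builds the DataFrame from a literal dict. As a rough sketch of the CSV case using only the standard library, the example below reads product names from CSV text and keeps the rows that mention a brand. The column name `Product` and the inline CSV text are assumptions; the real file's layout is not shown in the question.

```python
import csv
import io
import re

brands = ['HM International', 'Sara', 'Wildcraft', 'Nike']
pat = re.compile(r'\b(?:{})\b'.format('|'.join(map(re.escape, brands))))

# Stand-in for the real file; names that contain commas must be quoted.
csv_text = '''Product
HM International HMHTPB 24304MK Waterproof Multipurpos...
"Kuber Industries School Bag, Backpack Waterproof School..."
Sara Green 002 Shoulder Bag
Gleam A103 School Bag
'''

# For a file on disk, open(path, newline='') would replace io.StringIO.
reader = csv.DictReader(io.StringIO(csv_text))
names = [row['Product'] for row in reader]

# Keep only the products whose name contains at least one brand.
branded = [n for n in names if pat.search(n)]

print(len(branded), 'of', len(names), 'products mention a listed brand')
```

This counts branded products (rows), whereas the Counter approach counts brand occurrences; the two differ if one name mentions several brands.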
Votes: 0

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/49810707
