如果我有一个2000年的数据框架,其中品牌有142个独特的价值,我想计数每一个独特值的频率,从1到142,值应该动态变化。
brand=clothes_z.brand_name
brand.describe(include="all")
unique_brand=brand.unique()
brand.describe(include="all"),unique_brand输出:
(count 2613
unique 142
top Mango
freq 54
Name: brand_name, dtype: object,
array(['Jack & Jones', 'TOM TAILOR DENIM', 'YOURTURN', 'Tommy Jeans',
'Alessandro Zavetti', 'adidas Originals', 'Volcom', 'Pier One',
'Superdry', 'G-Star', 'SIKSILK', 'Tommy Hilfiger', 'Karl Kani',
'Alpha Industries', 'Farah', 'Nike Sportswear',
'Calvin Klein Jeans', 'Champion', 'Hollister Co.', 'PULL&BEAR',
'Nike Performance', 'Even&Odd', 'Stradivarius', 'Mango',
'Champion Reverse Weave', 'Massimo Dutti', 'Selected Femme Petite',
'NAF NAF', 'YAS', 'New Look', 'Missguided', 'Miss Selfridge',
'Topshop', 'Miss Selfridge Petite', 'Guess', 'Esprit Collection',
'Vero Moda', 'ONLY Petite', 'Selected Femme', 'ONLY', 'Dr.Denim',
'Bershka', 'Vero Moda Petite', 'PULL & BEAR', 'New Look Petite',
'JDY', 'Even & Odd', 'Vila', 'Lacoste', 'PS Paul Smith',
'Redefined Rebel', 'Selected Homme', 'BOSS', 'Brave Soul', 'Mind',
'Scotch & Soda', 'Only & Sons', 'The North Face',
'Polo Ralph Lauren', 'Gym King', 'Selected Woman', 'Rich & Royal',
'Rooms', 'Glamorous', 'Club L London', 'Zalando Essentials',
'edc by Esprit', 'OYSHO', 'Oasis', 'Gina Tricot',
'Glamorous Petite', 'Cortefiel', 'Missguided Petite',
'Missguided Tall', 'River Island', 'INDICODE JEANS',
'Kings Will Dream', 'Topman', 'Esprit', 'Diesel', 'Key Largo',
'Mennace', 'Lee', "Levi's®", 'adidas Performance', 'jordan',
'Jack & Jones PREMIUM', 'They', 'Springfield', 'Benetton', 'Fila',
'Replay', 'Original Penguin', 'Kronstadt', 'Vans', 'Jordan',
'Apart', 'New look', 'River island', 'Freequent', 'Mads Nørgaard',
'4th & Reckless', 'Morgan', 'Honey punch', 'Anna Field Petite',
'Noisy may', 'Pepe Jeans', 'Mavi', 'mint & berry', 'KIOMI', 'mbyM',
'Escada Sport', 'Lost Ink', 'More & More', 'Coffee', 'GANT',
'TWINTIP', 'MAMALICIOUS', 'Noisy May', 'Pieces', 'Rest',
'Anna Field', 'Pinko', 'Forever New', 'ICHI', 'Seafolly', 'Object',
'Freya', 'Wrangler', 'Cream', 'LTB', 'G-star', 'Dorothy Perkins',
'Carhartt WIP', 'Betty & Co', 'GAP', 'ONLY Tall', 'Next', 'HUGO',
'Violet by Mango', 'WEEKEND MaxMara', 'French Connection'],
dtype=object))因为它只显示芒果"54“的频率,因为它是最高频率,我想要每一个值的频率,比如Jack & Jones、TOM TAILOR DENIM和YOURTURN的频率等等。价值应该是动态变化的。
发布于 2019-11-14 07:33:58
你可以简单的做,
clothes_z.brand_name.value_counts()这将列出唯一的值,并将给你的频率的每一个元素在该潘达斯系列。
发布于 2019-11-14 07:12:24
from collections import Counter
ll = [...your list of brands...]
c = Counter(ll)
# you can do whatever you want with your counted values
df = pd.DataFrame.from_dict(c, orient='index', columns=['counted'])https://stackoverflow.com/questions/58851190
复制相似问题