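I converted a CSV into a list called a. I have a method that classifies the data by a condition, but it does not work. If any of my Cliente rows has 'Estable' in Analisis, every row gets the 'Stable' status, which is not what I need. For all clients that do not have 'Estable', such as AAA and BBB, I want 'NoAnalyzed' instead. The code below illustrates the problem.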
import pandas as pd
a = [['Cliente', 'Fecha', 'Variables', 'Dia Previo', 'Mayor/Menor', 'Dia a Analizar', 'Analisis'],
['AAA', '27/12/2017', 'ECPM_medio', '0.41', 'Dentro del Margen', '0.35', 'Incremento'],
['BBB', '27/12/2017', 'ECPM_medio', '1.06', 'Dentro del Margen', '1.06', 'Alerta'],
['CCC', '27/12/2017', 'ECPM_medio', '1.06', 'Dentro del Margen', '1.06', 'Estable']]
headers = a.pop(0)
df = pd.DataFrame(a, columns = headers)
df['Analisis']
for elemento in df['Analisis']:
    if elemento == 'Estable':
        df['Status'] = 'Stable: The client''s performance was Stable'
    else:
        df['Status'] = 'NoAnalyzed'
df1 = df.groupby(['Cliente', 'Fecha', 'Status']).size()
df1
output:
>>>
Cliente Fecha Status
AAA 27/12/2017 Stable: The clients performance was Stable 1
BBB 27/12/2017 Stable: The clients performance was Stable 1
CCC 27/12/2017 Stable: The clients performance was Stable 1
I need:
>>>
Cliente Fecha Status
AAA 27/12/2017 NoAnalyzed 1
BBB 27/12/2017 NoAnalyzed 1
CCC 27/12/2017 Stable: The clients performance was Stable 1

Posted on 2018-01-02 23:21:11
I believe you need numpy.where or map, because it is best to avoid loops in pandas, which are slow:
import numpy as np

mask = df['Analisis'] == 'Estable'
df['Status'] = np.where(mask, 'Stable: The client''s performance was Stable', 'NoAnalyzed')
Or, similarly:
d = {True: 'Stable: The client''s performance was Stable',False: 'NoAnalyzed'}
df['Status'] = mask.map(d)
print (df)
Cliente Fecha Variables Dia Previo Mayor/Menor \
0 AAA 27/12/2017 ECPM_medio 0.41 Dentro del Margen
1 BBB 27/12/2017 ECPM_medio 1.06 Dentro del Margen
2 CCC 27/12/2017 ECPM_medio 1.06 Dentro del Margen
Dia a Analizar Analisis Status
0 0.35 Incremento NoAnalyzed
1 1.06 Alerta NoAnalyzed
2 1.06 Estable Stable: The clients performance was Stable

Posted on 2018-01-02 23:01:39
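The problem is that you are assigning a single value directly to the column instead of a list/array/Series. That one value is replicated in every row, so each pass through the loop overwrites the whole column and only the last element decides the result. I suggest building a list and assigning it to the 'Status' column of your df.
status = []
for elemento in df['Analisis']:
    if elemento == 'Estable':
        status.append('Stable: The client''s performance was Stable')
    else:
        status.append('NoAnalyzed')
df['Status'] = status
This should work.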
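
To make the difference concrete, here is a minimal sketch that runs both variants side by side. The three-row frame and the shortened 'Stable' label are assumptions made for illustration; only the 'Analisis' values come from the question.

```python
import pandas as pd

# Small hypothetical frame mirroring the 'Analisis' column from the question.
df = pd.DataFrame({'Cliente': ['AAA', 'BBB', 'CCC'],
                   'Analisis': ['Incremento', 'Alerta', 'Estable']})

# Assigning a scalar fills every row of the column with that one value,
# so each iteration overwrites the result of the previous one.
for elemento in df['Analisis']:
    df['Status'] = 'Stable' if elemento == 'Estable' else 'NoAnalyzed'
broadcast_result = df['Status'].tolist()  # decided by the last row only

# Building one value per row and assigning the whole list keeps rows distinct.
df['Status'] = ['Stable' if e == 'Estable' else 'NoAnalyzed'
                for e in df['Analisis']]
per_row_result = df['Status'].tolist()

print(broadcast_result)
print(per_row_result)
```

With this data the first list holds the same label three times, because the last row is 'Estable', while the second list has one label per row.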
https://stackoverflow.com/questions/48062858