I'm trying to concatenate text along each row and then group by ID
I have a dataset that looks like this:
import pandas as pd

data = pd.DataFrame(data={'ID': ['1','1','2','2','2','3','3','3','3'],
                          'Text1': ['Apple','','','Laptop','','Pens','','Ruler',''],
                          'Text2': ['Bananas','Grape','Mouse','','DVD Player','','Pencils','',''],
                          'Text3': ['Cherry','','','Headphones','','','','','Eraser'],
                          'Text4': ['Mango','Strawberries','','','Cell phone','','Sticky Notes','','']
                          })
data = data.set_index('ID')
ID  Text1   Text2       Text3       Text4
1   Apple   Bananas     Cherry      Mango
1           Grape                   Strawberries
2           Mouse
2   Laptop              Headphones
2           DVD-Player              Cell-phone
3   Pens
3           Pencils                 Sticky Notes
3   Ruler
3                       Eraser

The output I want:

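Any ideas on how to achieve this output?
Posted on 2016-02-02 02:57:03
I'd suggest using some combination of DataFrame.groupby, DataFrame.apply, and str.join. Based on what you've provided, you could use the following. It is only an example.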
import pandas as pd
import re

data = pd.DataFrame(data={'ID': ['1','1','2','2','2','3','3','3','3'],
                          'Text1': ['Apple','','','Laptop','','Pens','','Ruler',''],
                          'Text2': ['Bananas','Grape','Mouse','','DVD Player','','Pencils','',''],
                          'Text3': ['Cherry','','','Headphones','','','','','Eraser'],
                          'Text4': ['Mango','Strawberries','','','Cell phone','','Sticky Notes','','']
                          })

# list of all columns that start with "Text"
cols = [x for x in data.columns if re.search("^Text", x)]

# function applied to each group: takes the group's rows (a DataFrame)
# and the list of columns to concatenate
def concat_text(group, cols):
    # The real work is done here: join the non-empty cells of each row,
    # then join the rows of the group
    return ";".join([";".join([str(x) for x in y if x]) for y in group[cols].values])

result = data.groupby("ID").apply(concat_text, cols)  # groupby and apply

This leaves you with:
ID
1 Apple;Bananas;Cherry;Mango;Grape;Strawberries
2 Mouse;Laptop;Headphones;DVD Player;Cell phone
3 Pens;Pencils;Sticky Notes;Ruler;Eraser
dtype: object

https://stackoverflow.com/questions/35144121
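One caveat with the example above: as far as I can tell, a row whose Text columns are all empty would contribute an empty string to the outer join and leave a stray ";" in the result. The sample data has no such row, so it does not show up here. A minimal sketch of a two-step variant that skips such rows is below; the names text_cols, join_row, and row_text are my own, chosen for illustration, and it assumes empty cells are empty strings as in the sample data.

```python
import pandas as pd

data = pd.DataFrame(data={
    'ID': ['1', '1', '2', '2', '2', '3', '3', '3', '3'],
    'Text1': ['Apple', '', '', 'Laptop', '', 'Pens', '', 'Ruler', ''],
    'Text2': ['Bananas', 'Grape', 'Mouse', '', 'DVD Player', '', 'Pencils', '', ''],
    'Text3': ['Cherry', '', '', 'Headphones', '', '', '', '', 'Eraser'],
    'Text4': ['Mango', 'Strawberries', '', '', 'Cell phone', '', 'Sticky Notes', '', ''],
})

# Columns to concatenate (hypothetical helper name)
text_cols = [c for c in data.columns if c.startswith('Text')]

def join_row(row):
    # Keep only the non-empty cells of one row, in column order
    return ';'.join(v for v in row if v)

# Step 1: collapse each row's Text columns into a single string
row_text = data[text_cols].apply(join_row, axis=1)

# Step 2: join the per-row strings within each ID,
# skipping rows that turned out to be entirely empty
result = row_text.groupby(data['ID']).agg(lambda s: ';'.join(v for v in s if v))

print(result)
```

On the sample data this should produce the same three strings as the groupby/apply version; the only intended difference is how all-empty rows are handled.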