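I have a df column that contains a paragraph of text, and I have created a list of keywords. I want to compare the keywords against the column text and return the words that match. Here is an example:
import pandas as pd
keywords = ['yellow', 'orange', 'purple', 'pink']
df = pd.DataFrame({'colours': ['my favourite colour is purple but sometimes pink', 'I have a yellow dinosaur', 'all flowers are red']})
I ran this code:
df['match_colours'] = df.apply(lambda x: True if any(word in x.colours for word in keywords) else False, axis=1)
It returns a column that is True if there is a match and False if there is not. I just need an additional column that specifies which words matched.
Thanks!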
Answered 2022-10-13 15:35:13
You can add the column with a list comprehension.
df['colour_res'] = [[i for i in keywords if i in row] for row in df.colours]
                                            colours  ...      colour_res
0  my favourite colour is purple but sometimes pink  ...  [purple, pink]
1                          I have a yellow dinosaur  ...        [yellow]
2                               all flowers are red  ...              []
[3 rows x 3 columns]

Answered 2022-10-13 15:51:18
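def custom_func(x):
    # Return the first keyword found in the text, or None if nothing matches.
    for i in keywords:
        if i in x:
            return i
    return None
# Use bracket assignment: df.col1 = ... sets an attribute, not a new column.
df['col1'] = df.colours.apply(custom_func)
Hope this works.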
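
A note on both answers: they use Python's `in` operator on strings, which matches substrings, so `pink` would also be found inside a word like `pinkish`. The function-based answer also returns only the first keyword it finds. If what is wanted is whole-word, case-insensitive matching with every match listed, a minimal sketch using pandas `Series.str.findall` might look like the following. The column name `matched_words` and the use of `\b` word boundaries are my own assumptions, not part of the original question or answers.

```python
import re
import pandas as pd

keywords = ['yellow', 'orange', 'purple', 'pink']
df = pd.DataFrame({'colours': [
    'my favourite colour is purple but sometimes pink',
    'I have a yellow dinosaur',
    'all flowers are red',
]})

# Build one pattern of the form \b(?:yellow|orange|purple|pink)\b.
# re.escape guards against keywords that contain regex metacharacters.
pattern = r'\b(?:' + '|'.join(re.escape(k) for k in keywords) + r')\b'

# Each row gets a list of every whole-word keyword found in its text.
# 'matched_words' is a hypothetical column name chosen for illustration.
df['matched_words'] = df['colours'].str.findall(pattern, flags=re.IGNORECASE)

# The boolean column from the question can be derived from the lists:
# an empty list is falsy, a non-empty list is truthy.
df['match_colours'] = df['matched_words'].map(bool)

print(df)
```

With this approach the lists follow the order in which the words appear in the text, and a colour mentioned twice in the same row would appear twice in its list.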
https://stackoverflow.com/questions/74058219