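I have a DataFrame with two columns, "Start_location" and "End_location". I want to create a new column called "Location" from these two columns, under the following conditions.
If "Start_location" == "End_location", then "Location" takes that shared value. Otherwise, if the values of "Start_location" and "End_location" differ, "Location" should be "Start_location"-"End_location", that is, the two values joined by a hyphen.
An example of what I want: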
+---+----------------+--------------+
|   | Start_location | End_location |
+---+----------------+--------------+
| 1 | Stratford      | Stratford    |
| 2 | Bromley        | Stratford    |
| 3 | Brighton       | Manchester   |
| 4 | Delaware       | Delaware     |
+---+----------------+--------------+

The result I want looks like this:
+---+----------------+--------------+---------------------+
|   | Start_location | End_location | Location            |
+---+----------------+--------------+---------------------+
| 1 | Stratford      | Stratford    | Stratford           |
| 2 | Bromley        | Stratford    | Bromley-Stratford   |
| 3 | Brighton       | Manchester   | Brighton-Manchester |
| 4 | Delaware       | Delaware     | Delaware            |
+---+----------------+--------------+---------------------+

I would be glad if someone could help.
PS: Apologies if this is a very basic question. I have gone through some similar questions on this topic but made no progress.
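To make the example reproducible, the input table might be built as in the sketch below. The variable name df is an assumption chosen to match the answers, and the helper name same is purely illustrative. The 1-based index mirrors the row labels in the table.

```python
import pandas as pd

# Sample data copied from the question; index 1..4 mirrors the table's row labels.
df = pd.DataFrame(
    {
        "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
        "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
    },
    index=[1, 2, 3, 4],
)

# Boolean mask (hypothetical name) marking rows where both columns agree.
same = df["Start_location"] == df["End_location"]
print(same.tolist())  # [True, False, False, True]
```

Rows 1 and 4 are the ones where "Location" should be a single name; rows 2 and 3 need the hyphenated form.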
Posted on 2020-07-22 21:10:13
You can write your own function for this, then call it row by row through apply with a lambda function:
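def get_location(start, end):
    # Identical endpoints collapse to a single name
    if start == end:
        return start
    else:
        # Different endpoints are joined with a hyphen, as the question asks
        return start + '-' + end

# axis=1 passes each row to the lambda
df['Location'] = df.apply(lambda x: get_location(x.Start_location, x.End_location), axis=1)

Posted on 2020-07-22 21:10:16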
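# Column names must match the DataFrame's capitalization; .iloc selects by position within each row
df['Location'] = df[['Start_location', 'End_location']].apply(lambda x: x.iloc[0] if x.iloc[0] == x.iloc[1] else x.iloc[0] + '-' + x.iloc[1], axis=1)

Posted on 2020-07-22 21:15:15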
Use np.select(condition, choice). To join the start and end values, use the .str.cat() method.
import numpy as np
# Conditions are checked in order; the first one that holds selects the matching choice
condition = [df['Start_location'] == df['End_location'], df['Start_location'] != df['End_location']]
# sep='-' gives the hyphenated form requested in the question
choice = [df['Start_location'], df['Start_location'].str.cat(df['End_location'], sep='-')]
df['Location'] = np.select(condition, choice)
df

  Start_location End_location             Location
1      Stratford    Stratford            Stratford
2        Bromley    Stratford    Bromley-Stratford
3       Brighton   Manchester  Brighton-Manchester
4       Delaware     Delaware             Delaware

https://stackoverflow.com/questions/63034866
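Since there are only two cases here, np.where may be a slightly shorter vectorized alternative to np.select. The following is a minimal self-contained sketch, not something taken from the answers. It rebuilds the sample data from the question, and the helper names same and joined are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Sample data from the question (index 1..4 mirrors the table's row labels).
df = pd.DataFrame(
    {
        "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
        "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
    },
    index=[1, 2, 3, 4],
)

# True where both columns hold the same value.
same = df["Start_location"] == df["End_location"]

# Hyphen-joined form, computed for every row; only used where the values differ.
joined = df["Start_location"] + "-" + df["End_location"]

# np.where(cond, a, b) picks from a where cond is True, otherwise from b.
df["Location"] = np.where(same, df["Start_location"], joined)

print(df["Location"].tolist())
# ['Stratford', 'Bromley-Stratford', 'Brighton-Manchester', 'Delaware']
```

Compared with the apply-based answers, this avoids a Python-level call per row, which tends to matter only for large frames.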