Q: Creating a new column from two existing text columns in a DataFrame using pandas/python

Stack Overflow user

Asked 2020-07-22 21:01:59

Answers: 4, Views: 70, Followers: 0, Votes: 0

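I have a DataFrame with two columns, "Start_location" and "End_location". I want to create a new column called "Location" from these two columns, subject to the following condition.

If "Start_location" == "End_location", the value of "Location" should be that shared value. Otherwise, if the two values differ, "Location" should be "Start_location"-"End_location", that is, the two values joined by a hyphen.

An example of my data looks like this.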

+---+--------------------+-----------------------+
|   |  Start_location    |      End_location     |
+---+--------------------+-----------------------+
| 1 | Stratford          |      Stratford        |
| 2 | Bromley            |      Stratford        |
| 3 | Brighton           |      Manchester       |
| 4 | Delaware           |      Delaware         |
+---+--------------------+-----------------------+
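To make the example reproducible, the table above could be built roughly as follows. The variable name `df` matches what the answers use; the 1-based index is an assumption read off the printed table.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
        "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
    },
    index=[1, 2, 3, 4],  # assumed: the table shows rows numbered from 1
)
print(df)
```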

The result I want looks like this.

+---+--------------------+-----------------------+--------------------+
|   |  Start_location    |      End_location     |   Location         |
+---+--------------------+-----------------------+--------------------+
| 1 | Stratford          |      Stratford        |   Stratford        |
| 2 | Bromley            |      Stratford        | Bromley-Stratford  |
| 3 | Brighton           |      Manchester       | Brighton-Manchester|
| 4 | Delaware           |      Delaware         |    Delaware        |
+---+--------------------+-----------------------+--------------------+

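I'd be grateful if anyone could help.

PS: Apologies if this is a very basic question. I have looked through several similar questions on this topic but haven't made any progress.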

python

pandas

dataframe

for-loop

4 Answers

Stack Overflow user

Answered 2020-07-22 21:10:13

You can write your own function to do this, then use apply with a lambda function:

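def get_location(start, end):
    # Identical endpoints collapse to a single name.
    if start == end:
        return start
    # Otherwise join them with a hyphen, as in the desired output.
    return start + '-' + end

# axis=1 passes each row to the lambda.
df['Location'] = df.apply(lambda x: get_location(x.Start_location, x.End_location), axis=1)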
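Row-wise `apply` calls a Python function once per row, which can get slow on large frames. As an alternative sketch (not taken from this answer), the same rule can be written as a list comprehension over the two zipped columns. The sample frame is rebuilt here, with an assumed 1-based index, so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question (index assumed to start at 1).
df = pd.DataFrame(
    {
        "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
        "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
    },
    index=[1, 2, 3, 4],
)

# Walk both columns in lockstep and build one label per row.
df["Location"] = [
    start if start == end else f"{start}-{end}"
    for start, end in zip(df["Start_location"], df["End_location"])
]
print(df)
```

This assumes both columns hold plain strings with no missing values; a NaN in either column would need separate handling.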

Votes: 1

Stack Overflow user

Answered 2020-07-22 21:10:16

df['Location'] = df[['Start_location', 'End_location']].apply(lambda x: x.iloc[0] if x.iloc[0] == x.iloc[1] else x.iloc[0] + '-' + x.iloc[1], axis=1)

Votes: 1

Stack Overflow user

Answered 2020-07-22 21:15:15

Use np.select(condition, choice). To join the start and end values, use the .str.cat() method.

import numpy as np

# One condition per outcome: endpoints equal, endpoints different.
condition = [df['Start_location'] == df['End_location'], df['Start_location'] != df['End_location']]
# Matching choices: the shared name, or the two names joined with a hyphen.
choice = [df['Start_location'], df['Start_location'].str.cat(df['End_location'], sep='-')]
df['Location'] = np.select(condition, choice)

df

  Start_location End_location             Location
1      Stratford    Stratford            Stratford
2        Bromley    Stratford    Bromley-Stratford
3       Brighton   Manchester  Brighton-Manchester
4       Delaware     Delaware             Delaware
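Since the two conditions here are exact complements, a single boolean mask with `np.where` may be enough. The following is a sketch under that assumption rather than part of the answer; it rebuilds the question's sample frame (index assumed to start at 1) and uses the hyphen separator the question asks for.

```python
import numpy as np
import pandas as pd

# Sample data from the question (index assumed to start at 1).
df = pd.DataFrame(
    {
        "Start_location": ["Stratford", "Bromley", "Brighton", "Delaware"],
        "End_location": ["Stratford", "Stratford", "Manchester", "Delaware"],
    },
    index=[1, 2, 3, 4],
)

# True where both endpoints are the same place.
same = df["Start_location"] == df["End_location"]
# Element-wise concatenation of the two columns with a hyphen.
joined = df["Start_location"].str.cat(df["End_location"], sep="-")

# Take the single name where the mask is True, the joined name elsewhere.
df["Location"] = np.where(same, df["Start_location"], joined)
print(df)
```

`np.select` remains the more natural fit once there are three or more mutually exclusive cases.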

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/63034866
