
How to replace values in a Pandas column multiple times?

Stack Overflow user
Asked on 2019-02-11 08:50:25
Answers: 2, Views: 269, Followers: 0, Votes: 3

I have a DataFrame df1:

Questions                             Purpose
what is scientific name of <input>    scientific name
what is english name of <input>       english name

I also have two lists:

name1 = ['salt','water','sugar']
name2 = ['sodium chloride','dihydrogen monoxide','sucrose']

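I want to build a new DataFrame by replacing <input> with the values from one of the lists, depending on the Purpose.

If the Purpose is english name, replace <input> with each value in name2; otherwise replace <input> with each value in name1.

Expected output DataFrame: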

Questions                                   Purpose
what is scientific name of salt             scientific name
what is scientific name of water            scientific name
what is scientific name of sugar            scientific name
what is english name of sodium chloride     english name
what is english name of dihydrogen monoxide english name
what is english name of sucrose             english name

My attempt:

import pandas as pd

questions = []
purposes = []

# Walk every template row and emit one filled-in question per name.
for i, row in df1.iterrows():
    if row['Purpose'] == 'scientific name':
        for name in name1:
            ques = row['Questions'].replace('<input>', name)
            questions.append(ques)
            purposes.append(row['Purpose'])
    else:
        for name in name2:
            ques = row['Questions'].replace('<input>', name)
            questions.append(ques)
            purposes.append(row['Purpose'])

df = pd.DataFrame({'Questions': questions, 'Purpose': purposes})

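The code above produces the expected output, but it is too slow because my real DataFrame contains many questions. (I also have more than two purposes, but for now I am sticking with two.)

I am looking for a more efficient solution that gets rid of the for loops.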

2 Answers

Stack Overflow user

Accepted answer

Posted on 2019-02-11 09:04:53

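One way to do this is to iterate over Questions with a list comprehension and replace <input> with the corresponding name. To repeat each question once for every entry in its matching name list, you can use itertools.cycle: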

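from itertools import cycle

import pandas as pd

# df is the template DataFrame from the question (df1 there).
# names must list the name lists in the same order as the rows of df.
names = [name1, name2]
new = [[i.replace('<input>', j), purpose]
       for row, purpose, name in zip(df.Questions, df.Purpose, names)
       # cycle([row]) repeats the template once for each entry in name
       for i, j in zip(cycle([row]), name)]

pd.DataFrame(new, columns=df.columns)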

                                    Questions          Purpose
0              what is scientific name of salt  scientific name
1             what is scientific name of water  scientific name
2             what is scientific name of sugar  scientific name
3      what is english name of sodium chloride     english name
4  what is english name of dihydrogen monoxide     english name
5              what is english name of sucrose     english name
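
Note that the comprehension above pairs each row with a name list purely by position, so it relies on the rows of df appearing in the same order as names. Below is a minimal sketch of one way to key the lookup on the Purpose value instead, which may also extend more easily to additional purposes. It assumes pandas 0.25 or later for DataFrame.explode; the dictionary names_by_purpose and the helper column fill are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data taken from the question.
df1 = pd.DataFrame({
    'Questions': ['what is scientific name of <input>',
                  'what is english name of <input>'],
    'Purpose': ['scientific name', 'english name'],
})
name1 = ['salt', 'water', 'sugar']
name2 = ['sodium chloride', 'dihydrogen monoxide', 'sucrose']

# Hypothetical lookup table: Purpose value -> names to substitute.
names_by_purpose = {'scientific name': name1, 'english name': name2}

# Attach the matching list to every row, then explode to one row per name.
expanded = (df1.assign(fill=df1['Purpose'].map(names_by_purpose))
               .explode('fill')
               .reset_index(drop=True))

# Fill in the placeholder with plain str.replace, one question at a time.
expanded['Questions'] = [q.replace('<input>', n)
                         for q, n in zip(expanded['Questions'], expanded['fill'])]
result = expanded[['Questions', 'Purpose']]
print(result)
```

The list comprehension is still a Python-level loop, but it runs over plain strings rather than over iterrows, which is usually where most of the overhead in the question's version comes from.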
Votes: 2

Stack Overflow user

Posted on 2019-02-11 09:21:20

I did it with pd.concat() as shown below; you can give it a try:

names = name1 + name2
# DataFrame.append was removed in pandas 2.0, so stack everything with pd.concat.
df_new = pd.concat(
    [df.loc[df.Purpose.eq('scientific name')]] * len(name1)
    + [df.loc[df.Purpose.eq('english name')]] * len(name2),
    ignore_index=True)

for e, i in enumerate(names):
    # Assign through .loc[row, column] to avoid chained assignment.
    df_new.loc[e, 'Questions'] = df_new.loc[e, 'Questions'].replace('<input>', i)
print(df_new)

                                     Questions          Purpose
0              what is scientific name of salt  scientific name
1             what is scientific name of water  scientific name
2             what is scientific name of sugar  scientific name
3      what is english name of sodium chloride     english name
4  what is english name of dihydrogen monoxide     english name
5              what is english name of sucrose     english name
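
The same repeat-then-replace idea can be sketched without pd.concat or per-cell .loc assignment by repeating row labels with Index.repeat. This is an illustrative variant rather than the answerer's code: it assumes df1 has a unique index and that its rows are in the same order as name1, name2. The variable counts is a name introduced here.

```python
import pandas as pd

# Sample data taken from the question.
df1 = pd.DataFrame({
    'Questions': ['what is scientific name of <input>',
                  'what is english name of <input>'],
    'Purpose': ['scientific name', 'english name'],
})
name1 = ['salt', 'water', 'sugar']
name2 = ['sodium chloride', 'dihydrogen monoxide', 'sucrose']

names = name1 + name2
# Copies needed per row; assumes row order matches [name1, name2].
counts = [len(name1), len(name2)]

# Repeat each row label the required number of times, then select those rows.
df_rep = df1.loc[df1.index.repeat(counts)].reset_index(drop=True)

# Replace the placeholder in one pass over the column.
df_rep['Questions'] = [q.replace('<input>', n)
                       for q, n in zip(df_rep['Questions'], names)]
print(df_rep)
```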
Votes: 1

Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/54626794
