I have a column that contains state and country names:
Name Region Value_1 etc.
Apple Penn State 5641561
Apple Boston State 21515151
Apple United States 5545645
etc.
I want to remove the part of the string after the space (" "), but keep United States as it is.
For example:
Name Region Value_1 etc.
Apple Penn 5641561
Apple Boston 21515151
Apple United States 5545645
etc.
How can I do this? I am currently splitting with the following code:
df['Region'] = df['Region'].str.split(' ').str[0]
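A minimal reproduction of the problem, assuming the data sits in a pandas DataFrame named df built from the sample rows above (the name first_word is only illustrative). It shows that keeping the first word also truncates 'United States':

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'Name': ['Apple', 'Apple', 'Apple'],
    'Region': ['Penn State', 'Boston State', 'United States'],
    'Value_1': [5641561, 21515151, 5545645],
})

# Keeping only the first word is what the split-based code does;
# it shortens 'United States' as well, which is the unwanted part
first_word = df['Region'].str.split(' ').str[0]
print(first_word.tolist())  # ['Penn', 'Boston', 'United']
```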
Answered 2020-05-25 11:05:37
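You can use Series.str.replace to replace occurrences of a pattern in the Series with a replacement string. Pass regex=True, because recent pandas versions treat the pattern as a literal string by default:
df['Region'] = df['Region'].str.replace(r'\sState\b', '', regex=True)
Result: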
# print(df)
Name Region Value_1
0 Apple Penn 5641561
1 Apple Boston 21515151
2 Apple United States 5545645
Answered 2020-05-25 10:48:06
Try this:
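import pandas as pd

df = pd.DataFrame({'Name': ['Apple', 'Apple', 'Apple'], 'Region': ['Penn State', 'Boston State', 'United States']})
# Strip a trailing ' State' (including the space); 'United States' does not end with it and is left unchanged
df['Region'] = df['Region'].apply(lambda x: x[:-len(' State')] if x.endswith(' State') else x)
Output: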
Name Region
0 Apple Penn
1 Apple Boston
2 Apple United States
Answered 2020-05-25 11:29:55
An alternative using where() (here the pandas Series.where method, which works much like np.where()):
import pandas as pd

### Create DataFrame
df = pd.DataFrame({
'Name': ['Apple', 'Apple', 'Apple'],
'Region': ['Penn State', 'Boston State', 'United States'],
'Value_1': [5641561, 21515151, 554564]
})
### Using Series.where(): keep rows containing 'United States', otherwise keep only the first word
df['Region'] = df['Region'].where(df['Region'].str.contains('United States'),
df['Region'].str.split(" ").str[0])
### Output
print(df)
Name Region Value_1
0 Apple Penn 5641561
1 Apple Boston 21515151
2 Apple United States 554564
https://stackoverflow.com/questions/62000635
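For comparison, the same idea can be written with NumPy's np.where itself. This is a minimal sketch rather than the answerer's code: the mask name ends_with_state is illustrative, and it tests for a trailing ' State' instead of naming 'United States' explicitly, on the assumption that only values ending in the separate word "State" should be shortened.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Apple', 'Apple', 'Apple'],
    'Region': ['Penn State', 'Boston State', 'United States'],
    'Value_1': [5641561, 21515151, 554564],
})

# True only where the region ends with the separate word ' State'
ends_with_state = df['Region'].str.endswith(' State')

# Where the mask is True, drop the last word; otherwise keep the value unchanged
df['Region'] = np.where(ends_with_state,
                        df['Region'].str.rsplit(' ', n=1).str[0],
                        df['Region'])
print(df['Region'].tolist())  # ['Penn', 'Boston', 'United States']
```

Because the mask checks the suffix, other country names in the column would also pass through untouched without being listed one by one.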