I have a JSON object named "countries", shown below, which lists every country with its ISO codes:
countries = [{"name":"Afghanistan","alpha-2":"AF","country-code":"004"},{"name":"Åland Islands","alpha-2":"AX","country-code":"248"},{"name":"Albania","alpha-2":"AL","country-code":"008"},{"name":"Algeria","alpha-2":"DZ","country-code":"012"}]
I have a pandas DataFrame with a "Country" column:
Country
--------
Albania
Algeria
Algeria
I want to replace each 'name' value in the Country column with the matching 'alpha-2' value from the JSON object. The result should be:
Country
---------
AL
DZ
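DZ
I tried something like the following. It raises no error, but it does not change the values either.
df['Country'] = df['Country'].replace(lambda y: (x['alpha-2'] for x in countries) if y in (x['name'] for x in countries) else y)
Posted on 2019-02-01 17:09:57
Row-wise lambdas are discouraged in pandas, for the same reason that pd.Series.apply is not recommended. A better approach is to build a mapping dictionary and then use the vectorised pd.Series.map.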
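# setup dataframe (assumes `countries` is defined as in the question)
import pandas as pd
df = pd.DataFrame({'Country': ['Albania', 'Algeria', 'Algeria']})
# construct mapping dictionary: country name -> alpha-2 code
mapper = {dct['name']: dct['alpha-2'] for dct in countries}
# map names to codes; names missing from mapper become NaN, so fillna restores the original value
df['Country'] = df['Country'].map(mapper).fillna(df['Country'])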
print(df)
# Country
# 0 AL
# 1 DZ
# 2 DZ
Posted on 2019-02-01 17:16:16
You can do it like this: create a new dictionary following the pattern {country: country_code}, country_to_country_code = {v['name']: v['alpha-2'] for v in countries}, and then use this country_to_country_code dictionary to map() your Country column.
import pandas as pd
df = pd.DataFrame({"Country":["Albania", "Algeria", "Algeria"]})
countries = [{"name":"Afghanistan","alpha-2":"AF","country-code":"004"},{"name":"Åland Islands","alpha-2":"AX","country-code":"248"},{"name":"Albania","alpha-2":"AL","country-code":"008"},{"name":"Algeria","alpha-2":"DZ","country-code":"012"}]
country_to_country_code= {v['name']:v['alpha-2'] for v in countries}
df.loc[:, 'Country'] = df['Country'].map(country_to_country_code)
print(df)
Output
Country
0 AL
1 DZ
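2 DZ
Posted on 2019-02-01 17:03:24
You are accessing the column as df['Country'], so if your DataFrame also has the other field, such as the alpha-2 from the question, why not simply use df['Country'] = df['alpha-2']? It is faster than a lambda anyway.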
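The assignment above only applies if the DataFrame already has an 'alpha-2' column, which the question's DataFrame does not. As a rough sketch of one way to get there, the JSON list could be turned into a lookup frame and left-merged onto df. The names lookup and merged are hypothetical helpers introduced for illustration, and falling back to the original name for unmatched rows is an assumption, not something the question asks for.

```python
import pandas as pd

countries = [
    {"name": "Afghanistan", "alpha-2": "AF", "country-code": "004"},
    {"name": "Åland Islands", "alpha-2": "AX", "country-code": "248"},
    {"name": "Albania", "alpha-2": "AL", "country-code": "008"},
    {"name": "Algeria", "alpha-2": "DZ", "country-code": "012"},
]
df = pd.DataFrame({"Country": ["Albania", "Algeria", "Algeria"]})

# hypothetical lookup frame: one row per country, columns taken from the JSON keys
lookup = pd.DataFrame(countries)[["name", "alpha-2"]]

# left merge keeps every row of df, in its original order
merged = df.merge(lookup, how="left", left_on="Country", right_on="name")

# rows without a match get NaN in 'alpha-2'; fall back to the original name (assumption)
df["Country"] = merged["alpha-2"].fillna(merged["Country"])
print(df)
```

For a single column, the dictionary plus map approach from the other answers is likely simpler. A merge may be more convenient when several fields (for example 'country-code' as well) need to be attached at once.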
https://stackoverflow.com/questions/54483898