
Q: Series.map values when passing a Series to Series.map()

Stack Overflow user

Asked 2019-12-09 09:30:35

1 answer · 502 views · 0 followers · 1 vote

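I may be going about this the wrong way. I am looking for the postcodes of about 100 UK hospitals. I have a spreadsheet (all_all) listing all hospitals/clinics/etc. in the UK (about 14,000) with their addresses and postcodes.

For those 100 hospitals I have yearly surgical activity data (spine), in which the hospital names are repeated across 2,818 rows.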

spine.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2818 entries, 0 to 2817
Data columns (total 7 columns):
index_col       2818 non-null float64
fyear           2818 non-null int64
NNAPID          2818 non-null int64
mainspef        2818 non-null int64
Trust           2818 non-null object
complexcount    2818 non-null float64
simplecount     2818 non-null float64
dtypes: float64(3), int64(3), object(1)
memory usage: 154.2+ KB

I thought I could use pandas Series.map.

Importing the CSV that contains all 14,000 hospitals:

postcodes_all = pd.read_csv('all_all.csv')

postcodes_all.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14206 entries, 0 to 14205
Data columns (total 3 columns):
Unnamed: 0     14206 non-null int64
Trust_title    14206 non-null object
postcode       14206 non-null object
dtypes: int64(1), object(2)
memory usage: 333.1+ KB

In the UK, hospitals are Trusts, so in my data (spine) the hospital name column is Trust. I am trying to map it to the hospital entries in postcodes_all (Trust_title).

spine['Trust'].map(postcodes_all['Trust_title'])

0       NaN
1       NaN
2       NaN
3       NaN
4       NaN
       ... 
2813    NaN
2814    NaN
2815    NaN
2816    NaN
2817    NaN
Name: Trust, Length: 2818, dtype: object

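It does not find any matches. The hospital field holds values such as LEEDS TEACHING HOSPITALS NHS TRUST, and the same entry exists in postcodes_all.

Is there a way to investigate why it fails? I am a doctor trying to learn Python and pandas for data analysis, so these are very early steps.

I am not sure whether it fails because I defined the wrong data type somewhere, or because I am trying to match two fundamentally different columns, and I would like to be able to check my failing code.

Apologies if this post is both vague and terse; I wrote it while rushing off to clinic.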

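Update.

Following Joe's comment below, I have simplified things. From the spine csv I load only the column 'Trust', and in the postcodes csv I set the index column to 'Trust'.

I am now definitely comparing the hospital names in spine with the index field in postcodes, and I now get a KeyError on 'Trust'.

My code is here: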

import pandas as pd

spine = pd.read_csv('~/Dropbox/Work/NNAP/Spine/Kate_W/kate_spine2.csv', usecols = ['Trust'])



spine.head()

Trust
0   THE WALTON CENTRE NHS FOUNDATION TRUST
1   CAMBRIDGE UNIVERSITY HOSPITALS NHS FOUNDATION ...
2   KING'S COLLEGE HOSPITAL NHS FOUNDATION TRUST
3   LEEDS TEACHING HOSPITALS NHS TRUST
4   NT424

postcodes_all = pd.read_csv('all_all.csv', index_col = 'Trust')


postcodes_all.head()

                                                    Unnamed: 0  postcode
Trust
MANCHESTER UNIVERSITY NHS FOUNDATION TRUST                   0   M13 9WL
SOUTH TYNESIDE AND SUNDERLAND NHS FOUNDATION TRUST           1   SR4 7TP
WORCESTERSHIRE HEALTH AND CARE NHS TRUST                     2   WR5 1JR
SOLENT NHS TRUST                                             3  SO19 8BR
SHROPSHIRE COMMUNITY HEALTH NHS TRUST                        4   SY3 8XL

To make sure I was using a Series rather than a DataFrame, I added ['Trust'] to the code as follows.

map1 = spine['Trust'].map(postcodes_all['Trust'])

KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Trust'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-68-921448f7c401> in <module>
----> 1 map1 = spine['Trust'].map(postcodes_all['Trust'])

~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2993             if self.columns.nlevels > 1:
   2994                 return self._getitem_multilevel(key)
-> 2995             indexer = self.columns.get_loc(key)
   2996             if is_integer(indexer):
   2997                 indexer = [indexer]

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Trust'

I don't know why this is incorrect, or what the KeyError means.

python

pandas

1 Answer

Stack Overflow user

Accepted answer

Answered 2019-12-09 16:19:11

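You get all NaN values because none of the values in spine['Trust'] are found in the index of postcodes_all['Trust_title']. map() replaces old values with new ones, and it needs key-value pairs to know which new value to use for each old value. When you pass it a Series, it uses that Series' index as the keys and its values as the replacement values. Here the index of postcodes_all['Trust_title'] is the default RangeIndex (0 to 14205), so no hospital name can match a key.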
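To make the key-value idea concrete, here is a minimal sketch on toy data (the names `animals` and `diet` are made up for illustration). It assumes that passing a Series to map() gives the same result as passing a dict built from that Series' index and values, and that any value with no matching key becomes NaN.

```python
import pandas as pd

animals = pd.Series(['cat', 'dog', 'rabbit'])

# Lookup Series: the index holds the keys, the values hold the replacements.
# 'rabbit' is deliberately left out to show what a missing key produces.
diet = pd.Series(['carnivore', 'omnivore'], index=['cat', 'dog'])

via_series = animals.map(diet)           # keys come from diet.index
via_dict = animals.map(diet.to_dict())   # same lookup written as a dict

print(via_series)
print(via_dict)
```

Both calls should map 'cat' and 'dog' and leave 'rabbit' as NaN, because 'rabbit' is not among the keys.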

A useful way to debug a situation like this is to try a simpler example, such as the one in the pandas documentation you linked. Here is one.

import pandas as pd

s = pd.Series(['cat', 'dog', 'rabbit'])
s

## Output
0       cat
1       dog
2    rabbit
dtype: object

s2 = pd.Series(['carnivore', 'omnivore', 'herbivore'])
s2

## Output
0    carnivore
1     omnivore
2    herbivore
dtype: object

s.map(s2)

## Output
0    NaN
1    NaN
2    NaN
dtype: object

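NaN is returned because pandas cannot find any match between the values in s and the index of s2. Setting the index of s2 to the values of s solves the problem.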

# Set the values from `s` as the index in `s2`
s2.index = s
s2

## Output
cat       carnivore
dog        omnivore
rabbit    herbivore
dtype: object

s.map(s2)

## Output
0    carnivore
1     omnivore
2    herbivore
dtype: object

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/59245951
