
pandas data output is incorrect

Stack Overflow user
Asked 2021-06-02 03:05:33

I have a CSV file that looks like this:

start_date,end_date,pollster,sponsor,sample_size,population,party,subject,tracking,text,approve,disapprove,url
2020-02-02,2020-02-04,YouGov,Economist,1500,a,all,Trump,FALSE,Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?,42,29,https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
2020-02-02,2020-02-04,YouGov,Economist,376,a,R,Trump,FALSE,Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?,75,6,https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
2020-02-02,2020-02-04,YouGov,Economist,523,a,D,Trump,TRUE,Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?,21,51,https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
2020-02-02,2020-02-04,YouGov,Economist,599,a,I,Trump,,Do you approve or disapprove of Donald Trump’s handling of the coronavirus outbreak?,39,25,https://d25d2506sfb94s.cloudfront.net/cumulus_uploads/document/73jqd6u5mv/econTabReport.pdf
2020-02-07,2020-02-09,Morning Consult,"",2200,a,all,Trump,TRUE,Do you approve or disapprove of the job each of the following is doing in handling the spread of coronavirus in the United States? President Donald Trump,57,22,https://morningconsult.com/wp-content/uploads/2020/02/200214_crosstabs_CORONAVIRUS_Adults_v4_JB.pdf

I'm interested in the "tracking" column, whose values are "TRUE", "FALSE" or NaN.

For some reason, when I read the file with pandas, all the "tracking" column values are loaded as "False":

import pandas as pd

data = pd.read_csv("covid_approval_polls.csv")
data.head()

   start_date  end_date    pollster         sponsor    sample_size  population  party  subject  tracking  text                                               approve  disapprove  url
0  2020-02-02  2020-02-04  YouGov           Economist  1500.0       a           all    Trump    False     Do you approve or disapprove of Donald Trump’s...  42.0     29.0        https://d25d2506sfb94s.cloudfront.net/cumulus_...
1  2020-02-02  2020-02-04  YouGov           Economist  376.0        a           R      Trump    False     Do you approve or disapprove of Donald Trump’s...  75.0     6.0         https://d25d2506sfb94s.cloudfront.net/cumulus_...
2  2020-02-02  2020-02-04  YouGov           Economist  523.0        a           D      Trump    False     Do you approve or disapprove of Donald Trump’s...  21.0     51.0        https://d25d2506sfb94s.cloudfront.net/cumulus_...
3  2020-02-02  2020-02-04  YouGov           Economist  599.0        a           I      Trump    False     Do you approve or disapprove of Donald Trump’s...  39.0     25.0        https://d25d2506sfb94s.cloudfront.net/cumulus_...
4  2020-02-07  2020-02-09  Morning Consult  NaN        2200.0       a           all    Trump    False     Do you approve or disapprove of the job each o...  57.0     22.0        https://morningconsult.com/wp-content/uploads/.

...

When I look up the unique values of that column with:

data.tracking.unique()

I get the correct output:

array([False, True, nan], dtype=object)

But when I run:

print(data[data["tracking"] == "FALSE"])

I get:

Empty DataFrame
Columns: [start_date, end_date, pollster, sponsor, sample_size, population, party, subject, tracking, text, approve, disapprove, url]
Index: []

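I'm fairly sure I'm missing something here, but I can't tell what is causing the problem. I want to select the rows where the "tracking" column has the value "FALSE".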


1 Answer

Stack Overflow user

Answered 2021-06-02 03:31:30

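To force the column's type, so that "tracking" is read as text rather than parsed into booleans, use the dtype parameter: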

data = pd.read_csv("covid_approval_polls.csv", dtype={"tracking": str})
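
As a rough illustration of why the original filter came back empty, here is a minimal sketch. It assumes a small in-memory stand-in for covid_approval_polls.csv (the `CSV_TEXT` string and the variable names are hypothetical, and only three columns are kept). With default parsing, pandas turns TRUE/FALSE into booleans, so comparing against the string "FALSE" matches nothing. Comparing against the boolean `False`, or reading the column as `str` as above, should both select the expected rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for covid_approval_polls.csv, reduced to three columns.
CSV_TEXT = """pollster,party,tracking
YouGov,all,FALSE
YouGov,R,FALSE
YouGov,D,TRUE
YouGov,I,
Morning Consult,all,TRUE
"""

# Default parsing: TRUE/FALSE become booleans and the empty cell becomes NaN.
inferred = pd.read_csv(io.StringIO(CSV_TEXT))

# A boolean never equals the string "FALSE", so this selects no rows.
no_match = inferred[inferred["tracking"] == "FALSE"]

# Comparing with the boolean False selects the FALSE rows (NaN is excluded).
bool_match = inferred[inferred["tracking"] == False]  # noqa: E712

# Forcing str keeps the raw text, so the string comparison now works.
as_text = pd.read_csv(io.StringIO(CSV_TEXT), dtype={"tracking": str})
text_match = as_text[as_text["tracking"] == "FALSE"]

print(len(no_match), len(bool_match), len(text_match))
```

Which option fits better depends on what happens next: keeping the inferred booleans is convenient for logical operations, while `dtype={"tracking": str}` preserves the file's original spelling of each value.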
Votes: 2
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud XiaoWei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/67794674
