首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >在Python语言中,使用pandasql: query返回"Empty DataFrame“

在Python语言中,使用pandasql: query返回"Empty DataFrame“
EN

Stack Overflow用户
提问于 2019-03-21 13:49:07
回答 1查看 911关注 0票数 0

在Python语言中,使用pandasql: query返回"Empty DataFrame“

代码语言:javascript
复制
import pandas as pd
import sqlite3 as db
import pandasql
    dataSet = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",header=None)
    type(dataSet)
    dataSet.columns = ['age', 'workclass','fnlwgt','education','education_num','marital_status','occupation','relationship'
                ,'race','sex','capital_gain','capital_loss','hours_per_week','native_country','salary']
    dataSet.head()
    from pandasql import sqldf
    q1 = "select distinct sex from dataSet where sex='Male';"
    pysqldf = lambda q: sqldf(q, globals())
    print(pysqldf(q1))
EN

回答 1

Stack Overflow用户

发布于 2019-03-21 16:26:16

对于这个数据集,我检查了实际数据,并在列中找到了空格。

因此,首先我们需要对数据执行清理,然后才能对其执行转换。

为了进行清理,我们需要修剪空白区域。为此,我编写了函数trim_all_the_columns,该函数将删除所有空格

上述数据集的代码

代码语言:javascript
复制
#!/usr/bin/env python
# coding: utf-8

#import required packages

import pandas as pd
import sqlite3 as db
import pandasql as ps
from pandasql import sqldf 

#give the path location from where data is loaded, in your case give "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"

inputpath=r'C:\Users\Z0040B9K\Desktop\SIGShowcase\adult.data.txt'

#Trim whitespace from ends of each value across all series in dataframe

def trim_all_the_columns(df):
    trim_strings = lambda x: x.strip() if type(x) is str else x
    return df.applymap(trim_strings)

#creating dataframe
dataSet = pd.read_csv(inputpath,header=None)

#calling trim function over the dataframe to remove all whitespaces
dataSet = trim_all_the_columns(dataSet)

type(dataSet)

dataSet.columns = ['age', 'workclass','fnlwgt','education','education_num','marital_status','occupation','relationship' ,'race','sex','capital_gain','capital_loss','hours_per_week','native_country','salary']

#sql query
q1 = "select distinct sex from dataSet where sex='Male';"
#this will return distinct result of **Male** column and that will be 0 
# If you add any other column like **age** or something else you will get result

#q1 = "select distinct age,sex from dataSet where sex='Male';"

pysqldf = lambda q: sqldf(q, globals()) 

#print result
print(pysqldf(q1))

#this also can be use to print result 

print(ps.sqldf(q1, locals()))

查找结果输出: for query

代码语言:javascript
复制
q1 = "select distinct age, sex from dataSet where sex='Male';"

查找查询的结果输出

代码语言:javascript
复制
 q1 = "select distinct sex from dataSet where sex='Male';"

票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/55274448

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档