I am trying to loop over a list with pandasql's sqldf, but sqldf does not seem to pick up the loop variable. Here is a stylized outline of my problem:
import pandas as pd
from pandasql import sqldf
from datetime import datetime
FreqGamePlay = pd.DataFrame({'CONTACT_WID': [1, 2, 3, 1, 4],
                             'TITLE_NOMIN_DT': pd.to_datetime(['20130102', '20140103', '20120518',
                                                               '20140317', '20111123']),
                             'FreqGamePlay': [12, 9, 22, 4, 5]})
FreqGamePlay = FreqGamePlay[['CONTACT_WID', 'TITLE_NOMIN_DT', 'FreqGamePlay']]
periodsList = ['2012-12-26', '2012-02-28']
for i in periodsList:
    temp = sqldf("select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay where TITLE_NOMIN_DT > i group by CONTACT_WID;", globals())
    print(temp)

The program above raises the following error:
PandaSQLException: (sqlite3.OperationalError) no such column: i SQL: 'select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay where TITLE_NOMIN_DT > i group by CONTACT_WID;'
However, if I hard-code the date by hand, it works fine:
for i in periodsList:
    temp = sqldf("select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay where TITLE_NOMIN_DT > '2012-12-26' group by CONTACT_WID;", globals())
    print(temp)

But this approach is not practical, because the real program has a much larger list of dates. Any suggestions would be appreciated, thanks.

Answered on 2018-02-08 14:51:23
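This happens because the "i" is written directly inside the SQL string, so Python treats it as part of the string and never evaluates the variable (note that in the error message, i has not been replaced by its value). SQLite then receives a bare i and looks for a column with that name. I suggest reading up on how strings and variables work in Python. In the meantime, try the following: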
for i in periodsList:
    query = "select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay where TITLE_NOMIN_DT > '{}' group by CONTACT_WID;".format(i)
    temp = sqldf(query, globals())

The curly braces act as a placeholder for the variable, and the format() method replaces the placeholder with the variable's value.
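
To see the difference without pandasql, here is a minimal sketch that uses Python's built-in sqlite3 module in place of sqldf. The in-memory table is only a hypothetical stand-in for the FreqGamePlay DataFrame (dates stored as ISO text), and the `order by` clause is an addition so that the row order is predictable. It should reproduce the "no such column" error for the bare i, and then run the formatted query once per date.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the FreqGamePlay DataFrame.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table FreqGamePlay "
    "(CONTACT_WID integer, TITLE_NOMIN_DT text, FreqGamePlay integer)"
)
conn.executemany(
    "insert into FreqGamePlay values (?, ?, ?)",
    [
        (1, "2013-01-02", 12),
        (2, "2014-01-03", 9),
        (3, "2012-05-18", 22),
        (1, "2014-03-17", 4),
        (4, "2011-11-23", 5),
    ],
)

periodsList = ["2012-12-26", "2012-02-28"]

# A bare i inside the string is just text; SQLite reads it as a column name.
try:
    conn.execute("select CONTACT_WID from FreqGamePlay where TITLE_NOMIN_DT > i;")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print(error_message)

# The {} placeholder is filled with the current date on every iteration.
template = (
    "select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay "
    "where TITLE_NOMIN_DT > '{}' group by CONTACT_WID order by CONTACT_WID;"
)
results = {}
for i in periodsList:
    query = template.format(i)
    results[i] = conn.execute(query).fetchall()
    print(query)
    print(results[i])
```

Printing the query inside the loop is a quick way to confirm that the date, not the letter i, ends up in the SQL text.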
https://stackoverflow.com/questions/48686424