I'm very new to PandaSQL and have never used it before. Here is my code so far:
import pandas as pd
from pandasql import sqldf
import numpy as np
tasks = pd.read_csv("C:/Users/RMahesh/Documents/TASKS_Final_2.csv", encoding='cp1252')
query = """SELECT Work Item Id, Parent Work Item Id, MAX(Remaining Work)
FROM TASKS
GROUP BY Work Item Id, Parent Work Item Id;"""
df = sqldf(query, locals())
print(df.head(5))
I get this error:
pandasql.sqldf.PandaSQLException: (sqlite3.OperationalError) near "Id": syntax error [SQL: 'SELECT Work Item Id, Parent Work Item Id, MAX(Remaining Work) \n'
Any help would be much appreciated!
Edit: After applying some of the suggestions from other users below, here is my working code:
import pandas as pd
from pandasql import sqldf
import numpy as np
tasks = pd.read_csv("C:/Users/RMahesh/Documents/TASKS_Final_2.csv", encoding='cp1252', low_memory=False)
query = """SELECT [Work Item Id], [Parent Work Item Id], MAX([Remaining Work])
FROM tasks
GROUP BY [Work Item Id], [Parent Work Item Id];"""
print(sqldf(query, locals()))
Posted on 2018-06-13 02:38:14
If your column names contain spaces, you must quote them for the SQL to be valid:
query = """SELECT `Work Item Id`, `Parent Work Item Id`, MAX(`Remaining Work`)
FROM tasks
GROUP BY `Work Item Id`, `Parent Work Item Id`;"""
or
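query = """SELECT [Work Item Id], [Parent Work Item Id], MAX([Remaining Work])
FROM tasks
GROUP BY [Work Item Id], [Parent Work Item Id];"""
Which form works depends on the SQL dialect PandaSQL is using. The error in the question comes from sqlite3, and SQLite accepts both backticks and square brackets around identifiers.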
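To check the quoting behaviour without the original CSV, here is a minimal sketch that talks to SQLite directly through Python's built-in sqlite3 module instead of going through pandasql. It assumes pandasql is on its SQLite backend, which is what the sqlite3.OperationalError in the question suggests. The table rows and the helper run_grouped_max are made up for illustration.

```python
import sqlite3

# Hypothetical in-memory table standing in for the CSV from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tasks ("Work Item Id" INTEGER, '
    '"Parent Work Item Id" INTEGER, "Remaining Work" REAL)'
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [(1, 10, 2.0), (1, 10, 5.0), (2, 10, 3.0)],
)


def run_grouped_max(open_q, close_q):
    """Run the grouped MAX query, wrapping each column name in the given quotes."""
    def q(name):
        return f"{open_q}{name}{close_q}"

    sql = (
        f"SELECT {q('Work Item Id')}, {q('Parent Work Item Id')}, "
        f"MAX({q('Remaining Work')}) FROM tasks "
        f"GROUP BY {q('Work Item Id')}, {q('Parent Work Item Id')} "
        f"ORDER BY {q('Work Item Id')}"
    )
    return conn.execute(sql).fetchall()


# With no quoting at all, SQLite rejects the query as in the question.
try:
    run_grouped_max("", "")
    unquoted_error = None
except sqlite3.OperationalError as exc:
    unquoted_error = str(exc)
print("unquoted:", unquoted_error)

# Backticks, square brackets, and standard double quotes all parse.
results = {
    "backticks": run_grouped_max("`", "`"),
    "brackets": run_grouped_max("[", "]"),
    "double quotes": run_grouped_max('"', '"'),
}
for style, rows in results.items():
    print(style, rows)
```

All three quoting styles should return the same rows here. Double quotes are the standard SQL way to quote an identifier, while backticks and square brackets are forms SQLite accepts for compatibility with other databases, so they may not carry over if a different backend is used.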
https://stackoverflow.com/questions/50823574