dask==2.27.0
python==3.8.3
Operating System == centos7

Is it possible to pass a SQLAlchemy text query to ddf.read_sql_table?
I get a TypeError when the text query is passed as the table argument of read_sql_table, as described in the documentation.
Code:
from sqlalchemy.sql import text
from sqlalchemy.engine import create_engine
import dask.dataframe as ddf
DIALECT = '<value>'
SQL_DRIVER= '<value>'
USERNAME= '<value>'
PASSWORD = '<value>'
HOSTNAME = '<value>'
PORT = '<value>'
SID = '<value>'
ENGINE_PATH = DIALECT + '+' + SQL_DRIVER + '://' + USERNAME + ':' + PASSWORD +'@' + HOSTNAME + ':' + str(PORT) + '/' + SID
s = text("My complicated sql query")
df = ddf.read_sql_table(s, ENGINE_PATH, index_col='id', npartitions=10)

Error seen:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/synology/data/ganesh/arun/code_jan_27/art_rematching/venv_3.8_50/lib/python3.8/site-packages/dask/dataframe/io/sql.py", line 115, in read_sql_table
index = table.columns[index_col] if isinstance(index_col, str) else index_col
TypeError: 'method' object is not subscriptable

Answered 2021-03-05 13:41:43
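The traceback points at the line `table.columns[index_col]`. A likely reading is that on a `text()` object `columns` is a method rather than a collection of columns, so subscripting it fails, while on a `Table` it is a collection that can be indexed by name. Here is a minimal sketch of that difference using only SQLAlchemy. The query string, the table name `my_view` and the column `id` are made up for illustration:

```python
from sqlalchemy import Column, Integer, MetaData, Table
from sqlalchemy.sql import text

# Hypothetical text query standing in for "My complicated sql query".
s = text("SELECT 1 AS id")

# On a text() construct, `columns` is a bound method, not a column collection.
columns_is_method = callable(s.columns)

try:
    s.columns["id"]  # same kind of access as in dask's read_sql_table
    subscript_error = None
except TypeError as exc:
    subscript_error = exc

# On a Table, `columns` is a collection that can be indexed by column name.
t = Table("my_view", MetaData(), Column("id", Integer, primary_key=True))
index_column_name = t.columns["id"].name

print(columns_is_method)   # True
print(subscript_error)     # the TypeError raised by subscripting a method
print(index_column_name)   # id
```

This is why passing a `Table` object that declares the index column avoids the error.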
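So, dask currently does not support complex queries passed through the text mechanism (as of v2021.02.0). My workaround is as follows:
Save the query as a view in the database, then read that view with read_sql_table through a SQLAlchemy Table object.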
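Saving the query as a view could look like the sketch below. It uses Python's built-in sqlite3 module and a throwaway database file as a stand-in for the real database. The table `orders`, the view `paid_orders` and the query itself are hypothetical, and the exact `CREATE VIEW` syntax may differ between database dialects:

```python
import os
import sqlite3
import tempfile

# Hypothetical stand-in database: a throwaway SQLite file.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
conn = sqlite3.connect(db_path)

# Hypothetical source table with a few rows.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders (id, amount, status) VALUES (?, ?, ?)",
    [(1, 10.0, "paid"), (2, 25.5, "open"), (3, 7.25, "paid")],
)

# Stand-in for "My complicated sql query".
complicated_query = "SELECT id, amount FROM orders WHERE status = 'paid'"

# Save the query as a view; the view keeps the integer key column `id`,
# which can later serve as index_col.
conn.execute("CREATE VIEW paid_orders AS " + complicated_query)
conn.commit()

# The view can now be queried like a table.
rows = conn.execute("SELECT id, amount FROM paid_orders ORDER BY id").fetchall()
print(rows)  # [(1, 10.0), (3, 7.25)]
conn.close()
```

In this sketch, `paid_orders` would be the view name and `id` the primary key column handed to read_sql_table.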
Here is a simple example:
from sqlalchemy import Table, MetaData, Column, Integer
import dask.dataframe as ddf
import multiprocessing
# dialect, user, password, host, port and dbName are placeholders to fill in
uri = f'{dialect}://{user}:{password}@{host}:{port}/{dbName}'
view = '[NAME_OF_VIEW]'
schema = '[NAME_OF_SCHEMA]'
pkey = '[PRIMARY_KEY_COLUMN]'
# Describe the view as a Table with its primary-key column declared
myview = Table(view, MetaData(schema=schema), Column(pkey, Integer, primary_key=True))
df = ddf.read_sql_table(table=myview, uri=uri, index_col=pkey, schema=schema, npartitions=multiprocessing.cpu_count()*3)

https://stackoverflow.com/questions/64040673