I'm new to Hadoop and Impala. I managed to connect to Impala by installing impyla and running the code below. The connection is authenticated via LDAP:
from impala.dbapi import connect
from impala.util import as_pandas
conn = connect(host="server.lrd.com", port=21050, database='tcad', auth_mechanism='PLAIN', user="alexcj", use_ssl=True, timeout=20, password="secret1pass")
I can then get a cursor and run queries, like this:
cursor = conn.cursor()
cursor.execute('SELECT * FROM tab_2014_m LIMIT 10')
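df = as_pandas(cursor)
I would like to connect to Impala through SQLAlchemy so that I can use some of its nicer features. I found a test file in the impyla source code that shows how to create a SQLAlchemy engine with the Impala driver, like this: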
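engine = create_engine('impala://localhost')
I'd like to do the same, but I can't, because my call to connect above takes more arguments, and I don't know how to pass them to SQLAlchemy's create_engine to get a successful connection. Has anyone done this? Thanks.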
Answered 2016-11-07 19:47:59
As explained at https://github.com/cloudera/impyla/issues/214:
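import sqlalchemy
from impala.dbapi import connect

# user and pwd are assumed to be defined elsewhere (your LDAP credentials)
def conn():
    # SQLAlchemy calls this with no arguments whenever it needs a new connection
    return connect(host='some_host',
                   port=21050,
                   database='default',
                   timeout=20,
                   use_ssl=True,
                   ca_cert='some_pem',
                   user=user, password=pwd,
                   auth_mechanism='PLAIN')

# The URL only selects the dialect; the connection details come from creator
engine = sqlalchemy.create_engine('impala://', creator=conn)
Answered 2017-09-25 10:21:47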
import time
from sqlalchemy import create_engine, MetaData, Table, select, and_

# host, port and database must be set to your own values
ENGINE = create_engine(
    'impala://{host}:{port}/{database}'.format(
        host=host,  # your host
        port=port,
        database=database,
    )
)
# MetaData(engine) and autoload=True are legacy (pre-2.0) SQLAlchemy APIs
METADATA = MetaData(ENGINE)
TABLES = {
    'table': Table('table_name', METADATA, autoload=True),
}
Answered 2021-12-22 18:58:13
If your Impala is secured with Kerberos, the script below works (for some reason I had to use hive:// instead of impala://):
import sqlalchemy
from sqlalchemy.engine import create_engine

# These keys are passed through to the underlying driver's connect call
connect_args = {'auth': 'KERBEROS', 'kerberos_service_name': 'impala'}
engine = create_engine('hive://impalad-host:21050', connect_args=connect_args)
conn = engine.connect()
# Passing a raw SQL string to execute() works on SQLAlchemy versions before 2.0
result = conn.execute("SELECT * FROM db1.table1 LIMIT 5")
print(result.fetchall())
Source: https://stackoverflow.com/questions/39582842
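A note on the creator approach from the 2016 answer: SQLAlchemy calls the creator callable with no arguments, so every connection setting has to be bound in advance. The sketch below shows one way to do that generically. The helper name make_creator and the stand-in fake_connect are hypothetical and exist only so the example runs without a cluster; with impyla installed you would pass impala.dbapi.connect instead.

```python
def make_creator(connect_fn, **connect_kwargs):
    """Return a zero-argument callable that opens a new DBAPI connection.

    All connection settings are captured here, because the callable
    itself will be invoked without any arguments.
    """
    def creator():
        return connect_fn(**connect_kwargs)
    return creator


# Hypothetical stand-in for impala.dbapi.connect: it just echoes its
# keyword arguments so the sketch can run without an Impala server.
def fake_connect(**kwargs):
    return dict(kwargs)


creator = make_creator(
    fake_connect,
    host="some_host",
    port=21050,
    auth_mechanism="PLAIN",
    use_ssl=True,
)
connection = creator()  # no arguments needed at call time

# With impyla and SQLAlchemy installed, the same helper would be used as:
#   from impala.dbapi import connect
#   engine = sqlalchemy.create_engine(
#       'impala://', creator=make_creator(connect, host=..., port=21050))
```

Keeping the settings in one place this way avoids relying on module-level variables such as user and pwd inside the creator function.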