How do I install the odbc package on a Databricks cluster?

Stack Overflow user
Asked 2019-04-04 22:04:51
1 answer · 2K views · 0 followers · score 3

I need to access an Azure SQL database from an R notebook in Databricks. To do that I plan to use the odbc package, which installs fine on my local R instance.

I tried installing the package on the cluster through the Databricks UI, but it always fails. I also tried the following code in a notebook:

install.packages("odbc")

This results in:

Installing package into ‘/databricks/spark/R/lib’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/odbc_1.1.6.tar.gz'
Content type 'application/x-gzip' length 288033 bytes (281 KB)
==================================================
downloaded 281 KB

* installing *source* package ‘odbc’ ...
** package ‘odbc’ successfully unpacked and MD5 sums checked
PKG_CFLAGS=
PKG_LIBS=-lodbc
<stdin>:1:17: fatal error: sql.h: No such file or directory
compilation terminated.
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because odbc was not found. Try installing:
 * deb: unixodbc-dev (Debian, Ubuntu, etc)
 * rpm: unixODBC-devel (Fedora, CentOS, RHEL)
 * csw: unixodbc_dev (Solaris)
 * brew: unixodbc (Mac OSX)
To use a custom odbc set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------------------------------------------------
ERROR: configuration failed for package ‘odbc’
* removing ‘/databricks/spark/R/lib/odbc’

The downloaded source packages are in
    ‘/tmp/RtmpqHp2QM/downloaded_packages’

I also tried installing from GitHub:

library(devtools)
devtools::install_github("r-dbi/odbc")

This gives a different error:

Downloading GitHub repo r-dbi/odbc@master
Installing 3 packages: assertthat, BH, Rcpp
Installing packages into ‘/databricks/spark/R/lib’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/assertthat_0.2.1.tar.gz'
Content type 'application/x-gzip' length 12742 bytes (12 KB)
==================================================
downloaded 12 KB

trying URL 'https://cloud.r-project.org/src/contrib/BH_1.69.0-1.tar.gz'
Content type 'application/x-gzip' length 12378154 bytes (11.8 MB)
==================================================
downloaded 11.8 MB

trying URL 'https://cloud.r-project.org/src/contrib/Rcpp_1.0.1.tar.gz'
Content type 'application/x-gzip' length 3661123 bytes (3.5 MB)
==================================================
downloaded 3.5 MB

* installing *source* package ‘assertthat’ ...
** package ‘assertthat’ successfully unpacked and MD5 sums checked
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (assertthat)
* installing *source* package ‘BH’ ...
** package ‘BH’ successfully unpacked and MD5 sums checked
** inst
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (BH)
* installing *source* package ‘Rcpp’ ...
** package ‘Rcpp’ successfully unpacked and MD5 sums checked
** libs
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include/     -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c Date.cpp -o Date.o
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include/     -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c Module.cpp -o Module.o
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include/     -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c Rcpp_init.cpp -o Rcpp_init.o
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include/     -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c api.cpp -o api.o
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include/     -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c attributes.cpp -o attributes.o
g++  -I/usr/share/R/include -DNDEBUG -I../inst/include/     -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c barrier.cpp -o barrier.o
g++ -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o Rcpp.so Date.o Module.o Rcpp_init.o api.o attributes.o barrier.o -L/usr/lib/R/lib -lR
installing to /databricks/spark/R/lib/Rcpp/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (Rcpp)

The downloaded source packages are in
    ‘/tmp/RtmpqHp2QM/downloaded_packages’
Error in processx::run(bin, args = real_cmdargs, stdout_line_callback = real_callback(stdout),  : 
  System command error
In addition: Warning messages:
1: In install.packages("odbc") :
  installation of package ‘odbc’ had non-zero exit status
2: In install.packages("odbc") :
  installation of package ‘odbc’ had non-zero exit status

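Do you know why this package won't install on Databricks, when it works fine locally and every other package I have tried to install on Databricks with the same syntax has worked?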

1 Answer

Stack Overflow user

Answered 2019-04-05 20:03:14

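The best option for accessing a SQL database is to use the pre-installed JDBC connection (see the Documentation). If you want to use odbc, it requires unixODBC, as mentioned in one of the comments; your error log says the same thing when it reports that sql.h is missing and suggests installing unixodbc-dev. The best practice for installing multiple packages is to use init scripts. The Python code below creates an init script for a pyodbc installation; its first apt-get step installs the same unixodbc-dev package that the error message asks for.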

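script = """#!/bin/bash
# System ODBC libraries and headers (unixodbc-dev provides sql.h)
sudo apt-get -q -y install unixodbc unixodbc-dev
sudo apt-get -q -y install python3-dev
sudo pip install pyodbc
# Register the Microsoft package repository; sudo is applied to the
# commands that actually need root (apt-key and the file write)
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list > /dev/null
sudo apt-get update
# Install the Microsoft ODBC driver for SQL Server, accepting its EULA non-interactively
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql
"""

# Write the script to DBFS; the final True overwrites an existing file.
dbutils.fs.put("/databricks/init/pyodbc/pyodbc.sh", script, True)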
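
The script above targets pyodbc, while the question is about the R odbc package. A minimal sketch of the same idea for R might look like the following. The helper name `build_r_odbc_init_script`, the CRAN mirror default, and the target path in the final comment are assumptions for illustration, not part of the original answer, and the script assumes `Rscript` is on the node's PATH and installs into a library that notebooks can see. To actually connect to Azure SQL you would most likely still need the Microsoft driver lines from the script above.

```python
# Hypothetical helper (not from the original answer): builds an init script
# that installs the system ODBC headers and then the R odbc package.
def build_r_odbc_init_script(cran_repo="https://cloud.r-project.org"):
    lines = [
        "#!/bin/bash",
        "sudo apt-get update",
        # unixodbc-dev ships sql.h, the header reported missing in the error log
        "sudo apt-get -q -y install unixodbc unixodbc-dev",
        # Assumption: Rscript is available and its default library is the one
        # the notebook uses
        f"sudo Rscript -e 'install.packages(\"odbc\", repos=\"{cran_repo}\")'",
    ]
    return "\n".join(lines) + "\n"


script = build_r_odbc_init_script()
print(script)

# On Databricks, where dbutils is predefined, the script could then be stored
# at a path of your choosing (this path is an assumption):
# dbutils.fs.put("/databricks/init/odbc/r-odbc.sh", script, True)
```

Building the script in a function keeps the text easy to inspect before it is written to DBFS, which helps when an init script fails silently at cluster start.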

Hope this helps.
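
For the JDBC route recommended at the start of this answer, here is a rough sketch of how the connection URL for an Azure SQL database is usually assembled. The helper name `build_azure_sql_jdbc_url` and the server, database, and table names are placeholders, and the commented-out read assumes a Databricks Python notebook where `spark` is predefined. SparkR has a similar `read.jdbc` function for R notebooks.

```python
# Hypothetical helper: assembles a SQL Server JDBC URL for Azure SQL.
# "myserver" and "mydb" below are placeholders, not values from the question.
def build_azure_sql_jdbc_url(server, database, port=1433):
    host = f"{server}.database.windows.net"
    return f"jdbc:sqlserver://{host}:{port};database={database};encrypt=true"


url = build_azure_sql_jdbc_url("myserver", "mydb")
print(url)

# In a Databricks notebook (spark is predefined there), the URL could be used as:
# df = spark.read.jdbc(
#     url=url,
#     table="dbo.my_table",  # placeholder table name
#     properties={"user": "<user>", "password": "<password>"},
# )
```

This avoids compiling anything on the cluster, since no extra system libraries are needed.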

Score: 2
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/55517978