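This is a basic question, but I am trying to retrieve the contents of a file using Scala code in a Bluemix notebook (Analytics for Apache Spark), and authentication errors keep coming up. Does anyone have a Scala authentication example for accessing a file? Thank you in advance!

I tried the following simple script:

val file = sc.textFile("swift://notebooks.keystone/kdd99.data")
file.take(1)

I also tried: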
def setConfig(name: String): Unit = {
  val pfx = "fs.swift.service." + name
  val conf = sc.getConf
  conf.set(pfx + "auth.url", "hardcoded")
  conf.set(pfx + "tenant", "hardcoded")
  conf.set(pfx + "username", "hardcoded")
  conf.set(pfx + "password", "hardcoded")
  conf.set(pfx + "apikey", "hardcoded")
  conf.set(pfx + "auth.endpoint.prefix", "endpoints")
}
setConfig("keystone")

I also tried the script from an earlier question:
import scala.collection.breakOut

val name = "keystone"
val YOUR_DATASOURCE = """auth_url:https://identity.open.softlayer.com
project: hardcoded
project_id: hardcoded
region: hardcoded
user_id: hardcoded
domain_id: hardcoded
domain_name: hardcoded
username: hardcoded
password: hardcoded
filename: hardcoded
container: hardcoded
tenantId: hardcoded
"""
val settings: Map[String, String] = YOUR_DATASOURCE.split("\\n").
  map(l => (l.split(":", 2)(0).trim(), l.split(":", 2)(1).trim()))(breakOut)
val conf = sc.getConf
conf.set("fs.swift.service.keystone.auth.url", settings.getOrElse("auth_url", ""))
conf.set("fs.swift.service.keystone.tenant", settings.getOrElse("tenantId", ""))
conf.set("fs.swift.service.keystone.username", settings.getOrElse("username", ""))
conf.set("fs.swift.service.keystone.password", settings.getOrElse("password", ""))
conf.set("fs.swift.service.keystone.apikey", settings.getOrElse("password", ""))
conf.set("fs.swift.service.keystone.auth.endpoint.prefix", "endpoints")
println("sett: " + settings.getOrElse("auth_url", ""))
val file = sc.textFile("swift://notebooks.keystone/kdd99.data")
/* The following line gives errors */
file.take(1)

The error is:
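Name: org.apache.hadoop.fs.swift.exceptions.SwiftConfigurationException
Message: Missing mandatory configuration option: fs.swift.service.keystone.auth.url

Edit

This would be a good alternative to Python. I tried the following with two different files, using "spark" as the configuration name: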
def set_hadoop_config(credentials):
    prefix = "fs.swift.service." + credentials['name']
    hconf = sc._jsc.hadoopConfiguration()
    hconf.set(prefix + ".auth.url", credentials['auth_url'] + '/v3/auth/tokens')
    hconf.set(prefix + ".auth.endpoint.prefix", "endpoints")
    hconf.set(prefix + ".tenant", credentials['project_id'])
    hconf.set(prefix + ".username", credentials['user_id'])
    hconf.set(prefix + ".password", credentials['password'])
    hconf.setInt(prefix + ".http.port", 8080)
    hconf.set(prefix + ".region", credentials['region'])
    hconf.setBoolean(prefix + ".public", True)

Posted on 2016-05-14 21:44:35
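To access a file from Object Storage in Scala, the following sequence of commands works in a Scala notebook. (The credentials are populated in the cell when you use the "Insert to code" link for a file shown in the notebook's data sources.)

In [1]: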
var credentials = scala.collection.mutable.HashMap[String, String](
  "auth_url"->"https://identity.open.softlayer.com",
  "project"->"object_storage_b3c0834b_0936_4bbe_9f29_ef45e018cec9",
  "project_id"->"68d053dff02e42b1a947457c6e2e3290",
  "region"->"dallas",
  "user_id"->"e7639268215e4830a3662f708e8c4a5c",
  "domain_id"->"2df6373c549e49f8973fb6d22ab18c1a",
  "domain_name"->"639347",
  "username"->"Admin_XXXXXXXXXXXX",
  "password"->"""XXXXXXXXXX""",
  "filename"->"2015_small.csv",
  "container"->"notebooks",
  "tenantId"->"sefe-f831d4ccd6da1f-42a9cf195d79"
)

In [2]:

credentials("name") = "keystone"

In [3]:
// Write the Swift credentials into the Hadoop configuration that sc.textFile reads.
def setHadoopConfig(name: String, tenant: String, url: String, username: String, password: String, region: String) = {
  sc.hadoopConfiguration.set(f"fs.swift.service.$name.auth.url", url + "/v3/auth/tokens")
  sc.hadoopConfiguration.set(f"fs.swift.service.$name.auth.endpoint.prefix", "endpoints")
  sc.hadoopConfiguration.set(f"fs.swift.service.$name.tenant", tenant)
  sc.hadoopConfiguration.set(f"fs.swift.service.$name.username", username)
  sc.hadoopConfiguration.set(f"fs.swift.service.$name.password", password)
  sc.hadoopConfiguration.setInt(f"fs.swift.service.$name.http.port", 8080)
  sc.hadoopConfiguration.set(f"fs.swift.service.$name.region", region)
  sc.hadoopConfiguration.setBoolean(f"fs.swift.service.$name.public", true)
}

In [4]:

setHadoopConfig(credentials("name"), credentials("project_id"), credentials("auth_url"), credentials("user_id"), credentials("password"), credentials("region"))

In [5]:
var testcount = sc.textFile("swift://notebooks.keystone/2015_small.csv")
testcount.count()

In [6]:

testcount.take(1)

Posted on 2016-05-14 19:07:50
I think you need to use "spark" as the config name instead of keystone when trying to access Object Storage.

sc.textFile("swift://notebooks.spark/2015_small.csv")

Here is a working sample.
token=37bff7ab682ee255b753fca485d49de50fed69d2a25217a7c748dd1463222c3b
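Note: consider changing the container name to match your Object Storage. The host part of the URL has the form containername.configname.
You can also substitute your own credentials in the YOUR_DATASOURCE variable in the example above.
notebooks is the default container.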
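
To make the containername.configname point concrete, here is a minimal sketch in plain Scala (no Spark needed to try it). The object and method names (SwiftSettings, hadoopSettings, swiftUrl) are hypothetical helpers made up for illustration. The property names and the credential keys (auth_url, project_id, user_id, password, region) are the ones already used in this thread. It shows that the config name in the swift:// URL has to be the same name used in the fs.swift.service.<configname>.* properties, which would explain why the error above complains about fs.swift.service.keystone.auth.url when the URL says notebooks.keystone.

```scala
// Hypothetical helper, for illustration only: builds the Hadoop properties
// and the swift:// URL from the same config name so the two cannot drift apart.
object SwiftSettings {

  // Property key/value pairs for one Swift service definition.
  // `creds` is assumed to hold the keys produced by "Insert to code".
  def hadoopSettings(configName: String, creds: Map[String, String]): Map[String, String] = {
    val prefix = s"fs.swift.service.$configName"
    Map(
      s"$prefix.auth.url"             -> (creds("auth_url") + "/v3/auth/tokens"),
      s"$prefix.auth.endpoint.prefix" -> "endpoints",
      s"$prefix.tenant"               -> creds("project_id"),
      s"$prefix.username"             -> creds("user_id"),
      s"$prefix.password"             -> creds("password"),
      s"$prefix.http.port"            -> "8080",
      s"$prefix.region"               -> creds("region"),
      s"$prefix.public"               -> "true"
    )
  }

  // swift://<containername>.<configname>/<filename>
  def swiftUrl(container: String, configName: String, fileName: String): String =
    s"swift://$container.$configName/$fileName"
}
```

In a notebook this could be applied with something like SwiftSettings.hadoopSettings("spark", credentials.toMap).foreach { case (k, v) => sc.hadoopConfiguration.set(k, v) }, followed by sc.textFile(SwiftSettings.swiftUrl("notebooks", "spark", "2015_small.csv")). The values go into sc.hadoopConfiguration, as in the other answer, rather than sc.getConf, because getConf returns a copy of the SparkConf and changes to that copy are not seen by the Swift driver.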
Thanks, Charles.
https://stackoverflow.com/questions/37230356