I'm having trouble importing data from an Azure Blob Storage CSV file into Spark from a Jupyter notebook. I'm working through a tutorial on ML and Spark. When I fill in the Jupyter notebook like this:
import sqlContext.implicits._
val flightDelayTextLines = sc.textFile("wasb://sparkcontainer@[my account].blob.core.windows.net/sparkcontainer/Scored_FlightsAndWeather.csv")
case class AirportFlightDelays(OriginAirportCode:String,OriginLatLong:String,Month:Integer,Day:Integer,Hour:Integer,Carrier:String,DelayPredicted:Integer,DelayProbability:Double)
val flightDelayRowsWithoutHeader = flightDelayTextLines.map(s => s.split(",")).filter(line => line(0) != "OriginAirportCode")
val resultDataFrame = flightDelayRowsWithoutHeader.map(
  s => AirportFlightDelays(
    s(0),                 // Airport code
    s(13) + "," + s(14),  // Lat,Long
    s(1).toInt,           // Month
    s(2).toInt,           // Day
    s(3).toInt,           // Hour
    s(5),                 // Carrier
    s(11).toInt,          // DelayPredicted
    s(12).toDouble        // DelayProbability
  )
).toDF()
resultDataFrame.write.mode("overwrite").saveAsTable("FlightDelays")
I get an error like this:
SparkSession available as 'spark'.
<console>:23: error: not found: value sqlContext
import sqlContext.implicits._
^
I also tried a short path, such as ("wasb:///sparkcontainer/Scored_FlightsAndWeather.csv"), with the same error. Any ideas? BR, Marek
Posted on 2018-12-03 16:40:38
Looking at your code snippet, I don't see sqlContext being created anywhere. Please refer to the following code to create a sqlContext first, and then start using it:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
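Since the error banner says `SparkSession available as 'spark'`, you are on Spark 2.x, where you can also skip creating a `SQLContext` and import the implicits from the session the notebook already provides. A minimal sketch (the session name `spark` is taken from the banner, and the path is the one from the question; this assumes the notebook's Spark runtime):

```scala
// Spark 2.x: the notebook already exposes a SparkSession named `spark`,
// so its implicits can be imported directly instead of creating a SQLContext.
import spark.implicits._

// Alternatively, load the CSV through the DataFrame reader, which parses
// the header row and infers column types, avoiding the manual split/map:
val flightDelays = spark.read
  .option("header", "true")       // first line holds the column names
  .option("inferSchema", "true")  // derive Int/Double types from the data
  .csv("wasb://sparkcontainer@[my account].blob.core.windows.net/sparkcontainer/Scored_FlightsAndWeather.csv")

flightDelays.write.mode("overwrite").saveAsTable("FlightDelays")
```

Either route gives you the `.toDF()` implicits; the reader-based one also removes the need to drop the header row by hand.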

https://stackoverflow.com/questions/53571815