I am new to Spark Streaming. When I train ALS inside a Spark Streaming job, it fails with:
java.lang.IllegalArgumentException: requirement failed: No ratings available from MapPartitionsRDD[4] at randomSplit at streaming.scala:15
import org.apache.spark.mllib.recommendation.ALS
import org.apache.spark.mllib.recommendation.Rating
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming._
object streaming {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ALS").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))
    val ratingStream = ssc.textFileStream(directory = "/home/chiao/Downloads/streaming/").map(_.split(',') match { case Array(user, item, rate) => Rating(user.toInt, item.toInt, rate.toInt) })
    val rank = 100
    val numIterations = 12
    val lambda = 0.01
    ratingStream.foreachRDD(ratingRDD => { val testTrain = ratingRDD.randomSplit(Array(0.3, 0.7)) // 30% test, 70% train
      val model = ALS.train(testTrain(1), rank, numIterations, lambda)
      val test = testTrain(0).map { case Rating(subject, activity, freq) => (subject, activity) }
      val prediction = model.predict(test)
    })
    ssc.start()
    ssc.awaitTermination()
  }
}
Posted on 2022-05-27 06:17:54
My data is:
1,10,100
1,12,100
1,13,100
2,10,100
2,11,100
2,13,100
3,10,100
3,12,100
It is saved as user.txt in file:/home/chiao/Downloads/streaming
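
One likely reading of the error, offered as a sketch rather than a confirmed diagnosis: with a batch interval of Seconds(1), most batches contain no new files, so the RDD passed to foreachRDD is empty and ALS.train rejects it with "No ratings available". Note also that textFileStream only picks up files that appear in the directory after the stream has started, so a user.txt that was already there may never be read. The helper below, with the hypothetical names GuardedTraining and trainIfNonEmpty (neither is part of Spark or of the original post), skips training when there is nothing to train on.

```scala
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

object GuardedTraining {
  // Hypothetical helper (an assumption, not from the original post):
  // returns None instead of calling ALS.train on an empty set of ratings.
  def trainIfNonEmpty(
      ratings: RDD[Rating],
      rank: Int,
      numIterations: Int,
      lambda: Double): Option[MatrixFactorizationModel] = {
    if (ratings.isEmpty()) {
      None // no new file arrived during this batch interval
    } else {
      // Keep the 70% training part of the 30/70 split used in the question.
      val train = ratings.randomSplit(Array(0.3, 0.7))(1)
      // With only a handful of ratings, the training split can itself be empty.
      if (train.isEmpty()) None
      else Some(ALS.train(train, rank, numIterations, lambda))
    }
  }
}
```

Inside the streaming job this could be used as ratingStream.foreachRDD { rdd => GuardedTraining.trainIfNonEmpty(rdd, rank, numIterations, lambda).foreach(m => println(m.rank)) }, after which a copy of user.txt would need to be moved into the monitored directory while the job is running.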
https://stackoverflow.com/questions/72401049