Code:
val configDetails2 = configDetails1
  .join(skuDetails, configDetails1.col("sku_num") === skuDetails.col("sku") &&
    configDetails1.col("ccn") === skuDetails.col("ccn"), "left_outer")
  .select(
    configDetails1.col("*"),
    skuDetails.col("part"),
    skuDetails.col("part_description"),
    skuDetails.col("part_qty"))
  .withColumn("item_name", when($"part".isNull, "DBNULL").otherwise($"part"))
  .withColumn("item_description", when($"part_description".isNull, "DBNULL").otherwise($"part_description"))
  .withColumn("item_qty", when($"part_qty".isNull, lit(0)).otherwise($"part_qty"))
  .drop("part", "part_description", "part_qty")

val itemKey = configDetails2.select("item_name").rdd
val itemMaster = itemKey
  .joinWithCassandraTable("dig_master", "item_master")
  .select("buyer", "cfg_name".as("cfg"), "item", "ms_name".as("scheduler"))
  .map(_._2)

Error:
java.lang.IllegalArgumentException: requirement failed: failed to reorder ({ccn#98,sku_num#54,sku#223,part#224,ccn#243},ArrayBuffer(sku_num,ccn,sku,part,ccn)) is not ({ccn#98,ccn#222,sku_num#54,sku#223,part#224,ccn#243},ArrayBuffer(sku_num,ccn,sku,part,ccn,sku,part,ccn,sku_num,ccn,sku,part,ccn))
  at org.apache.spark.sql.cassandra.execution.DSEDirectJoinStrategy.apply(DSEDirectJoinStrategy.scala:69)
  ...
  at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
  at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
  at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
  at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
  at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:74)
  at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
  at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
  at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:66)
  ...
  at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89)
  at org.apache.spark.sql.Dataset.rdd(Dataset.scala:2587)
  at core.CollabStandardConfig$.delayedEndpoint$core$CollabStandardConfig$1(CollabStandardConfig.scala:185)
I could not find any specific reference to this error. Any help is much appreciated.
Posted on 2019-05-02 17:58:01
Did you upgrade the Scala version from 2.10 to 2.11? If so, try the option below.
val itemKey = configDetails2.select("item_name").rdd
val itemMaster = itemKey
  .joinWithCassandraTable("dig_master", "item_master")
  .select("buyer", "cfg_name".as("cfg"), "item", "ms_name".as("scheduler"))
  .map(_._2)

Rewrite the code above as a Spark SQL operation on DataFrames instead of converting it to a Dataset.
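The suggestion to stay in the DataFrame API could look roughly like the sketch below. It reads item_master through the Spark Cassandra Connector's DataFrame source and joins it to configDetails2 directly, so no .rdd conversion or joinWithCassandraTable call is needed. Several things here are assumptions, not facts from the question: the object and method names (ItemMasterJoin, itemMaster) are made up for illustration, and the join condition item_name = item assumes that item is the key column of item_master that joinWithCassandraTable was matching on. The thread does not confirm that this avoids the DSEDirectJoinStrategy error.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object; the name is not from the original post.
object ItemMasterJoin {
  // Keyspace and table names as used in the question.
  val Keyspace = "dig_master"
  val Table = "item_master"

  /** Looks up item_master rows for the item names in configDetails2,
    * using only DataFrame operations. */
  def itemMaster(spark: SparkSession, configDetails2: DataFrame): DataFrame = {
    // Read the Cassandra table as a DataFrame via the connector's data source.
    val items = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> Keyspace, "table" -> Table))
      .load()

    // Same key projection as configDetails2.select("item_name") in the question.
    val itemKeys = configDetails2.select(col("item_name"))

    // ASSUMPTION: item_master is keyed by `item`. Adjust the condition
    // if the table's partition key is a different column.
    itemKeys
      .join(items, itemKeys("item_name") === items("item"), "inner")
      .select(
        items("buyer"),
        items("cfg_name").as("cfg"),
        items("item"),
        items("ms_name").as("scheduler"))
  }
}
```

It would be called as ItemMasterJoin.itemMaster(spark, configDetails2). The inner join keeps only the matching item_master columns, which corresponds to the .map(_._2) step in the RDD version.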
https://stackoverflow.com/questions/55918744