I am trying to run some code from my Databricks notebook in an IDE using Databricks Connect, and I can't figure out how to create a simple DataFrame.
Using:
import spark.implicits._
var Table_Count = Seq((cdpos_df.count(),I_count,D_count,U_count)).toDF("Table_Count","I_Count","D_Count","U_Count")
gives the error message: value toDF is not a member of Seq[(Long, Long, Long, Long)].
Trying to create the DataFrame from scratch:
var dataRow = Seq((cdpos_df.count(),I_count,D_count,U_count))
var schemaRow = List(
StructField("Table_Count", LongType, true),
StructField("I_Count", LongType, true),
StructField("D_Count", LongType, true),
StructField("U_Count", LongType, true)
)
var TableCount = spark.createDataFrame(
sc.parallelize(dataRow),
StructType(schemaRow)
)
gives the error message:
overloaded method value createDataFrame with alternatives:
(data: java.util.List[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.api.java.JavaRDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.rdd.RDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rows: java.util.List[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame
cannot be applied to (org.apache.spark.rdd.RDD[(Long, Long, Long, Long)], org.apache.spark.sql.types.StructType)
Posted on 2021-09-14 17:51:46
Combining the two approaches as follows:
var TableCount = spark.createDataFrame(
sc.parallelize(dataRow)
// StructType(schemaRow)
).toDF("Table_Count","I_Count","D_Count","U_Count")
gets rid of the error, but I still need to build this out in a few places.
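For what it's worth, the second error appears to come from the fact that every schema-taking overload of createDataFrame listed in the message expects Row objects, while sc.parallelize(dataRow) yields an RDD of tuples. Below is a minimal sketch of both routes. It assumes a local SparkSession and made-up counts standing in for cdpos_df.count(), I_count, D_count and U_count; the object and method names (TableCountExample, fromRows, fromTuples) are hypothetical and only for illustration. With Databricks Connect the .master(...) line would normally be dropped.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Hypothetical helper object, not part of the original code.
object TableCountExample {

  // Same four nullable Long columns as schemaRow in the question.
  val schema: StructType = StructType(List(
    StructField("Table_Count", LongType, nullable = true),
    StructField("I_Count", LongType, nullable = true),
    StructField("D_Count", LongType, nullable = true),
    StructField("U_Count", LongType, nullable = true)
  ))

  // Route 1: wrap the values in a Row so that the
  // createDataFrame(rowRDD: RDD[Row], schema: StructType) overload applies.
  def fromRows(spark: SparkSession, t: Long, i: Long, d: Long, u: Long): DataFrame = {
    val rows = Seq(Row(t, i, d, u))
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
  }

  // Route 2: toDF on a Seq of tuples. This needs the implicits of a
  // stable (val) SparkSession to be imported in the same scope.
  def fromTuples(spark: SparkSession, t: Long, i: Long, d: Long, u: Long): DataFrame = {
    val session = spark
    import session.implicits._
    Seq((t, i, d, u)).toDF("Table_Count", "I_Count", "D_Count", "U_Count")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, used here only so the sketch is self-contained.
    val spark = SparkSession.builder()
      .appName("table-count-example")
      .master("local[*]")
      .getOrCreate()

    // Placeholder values instead of the real counts.
    fromRows(spark, 10L, 4L, 3L, 3L).show()
    fromTuples(spark, 10L, 4L, 3L, 3L).show()

    spark.stop()
  }
}
```

The Row-based route keeps the explicit schema (including nullability), whereas toDF infers the column types from the tuple and only takes the column names.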
https://stackoverflow.com/questions/69181907