首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Spark截断Spark平面

Spark截断Spark平面
EN

Stack Overflow用户
提问于 2019-05-02 20:50:43
回答 2查看 1.6K关注 0票数 8

我面临以下问题:在打印已执行的计划时,我无法查看所有推送的过滤器。

执行的代码是

代码语言:javascript
复制
println(df.queryExecution.executedPlan.treeString(true))

所有计划都会打印出来,并且在Pushed字段中如下所示

代码语言:javascript
复制
 PushedFilters: [IsNotNull(X1), IsNotNull(X2), IsNotNull(X2), IsNotNull(X3..., ReadSchema: 

正如您可能注意到的,它并没有完全打印出来。此外,为了解决这个问题,我修改了spark-default.conf文件中的以下属性

代码语言:javascript
复制
spark.debug.maxToStringFields    120000

不幸的是,前一种方法并没有解决问题。

对如何克服这个问题有什么建议吗?

EN

回答 2

Stack Overflow用户

发布于 2020-10-15 11:28:47

从Spark 3.0.1开始,它目前被硬编码为最多100个字符的[12],但它是最近推出的fixed,新引入的配置密钥spark.sql.maxMetadataStringLength默认为100。

票数 5
EN

Stack Overflow用户

发布于 2019-05-03 00:27:53

你可以做df.explain(true),它将输出整个计划:

代码语言:javascript
复制
== Parsed Logical Plan ==
'SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- 'MapElements <function1>, interface org.apache.spark.sql.Row, [StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)], obj#66: org.apache.spark.sql.Row
   +- 'DeserializeToObject unresolveddeserializer(createexternalrow(getcolumnbyordinal(0, IntegerType), getcolumnbyordinal(1, IntegerType), getcolumnbyordinal(2, IntegerType), StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false))), obj#65: org.apache.spark.sql.Row
      +- Filter isnull(y#9)
         +- Filter (x#8 = 0)
            +- Project [_1#4 AS x#8, _2#5 AS y#9, _3#6 AS z#10]
               +- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._1 AS _1#4, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._2 AS _2#5, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._3 AS _3#6]
                  +- ExternalRDD [obj#3]

== Analyzed Logical Plan ==
x: int, y: int, z: int
SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- MapElements <function1>, interface org.apache.spark.sql.Row, [StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)], obj#66: org.apache.spark.sql.Row
   +- DeserializeToObject createexternalrow(x#8, y#9, z#10, StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)), obj#65: org.apache.spark.sql.Row
      +- Filter isnull(y#9)
         +- Filter (x#8 = 0)
            +- Project [_1#4 AS x#8, _2#5 AS y#9, _3#6 AS z#10]
               +- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._1 AS _1#4, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._2 AS _2#5, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._3 AS _3#6]
                  +- ExternalRDD [obj#3]

== Optimized Logical Plan ==
SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- MapElements <function1>, interface org.apache.spark.sql.Row, [StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)], obj#66: org.apache.spark.sql.Row
   +- DeserializeToObject createexternalrow(x#8, y#9, z#10, StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)), obj#65: org.apache.spark.sql.Row
      +- LocalRelation <empty>, [x#8, y#9, z#10]

== Physical Plan ==
*(1) SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- *(1) MapElements <function1>, obj#66: org.apache.spark.sql.Row
   +- *(1) DeserializeToObject createexternalrow(x#8, y#9, z#10, StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)), obj#65: org.apache.spark.sql.Row
      +- LocalTableScan <empty>, [x#8, y#9, z#10]
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/55952889

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档