This question already has answers here:
How to escape column names with a hyphen in Spark SQL (3 answers)
PySpark SQL with column names containing dashes/hyphens (1 answer)
Selecting Spark dataframe columns with special characters in them using selectExpr (1 answer)
How to handle SQL requests with a dash in SparkContext (1 answer)
Closed 3 months ago.
I am trying to bin my data by counting based on multiple CASE statements, using PySpark SQL. I am using the following code:
df.createOrReplaceTempView("Counts")
df_count = spark.sql("""
SELECT
COUNT(CASE WHEN diff >= 1.0 AND diff < 2.0 THEN 1 END) as [1-2],
COUNT(CASE WHEN diff >= 2.0 AND diff < 3.0 THEN 1 END) as [2-3],
COUNT(CASE WHEN diff >= 3.0 AND diff < 4.0 THEN 1 END) as [3-4]
FROM
Counts""")
df_count.show()

Here "diff" is the column whose values I want to bin into 1-2, 2-3 and 3-4. My SQL works fine when used in SQL Server, but not in Spark, where I get the following message:

ParseException: "\nmismatched input '[' expecting (line 3, pos 84)\n\n== SQL ==\n\n

Any other ideas for how I can achieve what I am after, either by changing the SQL query above or by using something else in PySpark? I should add that I do not have access to Bucketizer on the cluster I am using.
Posted 2020-11-30 14:42:08
Try escaping the brackets with backticks:
df.createOrReplaceTempView("Counts")
df_count = spark.sql("""
SELECT
COUNT(CASE WHEN diff >= 1.0 AND diff < 2.0 THEN 1 END) as `[1-2]`,
COUNT(CASE WHEN diff >= 2.0 AND diff < 3.0 THEN 1 END) as `[2-3]`,
COUNT(CASE WHEN diff >= 3.0 AND diff < 4.0 THEN 1 END) as `[3-4]`
FROM Counts""")
df_count.show()

https://stackoverflow.com/questions/65075579