首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >使用pyspark中的格式将数组传递给SQL查询

使用pyspark中的格式将数组传递给SQL查询
EN

Stack Overflow用户
提问于 2021-03-09 04:41:51
回答 1查看 153关注 0票数 0

我想通过将concepts的值作为参数值传递给UDF has_any_concept来执行以下查询。

以下是环境中的内容

代码语言:javascript
复制
concepts
代码语言:javascript
复制
['CREATININE_QUANTITATIVE_24_HOUR_DIALYSIS_FLUID_OBSTYPE',
 'CREATININE_QUANTITATIVE_24_HOUR_URINE_OBSTYPE',
 'CREATININE_QUANTITATIVE_SERUM_OBSTYPE']

这是不传递参数的查询。

代码语言:javascript
复制
(spark.sql("""
select 
   
   resultCode.standard.primaryDisplay                                           as display
   
   from results 
   WHERE has_any_concept(resultCode, array("CREATININE_QUANTITATIVE_24_HOUR_DIALYSIS_FLUID_OBSTYPE","CREATININE_QUANTITATIVE_24_HOUR_URINE_OBSTYPE","CREATININE_QUANTITATIVE_SERUM_OBSTYPE"))
   
   LIMIT 3
""".format(concepts = concepts))\
   .toPandas()
)
代码语言:javascript
复制
display
0   Creatinine [Mass/volume] in Serum or Plasma
1   Creatinine [Mass/volume] in Serum or Plasma
2   Creatinine [Mass/volume] in Serum or Plasma

这也是可行的

代码语言:javascript
复制
(spark.sql("""
select 
   
   resultCode.standard.primaryDisplay                                           as display,
   ontologicalCategoryAliases                                                   as category
   
   from results 
   WHERE has_any_concept(resultCode, array("{concepts[0]}","{concepts[1]}","{concepts[2]}"))
   
   LIMIT 3
""".format(concepts = concepts))\
   .toPandas()
)
代码语言:javascript
复制
display     category
0   Creatinine [Mass/volume] in Serum or Plasma     [LABS_OBSTYPE]
1   Creatinine [Mass/volume] in Serum or Plasma     [LABS_OBSTYPE]
2   Creatinine [Mass/volume] in Serum or Plasma     [LABS_OBSTYPE]

这不起作用

代码语言:javascript
复制
(spark.sql("""
select 
   
   resultCode.standard.primaryDisplay                                           as display,
   ontologicalCategoryAliases                                                   as category
   
   from results 
   WHERE has_any_concept(resultCode, array({concepts}))
   
   LIMIT 3
""".format(concepts = [''' "{concept}"   '''.format(concept = concept) for concept in concepts]))\
   .toPandas()
)
代码语言:javascript
复制
ParseException: '\nmismatched input \'from\' expecting <EOF>(line 7, pos 3)\n\n== SQL ==\n\nselect \n   \n   resultCode.standard.primaryDisplay                                           as display,\n   ontologicalCategoryAliases                                                   as category\n   \n   from results \n---^^^\n   WHERE has_any_concept(resultCode, array([\' "CREATININE_QUANTITATIVE_24_HOUR_DIALYSIS_FLUID_OBSTYPE"   \', \' "CREATININE_QUANTITATIVE_24_HOUR_URINE_OBSTYPE"   \', \' "CREATININE_QUANTITATIVE_SERUM_OBSTYPE"   \']))\n   AND normalizedValue.typedValue.type = "NUMERIC" \n   AND interpretation.standard.primaryDisplay NOT IN (\'Not applicable\', \'Normal\')\n   \n   LIMIT 10\n'

我没有写UDF has_any_concepts

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2021-03-09 13:06:53

如果您使用的是python 3.6+,那么如果您使用f-strings,代码看起来会更整洁一些。

在SQL语法中,不能直接将列表传递给数组函数。

代码语言:javascript
复制
spark.sql(
    f"""
    select 
       resultCode.standard.primaryDisplay as display,
       ontologicalCategoryAliases as category
    from results 
    WHERE has_any_concept(resultCode, array({", ".join([f"'{x}'" for x in concepts])}))
    LIMIT 3
    """
).toPandas()
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/66536908

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档