首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >生成包含问题ID(作为列)和答案(作为它们的值)的考试评估报告的代码

生成包含问题ID(作为列)和答案(作为它们的值)的考试评估报告的代码
EN

Stack Overflow用户
提问于 2020-06-15 18:11:22
回答 2查看 29关注 0票数 0

我需要编写代码来生成考试评估报告(给定来自多个学生的答案和他们的尝试),其中包括问题ID(作为列)和答案(作为它们的值)。我需要注意的是,学生(参与者)可以在单个评估和地理标签中回答一个或多个问题。

代码语言:javascript
复制
     /** Input data */
    val inputDf = Seq(
    (1, "Question1Text", "Yes", "abcde1", 0, List("x1", "y1")),
    (2, "Question2Text", "No", "abcde1", 0, List("x1", "y1")),
    (3, "Question3Text", "3", "abcde1", 0, List("x1", "y1")),
    (1, "Question1Text", "No", "abcde2", 0, List("x2", "y2")),
    (2, "Question2Text", "Yes", "abcde2", 0, List("x2", "y2"))
    ).toDF("Qid", "Question", "AnswerText", "ParticipantID", "Assessment", "GeoTag")
    println("Input:")
    inputDf.show(false)

我的解决方案是:

代码语言:javascript
复制
     inputDf
    .groupBy($"ParticipantID")
    .pivot("Question")
    .agg(first($"ParticipantID"))
    .sort($"ParticipantID")

但我得到的是:

代码语言:javascript
复制
     Input:
     +---+-------------+----------+-------------+----------+--------+
     |Qid|Question     |AnswerText|ParticipantID|Assessment|GeoTag  |
     +---+-------------+----------+-------------+----------+--------+
     |1  |Question1Text|Yes       |abcde1       |0         |[x1, y1]|
     |2  |Question2Text|No        |abcde1       |0         |[x1, y1]|
     |3  |Question3Text|3         |abcde1       |0         |[x1, y1]|
     |1  |Question1Text|No        |abcde2       |0         |[x2, y2]|
     |2  |Question2Text|Yes       |abcde2       |0         |[x2, y2]|
     +---+-------------+----------+-------------+----------+--------+

     Expected:
     +-------------+----------+--------+-----+-----+-----+
     |ParticipantID|Assessment|GeoTag  |Qid_1|Qid_2|Qid_3|
     +-------------+----------+--------+-----+-----+-----+
     |abcde1       |0         |[x1, y1]|Yes  |No   |3    |
     |abcde2       |0         |[x2, y2]|No   |Yes  |null |
     +-------------+----------+--------+-----+-----+-----+

     Actual:
     +-------------+-------------+-------------+-------------+
     |ParticipantID|Question1Text|Question2Text|Question3Text|
     +-------------+-------------+-------------+-------------+
     |abcde1       |abcde1       |abcde1       |abcde1       |
     |abcde2       |abcde2       |abcde2       |null         |
     +-------------+-------------+-------------+-------------+
EN

回答 2

Stack Overflow用户

发布于 2020-06-15 18:45:09

你应该使用

代码语言:javascript
复制
inputDf
  .groupBy($"ParticipantID",  $"Assessment", $"GeoTag")
  .pivot("Question")
  .agg(first($"AnswerText"))
  .sort($"ParticipantID")

  .show(false)

您可以在以后根据需要重命名该列。

输出:

代码语言:javascript
复制
 +-------------+----------+--------+-------------+-------------+-------------+
|ParticipantID|Assessment|GeoTag  |Question1Text|Question2Text|Question3Text|
+-------------+----------+--------+-------------+-------------+-------------+
|abcde1       |0         |[x1, y1]|Yes          |No           |3            |
|abcde2       |0         |[x2, y2]|No           |Yes          |null         |
+-------------+----------+--------+-------------+-------------+-------------+
票数 2
EN

Stack Overflow用户

发布于 2020-06-15 18:47:54

也许这是有用的-

代码语言:javascript
复制
 /** Input data */
    val inputDf = Seq(
      (1, "Question1Text", "Yes", "abcde1", 0, List("x1", "y1")),
      (2, "Question2Text", "No", "abcde1", 0, List("x1", "y1")),
      (3, "Question3Text", "3", "abcde1", 0, List("x1", "y1")),
      (1, "Question1Text", "No", "abcde2", 0, List("x2", "y2")),
      (2, "Question2Text", "Yes", "abcde2", 0, List("x2", "y2"))
    ).toDF("Qid", "Question", "AnswerText", "ParticipantID", "Assessment", "GeoTag")
    println("Input:")
    inputDf.show(false)
    inputDf.printSchema()

 /**
      * Input:
      * +---+-------------+----------+-------------+----------+--------+
      * |Qid|Question     |AnswerText|ParticipantID|Assessment|GeoTag  |
      * +---+-------------+----------+-------------+----------+--------+
      * |1  |Question1Text|Yes       |abcde1       |0         |[x1, y1]|
      * |2  |Question2Text|No        |abcde1       |0         |[x1, y1]|
      * |3  |Question3Text|3         |abcde1       |0         |[x1, y1]|
      * |1  |Question1Text|No        |abcde2       |0         |[x2, y2]|
      * |2  |Question2Text|Yes       |abcde2       |0         |[x2, y2]|
      * +---+-------------+----------+-------------+----------+--------+
      *
      * root
      * |-- Qid: integer (nullable = false)
      * |-- Question: string (nullable = true)
      * |-- AnswerText: string (nullable = true)
      * |-- ParticipantID: string (nullable = true)
      * |-- Assessment: integer (nullable = false)
      * |-- GeoTag: array (nullable = true)
      * |    |-- element: string (containsNull = true)
      */

    inputDf
      .groupBy($"ParticipantID", $"Assessment", $"GeoTag")
      .pivot("Qid")
      .agg(
        first($"AnswerText").as("Q")
      )
      .orderBy($"ParticipantID")
      .show(false)

    /**
      * +-------------+----------+--------+---+---+----+
      * |ParticipantID|Assessment|GeoTag  |1  |2  |3   |
      * +-------------+----------+--------+---+---+----+
      * |abcde1       |0         |[x1, y1]|Yes|No |3   |
      * |abcde2       |0         |[x2, y2]|No |Yes|null|
      * +-------------+----------+--------+---+---+----+
      */
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/62385976

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档