我在AWS GLUE上使用Deequ,令人惊讶的是,当我要运行检查verificationSuite下面列出的hasMaxLength时。我得到了下面的错误,有人能帮助我吗?所有其他检查都通过/运行。它说check hasMaxLength不是amazon.deequ.checks的成员
下载:s3://stg-dev-ire- KLLParameters /jars/deequ./jar0.jar SCRIPT_URL = /tmp/g-6aa13d15270ba0853894d7d6f2d26459f810d2ab-4863242765668262538/script_2021-02-04-08-26-12.scala编译结果: /tmp/g-6aa13d15270ba0853894d7d6f2d26459f810d2ab-4863242765668262538/script_2021-02-04-08-26-12.scala:15:错误: object KLLParameters不是package com.amazon.deequ.analyzers导入com.amazon.deequ.analyzers的成员。{分析器,柱状图,模式,状态,/tmp/g-6aa13d15270ba0853894d7d6f2d26459f810d2ab-4863242765668262538/script_2021-02-04-08-26-12.scala:56:}^KLLParameters错误: value hasMaxLength不是com.amazon.deequ.checks.Check的成员可能的原因:可能在‘value hasMaxLength'?.hasMaxLength("* External Number",_==40) ^两个错误发现编译失败之前缺少分号。
代码如下:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import com.amazon.deequ.analyzers.runners.{AnalysisRunner, AnalyzerContext}
import com.amazon.deequ.analyzers.runners.AnalyzerContext.successMetricsAsDataFrame
import com.amazon.deequ.{VerificationSuite, VerificationResult}
import com.amazon.deequ.VerificationResult.checkResultsAsDataFrame
import com.amazon.deequ.checks.{Check, CheckLevel}
import com.amazon.deequ.constraints.{ConstrainableDataTypes}
import org.apache.spark.sql.functions.{length, max}
object Deequ {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("dq")
val spark = SparkSession.builder().appName("dq").getOrCreate()
val dataset = spark.read.option("header",true).option("delimiter",",").csv("s3://ct-ire-
fin-stg-data-dev-raw-gib/templates /Contracts_And_Coverages/FPSL-
CONTRACTS=VALIDATIONS-v2 - Sheet1.csv")
val verificationResult: VerificationResult = { VerificationSuite()
// data to run the verification on
.onData(dataset)
// define a data quality check
.addCheck(
Check(CheckLevel.Error, "Template Validations")
.hasDataType("* External Number", ConstrainableDataTypes.String)
.hasMaxLength("* External Number", _==40)
.isComplete("* External Number")
.hasDataType("* Business Record Date",ConstrainableDataTypes.Integral )
.hasMax("* Business Record Date", _ == 8)
.isComplete("* Business Record Date")
.hasDataType("Organizational Unit (Owner)",ConstrainableDataTypes.String )
.hasMax("Organizational Unit (Owner)", _ == 10)
.isComplete("Organizational Unit (Owner)")
//failing
.isContainedIn("Organizational Unit (Owner)", Array("50000252","50000256","50000257"))
.hasDataType("Object Status",ConstrainableDataTypes.Integral )
.hasMax("Object Status", _ == 3)
.isComplete("Object Status")
.isContainedIn("Object Status", Array("0","1"))
.hasDataType("Delivery Package",ConstrainableDataTypes.String )
.hasMax("Delivery Package", _ == 20)
.isComplete("Delivery Package")
.hasDataType("Product Code",ConstrainableDataTypes.Integral)
.hasMax("Product Code", _ == 10)
.isComplete("Product Code")
.hasDataType("Source System Basic Data",ConstrainableDataTypes.String )
.hasMax("Source System Basic Data", _ == 10)
.isComplete("Source System Basic Data")
//LFST, CLPB, CLCB, CLHR, CCLU
//.isContainedIn("Source System Basic Data", Array("LSFT", "CLPB","CLCB","CLHR","CCLU"))
.isContainedIn("Source System Basic Data", Array("LSFT"))
.hasDataType("Production Control",ConstrainableDataTypes.String )
.hasMax("Production Control", _ == 20)
.isComplete("Production Control")
.isContainedIn("Production Control", Array("Z_UL_PR1"))
.hasDataType("Date of Start of Term",ConstrainableDataTypes.Integral )
.hasMax("Date of Start of Term", _ == 8)
.isComplete("Date of Start of Term")
.hasDataType("Date of End of Term",ConstrainableDataTypes.Integral )
.hasMax("Date of End of Term", _ == 8)
.isComplete("Date of End of Term")
.hasDataType("Legal Entity",ConstrainableDataTypes.Integral )
.hasMax("Legal Entity", _ == 10)
.isComplete("Legal Entity")
//2001
.isContainedIn("Legal Entity", Array("2001"))
.hasDataType("IFRS17 Category",ConstrainableDataTypes.Integral )
.hasMax("IFRS17 Category", _ == 80)
.isComplete("IFRS17 Category")
//To Change
.isContainedIn("IFRS17 Category", Array("1","2","3","4"))
.hasDataType("IFRS17 Portfolio",ConstrainableDataTypes.String )
.hasMax("IFRS17 Portfolio", _ == 40)
.isComplete("IFRS17 Portfolio")
)
// compute metrics and verify check conditions
.run()
}
//val metrics1 = successMetricsAsDataFrame(spark, analysisResult1)
val resultDataFrame = checkResultsAsDataFrame(spark, verificationResult)
resultDataFrame.write.mode("overwrite").parquet("s3://ct-ire-fin-stg-data-dev-raw-
gib/template_validations/template_validations_lifestyle/")
}
}发布于 2021-02-04 23:25:07
你的代码看起来不错,我看不出有什么明显的错误原因。我建议您验证以下内容:
匹配的deequ版本
您可能还想检查另一个导入错误来自哪里(error: object KLLParameters is not a member of package com.amazon.deequ.analyzers import com.amazon.deequ.analyzers.{Analyzer, Histogram, Patterns, State, KLLParameters})。解决此错误还可以帮助您解决hasMaxLength问题。
https://stackoverflow.com/questions/66042086
复制相似问题