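"Array cannot be null" with the Avro extractor
Using EventHub with capture to Blob, I have a function based on the AvroSamples that tries to convert the captured files.
This is my U-SQL script: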
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [log4net];
REFERENCE ASSEMBLY [Avro];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
DECLARE @ABI_DATE string = "2017/10/17/"; //replace by ADF pipeline
DECLARE @input_file string = "wasb://archive@sa/namespace/eh/{*}/" + @ABI_DATE +"{*}/{*}/{*}";
DECLARE @output_file string = @"/output/" + @ABI_DATE + "extract.csv";
@rs =
EXTRACT
SequenceNumber long
,EnqueuedTimeUtc string
,Body byte[]
FROM @input_file
USING new Microsoft.Analytics.Samples.Formats.ApacheAvro.AvroExtractor(@"
{
""type"":""record"",
""name"":""EventData"",
""namespace"":""Microsoft.ServiceBus.Messaging"",
""fields"":[
{""name"":""SequenceNumber"",""type"":""long""},
{""name"":""Offset"",""type"":""string""},
{""name"":""EnqueuedTimeUtc"",""type"":""string""},
{""name"":""SystemProperties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},
{""name"":""Properties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},
{""name"":""Body"",""type"":[""null"",""bytes""]}
]
}
");
@cnt =
SELECT
SequenceNumber
,Encoding.UTF8.GetString(Body) AS Json //THIS LINE BREAKS !!!!
,EnqueuedTimeUtc
FROM @rs;
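OUTPUT @cnt TO @output_file USING Outputters.Text();

If I run the same extractor with the Body field commented out, it works as expected.
This is the error:
Inner exception from user expression: Array cannot be null.
Parameter name: bytes
Current row dump: SequenceNumber: 4622 EnqueuedTimeUtc: NULL Body: NULL
Error occurred while evaluating expression Encoding.UTF8.GetString(Body)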
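Answered on 2017-10-19 14:59:56
Florian Mander gave me an explanation:
"The extractor works fine. You are just passing a null value (which is intended, since the schema allows it) to a method (Encoding.GetString) that does not accept null as input. However, with your latest solution you will lose all records that have no body. Whether that is acceptable is a non-technical decision."
So this is how I fixed it (using a WHERE clause):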
@cnt =
SELECT
SequenceNumber
,Encoding.UTF8.GetString(Body) AS Json
,EnqueuedTimeUtc
FROM @rs
WHERE Body != null;

Source: https://stackoverflow.com/questions/46832887
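As noted in the explanation, the WHERE clause drops every record that has no body. If those records need to be kept, one possible alternative is to guard the call with a C# conditional expression instead of filtering. This is only a sketch and has not been run against the captured data; it assumes the same @rs rowset as in the question and that an empty Json column is acceptable for such rows.

```usql
// Sketch: keep rows whose Body is null instead of filtering them out.
// Assumes @rs is the rowset produced by the AvroExtractor in the question.
@cnt =
    SELECT SequenceNumber,
           // Only decode when Body is present; otherwise emit a null string.
           (Body == null ? (string) null : Encoding.UTF8.GetString(Body)) AS Json,
           EnqueuedTimeUtc
    FROM @rs;
```

Whether to keep or drop the records without a body remains the non-technical decision mentioned above; this variant only avoids passing null to Encoding.UTF8.GetString.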