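I am trying to validate and process a CSV file with Apache NiFi. I use ExecuteGroovyScript to process the CSV and extract the data.
My original CSV has a problem: some records look like this: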
id,name,age,bd,email,address
1,sachith,29,9,sachith@email.com,{"No": "1","Lane":"Lane-1"}
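2,nalaka,29,17,nalaka@email.com,{"No": "1","Lane":
"Lane-1"}
Here the second record is invalid, because its JSON value is broken across two lines. I just want to drop this record and process the rest.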
import groovy.json.*
def ff=session.get()
if(!ff)return
def parser = new JsonSlurper().setType(JsonParserType.LAX)
ff.write{streamIn,streamOut->
    streamIn.withReader('UTF-8'){r-> //convert in stream to reader
        streamOut.withWriter('UTF-8'){w-> //convert out stream to writer
            //go line by line
            r.eachLine{line, lineNum->
                if(lineNum==1){
                    w<<line<<'id,name,age,bd,email,address'<<'\n' //for the first line just add some headers
                }else{
                    def row=line.split(',') //split line by comma
                    def json=row[5..-1].join(',') //join back to string starting from the 6th element (index 5)
                    json = parser.parseText(json)
                    w<<"${json.id},${json.name},${json.age},${json.bd},${json.email},${json.address}"<<'\n'
                }
            }
        }
    }
}
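REL_SUCCESS<<ff
This is taken from my previous question.
Basically, I just want to ignore the invalid record and continue processing with the other values.
I have already looked at: groovy.json.JsonException: expecting
But I don't understand how to integrate it into an Apache NiFi flow.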
Posted on 2020-02-12 22:06:08
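I agree that the better approach is to fix the underlying problem.
However, if that is not possible, you could try to check whether each line is complete: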
import groovy.json.*
def parser = new JsonSlurper().setType(JsonParserType.LAX) //LAX to accept strings without double-quotes
def w = System.out
def buf = new StringBuilder() //buffer to collect lines if they are not complete
new StringReader('''id,name,age,bd,email,address
1,sachith,29,9,sachith@email.com,{"No": "1","Lane":"Lane-1"}
2,nalaka,29,17,nalaka@email.com,{"No": "1"
,"Lane":"Lane-1"}''').withReader{r->
    r.eachLine{line, lineNum->
        if(lineNum==1){
            w<<line<<'id,name,age,bd,email,address'<<'\n'
        }else{
            buf<<(buf?'\n':'')<<line //append line to previous incomplete line(s)
            if(buf=~/(?s)^\d.*\}$/){
                //normal line: starts with number and ends with }
                def row=buf.toString().split(',') //split line by comma
                def json=row[5..-1].join(',') //join back to string starting from the 6th element (index 5)
                json = parser.parseText(json)
                w<<"${json.No},${json.Lane}"<<'\n'
                buf.setLength(0) //reset buffer
            }
        }
    }
}
https://stackoverflow.com/questions/60187601
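The answer's code rejoins a record that was broken across lines. The question asks instead to drop such records, so the following is a minimal sketch of that variation, not part of the original answer. The helper name `copyValidRecords` is hypothetical, and it assumes that a complete record sits on one line, starts with a numeric id, has four more comma-free fields, and ends with a `{...}` JSON object.

```groovy
import groovy.json.JsonException
import groovy.json.JsonSlurper

// Hypothetical helper, not from the original post.
// Copies the header and every complete record to the writer and
// returns the number of input lines that were dropped.
def copyValidRecords(Reader reader, Writer writer) {
    def parser = new JsonSlurper() // default (strict) parser
    // Assumption: numeric id, four comma-free fields, then a {...} JSON object
    def complete = ~/^(\d+(?:,[^,]*){4}),(\{.*\})$/
    def skippedLines = 0
    reader.eachLine { line, lineNum ->
        def m = complete.matcher(line)
        if (lineNum == 1) {
            writer << line << '\n' // copy the header unchanged
        } else if (m.matches()) {
            try {
                parser.parseText(m.group(2)) // validate the JSON column
                writer << line << '\n'
            } catch (JsonException e) {
                skippedLines++ // shaped like a record, but the JSON is malformed
            }
        } else {
            skippedLines++ // truncated record or leftover continuation line
        }
    }
    return skippedLines
}

// Sample data from the question
def input = '''id,name,age,bd,email,address
1,sachith,29,9,sachith@email.com,{"No": "1","Lane":"Lane-1"}
2,nalaka,29,17,nalaka@email.com,{"No": "1","Lane":
"Lane-1"}
'''
def out = new StringWriter()
def skippedLines = copyValidRecords(new StringReader(input), out)
print out
```

Inside ExecuteGroovyScript, the same helper could presumably be called from the question's `ff.write{streamIn,streamOut-> ...}` block in place of the `r.eachLine{...}` loop, passing `r` and `w`. That wiring is untested here.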