How to merge two rows of data into a single row in a Groovy script / NiFi?
My data is unstructured: in some rows the end of the last column is stored on a second line, as shown below.
UID|Name|ID|Mail
1|Ester|991|sd
gmail
2|Siva|992|siva
hotmail
3|Hari|993|hi gmail
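Some rows in the data are already complete, but others are split across two lines. I want to merge each split pair into a single row, as shown below.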
UID|Name|ID|Mail
1|Ester|991|sd gmail
2|Siva|992|siva hotmail
3|Hari|993|hi gmail
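I don't know which NiFi processor would help with this transformation.
I have tried the following Groovy script to read the lines, but I couldn't find a way to merge the split lines into a single line.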
import org.apache.nifi.processor.io.StreamCallback

def flowfile = session.get()
if (!flowfile) return
flowfile = session.write(flowfile, { rawIn, rawOut ->
    // wrap the raw streams in a UTF-8 reader and writer
    rawIn.withReader("UTF-8") { reader ->
        rawOut.withWriter("UTF-8") { writer ->
            reader.eachLine { line, lineNum ->
                if (!line.isEmpty()) {
                    // currently copies each non-empty line unchanged; the merge logic should go here
                    writer << line << '\n'
                }
            }
        }
    }
} as StreamCallback)
session.transfer(flowfile, REL_SUCCESS)
Can anyone suggest an approach for converting my data into the required format?
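I assume that the first line, the one with the headers, cannot contain a line break, so it gives the number of delimiters per record.
The lines after it just check the delimiter count and decide whether to write a new line.
Note that this algorithm works only if the line breaks occur in the last column.
Code snippet: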
def reader = new StringReader('''UID|Name|ID|Mail
1|Ester|991|sd
gmail
2|Siva|992|siva
hotmail
3|Hari|993|hi gmail''')
def writer = new StringWriter()
def delimCount = 0
reader.eachWithIndex { line, id ->
    if (id == 0) {
        // count the delimiters in the header
        delimCount = line.count('|')
        // write the header as is
        writer << line
    } else {
        if (line.count('|') == delimCount) {
            writer << '\n' // full delimiter count: start a new line
        } else {
            writer << ' '  // otherwise write a space to continue the previous line
        }
        writer << line
    }
}
println writer.toString()
Result:
UID|Name|ID|Mail
1|Ester|991|sd gmail
2|Siva|992|siva hotmail
3|Hari|993|hi gmail
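
To connect this back to the NiFi script in the question, the same delimiter-count check could be wrapped in a small helper that works on any Reader and Writer. This is only a sketch: the helper name mergeSplitRows and its delim parameter are hypothetical, and skipping blank lines is an assumption carried over from the question's script rather than part of the snippet above.

```groovy
// Hypothetical helper (not from the original answer): joins continuation lines
// onto the preceding record, using the header's delimiter count as the reference.
def mergeSplitRows(Reader reader, Writer writer, String delim = '|') {
    int delimCount = -1                       // -1 means the header has not been read yet
    reader.eachLine { String line ->
        if (line.isEmpty()) return            // assumption: skip blank lines, like the question's script
        if (delimCount < 0) {
            delimCount = line.count(delim)    // the header defines the expected delimiter count
            writer << line
        } else {
            // full delimiter count => start of a new record; anything else continues the previous one
            writer << (line.count(delim) == delimCount ? '\n' : ' ') << line
        }
    }
}

// Stand-alone check with the sample data from the question
def input = '''UID|Name|ID|Mail
1|Ester|991|sd
gmail
2|Siva|992|siva
hotmail
3|Hari|993|hi gmail'''
def out = new StringWriter()
mergeSplitRows(new StringReader(input), out)
println out

// Inside ExecuteScript, the question's callback could presumably call it like this
// (untested sketch; session, REL_SUCCESS and StreamCallback are provided by NiFi):
// flowfile = session.write(flowfile, { rawIn, rawOut ->
//     rawIn.withReader('UTF-8') { r -> rawOut.withWriter('UTF-8') { w -> mergeSplitRows(r, w) } }
// } as StreamCallback)
```

The same limitation applies as before: a line break inside any column other than the last one would be merged into the wrong record.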