Uniting multiple JSON entries in Talend
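I am trying to use tExtractJSONFields. The JSON text contains multiple authors, and each author's FirstName and LastName sit under separate keys. I want to combine the two for each author and keep the authors separated from one another, so that I can see the result in tLogRow.
In the "columns" section of the tExtractJSONFields component, I ticked Array for these fields:
The output I get from tLogRow is [Han, Kamber], [Jiawei, Micheline]:
The output I want is Han, Jiawei; Kamber, Micheline.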
{
"d":{
"__type":"Response:http:\/\/research.microsoft.com",
"Author":null,
"Conference":null,
"Domain":null,
"Journal":null,
"Keyword":null,
"Organization":null,
"Publication":{
"__type":"PublicationResponse:http:\/\/research.microsoft.com",
"EndIdx":1,
"StartIdx":1,
"TotalItem":112686,
"Result":[
{
"__type":"Publication:http:\/\/research.microsoft.com",
"Abstract":null,
"Author":[
{
"__type":"Author:http:\/\/research.microsoft.com",
"Affiliation":null,
"CitationCount":0,
"DisplayPhotoURL":null,
"FirstName":"Jiawei",
"GIndex":0,
"HIndex":0,
"HomepageURL":null,
"ID":594572,
"LastName":"Han",
"MiddleName":"",
"NativeName":null,
"PublicationCount":0,
"ResearchInterestDomain":null
},
{
"__type":"Author:http:\/\/research.microsoft.com",
"Affiliation":null,
"CitationCount":0,
"DisplayPhotoURL":null,
"FirstName":"Micheline",
"GIndex":0,
"HIndex":0,
"HomepageURL":null,
"ID":2331044,
"LastName":"Kamber",
"MiddleName":"",
"NativeName":null,
"PublicationCount":0,
"ResearchInterestDomain":null
}
],
"CitationContext":null,
"CitationCount":5979,
"Conference":null,
"DOI":"",
"FullVersionURL":null,
"ID":694978,
"Journal":null,
"Keyword":[
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":9033,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":9972,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":22078,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":35009,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":36239,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":38375,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":40483,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":41259,
"Name":null,
"PublicationCount":0
},
{
"__type":"Keyword:http:\/\/research.microsoft.com",
"CitationCount":0,
"ID":73998,
"Name":null,
"PublicationCount":0
}
],
"ReferenceCount":160,
"Title":"Data Mining: Concepts and Techniques",
"Type":1,
"Year":2000
}
]
},
"ResultCode":0,
"Trend":null,
"Version":"1.1"
}
}
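I could use tJavaRow: split the inputs, since they are arrays, and then merge the matching indexes. That is quite complicated, and I should not need it. The information I am looking for is already in the source; I just need to reroute it.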
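For reference, the tJavaRow route described above could look roughly like the sketch below. It assumes the two Array columns arrive as strings shaped like the tLogRow output ([Han, Kamber] and [Jiawei, Micheline]), optionally with quoted elements. The class name AuthorNameMerger and the column names LastName, FirstName and Authors are hypothetical; in a real job the helper would typically live in a user routine and be called from tJavaRow as something like output_row.Authors = AuthorNameMerger.merge(input_row.LastName, input_row.FirstName);

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Talend: pairs last and first names by index.
final class AuthorNameMerger {

    // Parses a string such as "[Han, Kamber]" into its trimmed elements.
    // Assumption: elements never contain commas themselves.
    static List<String> parseArrayString(String raw) {
        List<String> items = new ArrayList<>();
        if (raw == null) {
            return items;
        }
        String body = raw.trim();
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        if (body.trim().isEmpty()) {
            return items;
        }
        for (String part : body.split(",")) {
            // Trim whitespace and drop one pair of surrounding double quotes, if any.
            items.add(part.trim().replaceAll("^\"|\"$", ""));
        }
        return items;
    }

    // Builds "Last, First; Last, First" from the two array strings.
    // If the arrays differ in length, only the common prefix is paired.
    static String merge(String lastNames, String firstNames) {
        List<String> last = parseArrayString(lastNames);
        List<String> first = parseArrayString(firstNames);
        int count = Math.min(last.size(), first.size());
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                out.append("; ");
            }
            out.append(last.get(i)).append(", ").append(first.get(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "Han, Jiawei; Kamber, Micheline"
        System.out.println(merge("[Han, Kamber]", "[Jiawei, Micheline]"));
    }
}
```

This only reshapes the strings after extraction. It does not avoid the extra component, which is what I would prefer to do.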