Talend HiveDB connection needs Cloudera SerDe

I am trying to connect Talend Open Studio to Hive.

In Hive I have a table with custom fields (from cloudera-twitter-example).
Open Studio finds the table without any problem, but when I try to retrieve the schema I get an error.

In the Talend log I get:

!ENTRY org.talend.platform.logging 4 0 2015-09-21 10:49:12.375
!MESSAGE 2015-09-21 10:49:12,372 ERROR org.talend.commons.exception.CommonExceptionHandler  - java.lang.reflect.InvocationTargetException

!STACK 0
java.sql.SQLException: java.lang.reflect.InvocationTargetException
    at org.talend.metadata.managment.hive.EmbeddedHiveDataBaseMetadata.getColumns(EmbeddedHiveDataBaseMetadata.java:401)
    at org.talend.core.model.metadata.builder.database.manager.ExtractManager.getColumnsResultSet(ExtractManager.java:844)
    at org.talend.core.model.metadata.builder.database.manager.ExtractManager.extractColumns(ExtractManager.java:641)
    at org.talend.core.model.metadata.builder.database.manager.ExtractManager.returnMetadataColumnsFormTable(ExtractManager.java:521)
    at org.talend.core.model.metadata.builder.database.ExtractMetaDataFromDataBase.returnMetadataColumnsFormTable(ExtractMetaDataFromDataBase.java:224)
    at org.talend.repository.ui.wizards.metadata.table.database.DatabaseTableForm.pressRetreiveSchemaButton(DatabaseTableForm.java:1150)
    at org.talend.repository.ui.wizards.metadata.table.database.DatabaseTableForm.access(DatabaseTableForm.java:1121)
    at org.talend.repository.ui.wizards.metadata.table.database.DatabaseTableForm.widgetSelected(DatabaseTableForm.java:795)
    at org.eclipse.swt.widgets.TypedListener.handleEvent(TypedListener.java:248)
    at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:84)
    at org.eclipse.swt.widgets.Display.sendEvent(Display.java:4454)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1388)
    at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:3799)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3409)
    at org.eclipse.jface.window.Window.runEventLoop(Window.java:832)
    at org.eclipse.jface.window.Window.open(Window.java:808)
    at org.talend.repository.metadata.ui.actions.metadata.AbstractCreateTableAction.handleWizard(AbstractCreateTableAction.java:139)
    at org.talend.repository.metadata.ui.actions.metadata.AbstractCreateTableAction.run(AbstractCreateTableAction.java:1088)
    at org.talend.repository.RepositoryWorkUnit.executeRun(RepositoryWorkUnit.java:93)
    at org.talend.core.repository.model.AbstractRepositoryFactory.executeRepositoryWorkUnit(AbstractRepositoryFactory.java:256)
    at org.talend.repository.localprovider.model.LocalRepositoryFactory.executeRepositoryWorkUnit(LocalRepositoryFactory.java:3227)
    at org.talend.core.repository.model.ProxyRepositoryFactory.executeRepositoryWorkUnit(ProxyRepositoryFactory.java:1996)
    at org.talend.repository.metadata.ui.actions.metadata.AbstractCreateTableAction.runInUIThread(AbstractCreateTableAction.java:1110)
    at org.eclipse.ui.progress.UIJob.run(UIJob.java:97)
    at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
    at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:136)
    at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:3774)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3412)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1151)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1032)
    at org.eclipse.e4.ui.internal.workbench.E4Workbench.createAndRunUI(E4Workbench.java:148)
    at org.eclipse.ui.internal.Workbench.run(Workbench.java:636)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
    at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:579)
    at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:150)
    at org.talend.rcp.intro.Application.start(Application.java:183)
    at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:380)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:235)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:648)
    at org.eclipse.equinox.launcher.Main.basicRun(Main.java:603)
    at org.eclipse.equinox.launcher.Main.run(Main.java:1465)
    at org.eclipse.equinox.launcher.Main.main(Main.java:1438)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.talend.metadata.managment.hive.EmbeddedHiveDataBaseMetadata.getColumns(EmbeddedHiveDataBaseMetadata.java:370)
    ... 49 more
Caused by: java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:256)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:595)
    at org.apache.hadoop.hive.ql.metadata.Table.getAllCols(Table.java:612)
    ... 54 more
Caused by: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:385)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:274)
    ... 57 more

The most important part is:

Caused by: java.lang.RuntimeException: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:256)
at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:595)
at org.apache.hadoop.hive.ql.metadata.Table.getAllCols(Table.java:612)
... 54 more
Caused by: MetaException(message:java.lang.ClassNotFoundException Class com.cloudera.hive.serde.JSONSerDe not found)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:385)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:274)
... 57 more

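This message indicates that Talend Open Studio for Big Data cannot find the SerDe.
But I have put hive-serde on all Hadoop cluster nodes, and Informatica PowerCenter can get the schema information from Hive.
Also, when I run queries against Hive with the Hive CLI on the nodes or in Hue, everything works without any problems or extra configuration such as

ADD JAR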
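
The ClassNotFoundException in the trace is raised inside the client process (the frames under org.apache.hadoop.hive.ql.metadata.Table run in Talend's own JVM), so what matters there is the client's classpath rather than the cluster's. As a rough diagnostic sketch, the check below reports whether a given class is loadable in the JVM it runs in. The class name `SerDeClasspathCheck` and the helper `isClassAvailable` are made up for illustration, and the result is only meaningful if the program is started with the same classpath as the client being debugged.

```java
// Diagnostic sketch: checks whether a class can be loaded by the current JVM.
final class SerDeClasspathCheck {

    /** Returns true if the named class is visible to this class's class loader. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, SerDeClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above
        String serde = "com.cloudera.hive.serde.JSONSerDe";
        System.out.println(serde + " available: " + isClassAvailable(serde));
    }
}
```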

How can I make Talend Open Studio work with my Hive tables?

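I solved the problem. By default, Talend Open Studio tries to use the Hive Metastore directly,
the so-called embedded connection (the Metastore port is 9093).
But in the connection settings I saw port 10000, which points to HiveServer2 (Thrift).
After I switched to a standalone connection, it started working.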
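
To illustrate what a standalone connection amounts to outside Talend, here is a minimal JDBC sketch that asks HiveServer2 on port 10000 for a table's columns. In this mode the server presumably resolves the SerDe, which would explain why the jar installed on the cluster is enough. The host name, user, database, and table name are hypothetical placeholders, the class `HiveServer2Schema` is made up for this example, and running `main` assumes the Hive JDBC driver is on the classpath and a HiveServer2 instance is reachable.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

final class HiveServer2Schema {

    /** Builds a HiveServer2 JDBC URL such as jdbc:hive2://host:10000/default. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    /** Prints the column names and type names that the server reports for a table. */
    static void printColumns(Connection conn, String schema, String table) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        // "%" matches every column of the table
        try (ResultSet cols = meta.getColumns(null, schema, table, "%")) {
            while (cols.next()) {
                System.out.println(cols.getString("COLUMN_NAME") + " " + cols.getString("TYPE_NAME"));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical host, user, database, and table; adjust to the actual cluster.
        String url = jdbcUrl("hive-host.example.com", 10000, "default");
        try (Connection conn = DriverManager.getConnection(url, "hive", "")) {
            printColumns(conn, "default", "tweets");
        }
    }
}
```

If this lists the columns while the embedded mode still fails, that would point to the client-side classpath rather than the table definition.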