Hive GUDF custom unzip function test case failing with error "B cannot be cast to org.apache.hadoop.io.BytesWritable"
I am writing a test case for a custom Generic UDF unzip evaluate function that unzips a zip file. The jar is used in a Hive query.
Here is the code for the test case:
public void testEvaluate() throws HiveException, IOException {
    Unzip unzip = new Unzip();
    File resourcesDirectory = new File("src/test/resources/test.zip");
    byte[] bytes = Files.readAllBytes(resourcesDirectory.toPath());
    ObjectInspector binaryOI = PrimitiveObjectInspectorFactory.writableBinaryObjectInspector;
    ObjectInspector[] arguments = {binaryOI};
    unzip.initialize(arguments);
    GenericUDF.DeferredObject valueObj0 = new GenericUDF.DeferredJavaObject(bytes);
    GenericUDF.DeferredObject[] args = {valueObj0};
    unzip.evaluate(args);
}
I am getting the following error:
java.lang.ClassCastException: [B cannot be cast to org.apache.hadoop.io.BytesWritable
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableBinaryObjectInspector.getPrimitiveJavaObject(WritableBinaryObjectInspector.java:49)
at Unzip.evaluate(Unzip.java:32)
at UnzipTest.testEvaluate(UnzipTest.java:96)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access0(ParentRunner.java:66)
at org.junit.runners.ParentRunner.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
The error occurs when reading the bytes from DeferredObject[] args:
byte[] input = elementOI.getPrimitiveJavaObject(arg[0].get());
PS: test.zip contains a single text file (with a test string) compressed into test.zip.
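For context, judging from the stack trace (Unzip.evaluate at line 32), the UDF presumably reads its binary argument roughly as in the sketch below. Everything apart from the getPrimitiveJavaObject call quoted above is an assumption; the actual unzip logic is omitted.

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.BinaryObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.BytesWritable;

public class Unzip extends GenericUDF {
    private BinaryObjectInspector elementOI;

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        // Remember the inspector for the single binary argument.
        elementOI = (BinaryObjectInspector) arguments[0];
        return PrimitiveObjectInspectorFactory.writableBinaryObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arg) throws HiveException {
        // This is the call that fails in the test: getPrimitiveJavaObject casts the
        // underlying object to BytesWritable before returning its bytes as byte[].
        byte[] input = elementOI.getPrimitiveJavaObject(arg[0].get());
        // ... unzip `input` (e.g. via ZipInputStream over a ByteArrayInputStream)
        // and return the decompressed content; returning the input is a placeholder.
        return new BytesWritable(input);
    }

    @Override
    public String getDisplayString(String[] children) {
        return "unzip(" + children[0] + ")";
    }
}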
You need to wrap the byte[] in a Writable object that Hive can work with, in your case a BytesWritable. As you can see, WritableBinaryObjectInspector.getPrimitiveJavaObject expects a BytesWritable object as input, not a plain array of bytes.
So instead of
GenericUDF.DeferredObject valueObj0 = new GenericUDF.DeferredJavaObject(bytes);
do the following:
GenericUDF.DeferredObject valueObj0 = new GenericUDF.DeferredJavaObject(new BytesWritable(bytes));
Replicating your case locally, I was able to successfully retrieve the byte[] in the UDF's evaluate method.
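For completeness, a minimal self-contained version of the corrected test might look like this. The @Test annotation and the import list are assumptions filled in around the code from the question, and, as in the original test, no assertion is made on the result.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.BytesWritable;
import org.junit.Test;

public class UnzipTest {

    @Test
    public void testEvaluate() throws HiveException, IOException {
        Unzip unzip = new Unzip();
        File resourcesDirectory = new File("src/test/resources/test.zip");
        byte[] bytes = Files.readAllBytes(resourcesDirectory.toPath());

        ObjectInspector binaryOI = PrimitiveObjectInspectorFactory.writableBinaryObjectInspector;
        unzip.initialize(new ObjectInspector[]{binaryOI});

        // Wrap the raw byte[] in a BytesWritable so that the writable binary
        // object inspector can unwrap it inside evaluate().
        GenericUDF.DeferredObject valueObj0 =
                new GenericUDF.DeferredJavaObject(new BytesWritable(bytes));
        unzip.evaluate(new GenericUDF.DeferredObject[]{valueObj0});
    }
}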