Pig Script issues

I am using Pig with HCatalog to load data from an external Hive table. I enter the grunt shell with pig -useHCatalog and execute the following commands:

register 'datafu'

define Enumerate datafu.pig.bags.Enumerate('1');

imported_data  = load 'hive external table' using org.apache.hive.hcatalog.pig.HCatLoader() ;


converted_data = foreach imported_data generate name,ip,domain,ToUnixTime(ToDate(dateandtime,'MM/dd/yyyy hh:mm:ss.SSS aa'))as unix_DateTime,date;


grouped = group converted_data by (name,ip,domain);

result = FOREACH grouped {
             sorted = ORDER converted_data BY unix_DateTime;
             sorted2 = Enumerate(sorted);
             GENERATE FLATTEN(sorted2);
};

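All the commands run and give the desired result.

Problem: I put the commands above into a Pig script named pigFinal.pig and ran it with the following command, in local mode because the script is on the local file system.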

pig -useHCatalog -x local '/path/to/pigFinal.pig';

Exception:

Failed to generate logical plan. Nested exception: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve datafu.pig.bags.Enumerate using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    at org.apache.pig.parser.LogicalPlanBuilder.buildUDF(LogicalPlanBuilder.java:1507)
    at org.apache.pig.parser.LogicalPlanGenerator.func_eval(LogicalPlanGenerator.java:9372)
    at org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:11051)
    at org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:10810)
    at org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:10159)
    at org.apache.pig.parser.LogicalPlanGenerator.nested_command(LogicalPlanGenerator.java:16315)
    at org.apache.pig.parser.LogicalPlanGenerator.nested_blk(LogicalPlanGenerator.java:16116)
    at org.apache.pig.parser.LogicalPlanGenerator.foreach_plan(LogicalPlanGenerator.java:16024)
    at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:15849)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1933)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
    ... 17 more

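Where do I need to register the datafu jar for the Pig script? I guess this is the issue. Please help.

You have to make sure the jar file is in the same folder as your Pig script, or give the correct path to the jar when you register it in the script. So in your case,

modify this: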

register 'datafu'

-- If, say, datafu-1.2.0.jar is your jar file and it is in the same folder as your Pig script, put this at the top of the script:
REGISTER datafu-1.2.0.jar;

-- Otherwise, if datafu-1.2.0.jar is in, say, the folder /usr/hadoop/lib, put this at the top of the script:
REGISTER /usr/hadoop/lib/datafu-1.2.0.jar;

You can also pass the jars on the command line with pig.additional.jars:

pig -useHCatalog \
    -x local \
    -Dpig.additional.jars="/local/path/to/datafu.jar:/local/path/other.jar" \
    /path/to/pigFinal.pig;

Use a fully qualified path in your Pig script:

register /local/path/to/datafu.jar;
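
Both fixes depend on the jar really being at the path you register, so a quick check before launching Pig can save a round trip through the stack trace. This is only a minimal sketch, assuming a POSIX shell: jar_is_readable is a hypothetical helper name, and the default path is just the example location used above, not necessarily where datafu lives on your machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): succeeds only if the argument
# is an existing, readable regular file.
jar_is_readable() {
    [ -f "$1" ] && [ -r "$1" ]
}

# Assumed location, taken from the example path above; override it by
# exporting DATAFU_JAR before running this snippet.
DATAFU_JAR="${DATAFU_JAR:-/usr/hadoop/lib/datafu-1.2.0.jar}"

if jar_is_readable "$DATAFU_JAR"; then
    echo "found: $DATAFU_JAR"
else
    echo "missing: $DATAFU_JAR (fix the REGISTER path before running pig)" >&2
fi
```

If the check reports the jar as missing, correct the path in the REGISTER line (or in pig.additional.jars) before running the script again.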