Pig Script issues
I am using Pig with HCatalog to load data from an external Hive table.
I start the grunt shell with pig -useHCatalog and execute the following commands:
register 'datafu'
define Enumerate datafu.pig.bags.Enumerate('1');
imported_data = load 'hive external table' using org.apache.hive.hcatalog.pig.HCatLoader() ;
converted_data = foreach imported_data generate name,ip,domain,ToUnixTime(ToDate(dateandtime,'MM/dd/yyyy hh:mm:ss.SSS aa'))as unix_DateTime,date;
grouped = group converted_data by (name,ip,domain);
result = FOREACH grouped {
sorted = ORDER converted_data BY unix_DateTime;
sorted2 = Enumerate(sorted);
GENERATE FLATTEN(sorted2);
};
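All of the commands run and produce the desired result.
Problem:
I put the commands above into a Pig script named pigFinal.pig and ran it with the following command, in local mode because the script is on the local file system.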
pig -useHCatalog -x local '/path/to/pigFinal.pig';
Exception:
Failed to generate logical plan. Nested exception: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve datafu.pig.bags.Enumerate using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    at org.apache.pig.parser.LogicalPlanBuilder.buildUDF(LogicalPlanBuilder.java:1507)
    at org.apache.pig.parser.LogicalPlanGenerator.func_eval(LogicalPlanGenerator.java:9372)
    at org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:11051)
    at org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:10810)
    at org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:10159)
    at org.apache.pig.parser.LogicalPlanGenerator.nested_command(LogicalPlanGenerator.java:16315)
    at org.apache.pig.parser.LogicalPlanGenerator.nested_blk(LogicalPlanGenerator.java:16116)
    at org.apache.pig.parser.LogicalPlanGenerator.foreach_plan(LogicalPlanGenerator.java:16024)
    at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:15849)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1933)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
    ... 17 more
Where do I need to register the datafu jar for the Pig script? I think that is the problem.
Please help.
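You have to make sure that the jar file is in the same folder as your Pig script, or that you give the correct path to it when you register it in the script. So in your case,
change this
register 'datafu'
to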
-- If, say, datafu-1.2.0.jar is your jar file and it is in the same folder as your Pig script, put this at the top of the script:
REGISTER datafu-1.2.0.jar;
-- Otherwise, if datafu-1.2.0.jar is in the folder /usr/hadoop/lib, put this at the top of the script instead:
REGISTER /usr/hadoop/lib/datafu-1.2.0.jar;
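ERROR 1070 means Pig could not find the class on its classpath, so before editing the script it can help to confirm that the jar you register really contains datafu.pig.bags.Enumerate. Here is a minimal shell sketch, assuming `unzip` is installed. The function names `class_to_entry` and `jar_has_class` are made up for illustration and are not part of Pig or DataFu, and the jar path in the usage comment is the example path from above.

```shell
#!/bin/sh
# Convert a fully qualified class name into the entry name used inside a jar,
# e.g. datafu.pig.bags.Enumerate -> datafu/pig/bags/Enumerate.class
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Return 0 if the jar ($1) lists the class ($2), non-zero otherwise.
# Assumes `unzip` is available; `jar tf` could be used in its place.
jar_has_class() {
    unzip -l "$1" 2>/dev/null | grep -qF "$(class_to_entry "$2")"
}

# Usage (hypothetical path):
# jar_has_class /usr/hadoop/lib/datafu-1.2.0.jar datafu.pig.bags.Enumerate && echo "class found"
```

If the check fails, the REGISTER path points at the wrong file or at a jar that does not ship that class.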
Alternatively, pass the jar on the command line through pig.additional.jars:
pig -useHCatalog \
-x local \
-Dpig.additional.jars="/local/path/to/datafu.jar:/local/path/other.jar" \
/path/to/pigFinal.pig
or
use a fully qualified path in your Pig script:
register /local/path/to/datafu.jar;
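When several jars are needed, the colon-separated value for -Dpig.additional.jars can be assembled in the shell instead of typed by hand. This is only a sketch, assuming no jar path itself contains a colon. `join_jars` is a hypothetical helper, and the paths are the placeholders used in this thread.

```shell
#!/bin/sh
# Join all arguments with ':' to build a pig.additional.jars value.
# The function body runs in a subshell, so the IFS change does not leak out.
join_jars() (
    IFS=:
    printf '%s\n' "$*"
)

# Usage (commented out so the snippet has no side effects):
# pig -useHCatalog -x local \
#     -Dpig.additional.jars="$(join_jars /local/path/to/datafu.jar /local/path/other.jar)" \
#     /path/to/pigFinal.pig
```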