Not able to create a table in Hive

I am trying to create a table in Hive 3.0 using the following schema, which I found online:

    CREATE TABLE tweets (
    id BIGINT,
    created_at STRING,
    source STRING,
    favorited BOOLEAN,
    retweeted_status STRUCT< text : STRING, user : STRUCT<screen_name : STRING,name : STRING>, retweet_count : INT>,
    entities STRUCT< urls : ARRAY<STRUT<expanded_url : STRING>>,
    user_mentions : ARRAY<STRUCT<screen_name : STRING,name : STRING>>,
    hashtags : ARRAY<STRUCT<text : STRING>>>,
    text STRING,
    user STRUCT< screen_name : STRING, name : STRING, friends_count : INT, followers_count : INT, statuses_count : INT, verified : BOOLEAN, utc_offset : INT, time_zone : STRING>,
    in_reply_to_screen_name STRING
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JSONSerDe';

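When I press Enter I get a NoViableAltException. This is my first time using Hive and I have no experience with it. Can someone tell me what is wrong with the schema?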

`user` is a reserved keyword in Hive. If we use a reserved keyword as a column or field name, we need to enclose it in backticks.

Example:

    `user`

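Try the CREATE TABLE statement below, which quotes `user` and also corrects the `STRUT` typo (it should be `STRUCT`) in the `urls` field: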

    CREATE TABLE tweets (
    id BIGINT,
    created_at STRING,
    source STRING,
    favorited BOOLEAN,
    retweeted_status STRUCT< text : STRING, `user` : STRUCT<screen_name : STRING,name : STRING>, retweet_count : INT>,
    entities STRUCT< urls : ARRAY<STRUCT<expanded_url : STRING>>,
    user_mentions : ARRAY<STRUCT<screen_name : STRING,name : STRING>>,
    hashtags : ARRAY<STRUCT<text : STRING>>>,
    text STRING,
    `user` STRUCT< screen_name : STRING, name : STRING, friends_count : INT, followers_count : INT, statuses_count : INT, verified : BOOLEAN, utc_offset : INT, time_zone : STRING>, 
    in_reply_to_screen_name STRING
    ) 
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    Location '/user/flume/tweets/';

I was able to create the table with the DDL above:

    desc tweets;
    +--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
    |         col_name         |                                                                     data_type                                                                     |      comment       |
    +--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
    | id                       | bigint                                                                                                                                            | from deserializer  |
    | created_at               | string                                                                                                                                            | from deserializer  |
    | source                   | string                                                                                                                                            | from deserializer  |
    | favorited                | boolean                                                                                                                                           | from deserializer  |
    | retweeted_status         | struct<text:string,user:struct<screen_name:string,name:string>,retweet_count:int>                                                                 | from deserializer  |
    | entities                 | struct<urls:array<struct<expanded_url:string>>,user_mentions:array<struct<screen_name:string,name:string>>,hashtags:array<struct<text:string>>>   | from deserializer  |
    | text                     | string                                                                                                                                            | from deserializer  |
    | user                     | struct<screen_name:string,name:string,friends_count:int,followers_count:int,statuses_count:int,verified:boolean,utc_offset:int,time_zone:string>  | from deserializer  |
    | in_reply_to_screen_name  | string                                                                                                                                            | from deserializer  |
    +--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--+
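The same quoting rule applies when querying the table. A minimal sketch, assuming the `tweets` table created with the DDL above; the selected fields, the aliases, and the `LIMIT` are only illustrative:

```sql
-- `user` is a reserved keyword, so it is quoted both as a top-level
-- column and as a nested struct field.
SELECT
  id,
  `user`.screen_name           AS author,
  retweeted_status.`user`.name AS retweeted_author
FROM tweets
LIMIT 10;
```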

UPDATE:

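When we run a SELECT statement, Hive reads the data from the directory the table points to (/user/hive/warehouse/tweets/ according to your DDL statement). In your case no data exists in that directory, so the SELECT statement returns no records.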
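One way to confirm which directory a table reads from is to inspect its metadata. A small sketch, assuming the table is named `tweets` as above:

```sql
-- The "Location" row in the output shows the HDFS directory backing the table.
DESCRIBE FORMATTED tweets;
```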

To fix this:

Option 1. Move the data from /user/flume/tweets/ to the /user/hive/warehouse/tweets/ directory; you can then select the data from the tweets table.

`hadoop fs -mv /user/flume/tweets/* /user/hive/warehouse/tweets/`

(or)

Option 2. Create the Hive table on top of the /user/flume/tweets/ directory; you will then see the data in the tweets table (use the CREATE TABLE statement above for this).
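For Option 2, a variation worth considering is an external table, so that dropping the table later leaves the Flume files in place. This is a sketch rather than the statement from the answer: the table name `tweets_ext` is hypothetical and the column list is shortened for illustration (as far as I know the JSON SerDe skips keys that are not declared as columns, but verify that on your data).

```sql
-- Hypothetical external table over the Flume output directory.
-- EXTERNAL means DROP TABLE removes only the metadata, not the files.
CREATE EXTERNAL TABLE tweets_ext (
  id BIGINT,
  created_at STRING,
  text STRING,
  `user` STRUCT<screen_name : STRING, name : STRING, followers_count : INT>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/flume/tweets/';
```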