Connecting to Elasticsearch and reading data into Hive
I want to connect Hive to Elasticsearch. I followed the instructions from here.
I performed the following steps:
1. start-dfs.sh
2. start-yarn.sh
3. launch elasticsearch
4. launch kibana
5. launch hive
inside hive
a- create a database
b- create a table
c- load data into the table (LOAD DATA LOCAL INPATH '/home/myuser/Documents/datacsv/myfile.csv' OVERWRITE INTO TABLE students; )
d- add jar /home/myuser/elasticsearch-hadoop-7.10.1/dist/elasticsearch-hadoop-hive-7.10.1.jar
e- create a table for Elastic.
create table students_es (stt int not null, mahocvien varchar(10), tenho string, ten string, namsinh date, gioitinh string, noisinh string, namvaodang date, trinhdochuyenmon string, hesoluong float, phucaptrachnhiem float, chucvudct string, chucdqh string, dienuutien int, ghichu int) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.nodes' = '127.0.0.1', 'es.port' = '9201', 'es.resource' = 'students/student');
f- insert overwrite table students_es select * from students;
I then get the following error:
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org/apache/commons/httpclient/protocol/ProtocolSocketFactory
The components I am using:
Kibana: 7.10.1
Hive: 3.1.2
Hadoop: 3.1.2
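I finally found the solution.
The error names a class, org/apache/commons/httpclient/protocol/ProtocolSocketFactory, that is missing from Hive's classpath. You need to download the jar file commons-httpclient-3.1.jar and put it in your Hive lib directory.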
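A minimal sketch of that fix as a shell helper. It assumes the jar has already been downloaded to a local path and that Hive's lib directory is `$HIVE_HOME/lib`. The function name `install_hive_jar` and the example paths are hypothetical, not part of Hive or elasticsearch-hadoop.

```shell
#!/bin/sh
# Copy a locally downloaded jar into Hive's lib directory.
# install_hive_jar is a hypothetical helper name used for illustration.
# Usage: install_hive_jar /path/to/commons-httpclient-3.1.jar /path/to/hive/lib
install_hive_jar() {
    jar_path="$1"
    lib_dir="$2"

    # Refuse to continue if the jar has not been downloaded yet.
    if [ ! -f "$jar_path" ]; then
        echo "jar not found: $jar_path" >&2
        return 1
    fi

    # Refuse to continue if the target lib directory does not exist.
    if [ ! -d "$lib_dir" ]; then
        echo "lib directory not found: $lib_dir" >&2
        return 1
    fi

    cp "$jar_path" "$lib_dir/" && echo "installed $(basename "$jar_path") into $lib_dir"
}

# Example call (both paths are assumptions; adjust to your installation):
# install_hive_jar "$HOME/Downloads/commons-httpclient-3.1.jar" "$HIVE_HOME/lib"
```

After copying the jar, restarting the Hive session is probably necessary so the new jar is picked up, and then step f (the `insert overwrite`) can be retried.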