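Hive is the foundational data-warehouse component of the big-data stack, and it serves as the baseline against which similar data-warehouse tools are compared. Basic data operations can be run as scripts from the CLI; an application, however, has to connect through the hive-jdbc driver.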

0x00 Add the hive-site.xml configuration properties

<property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
</property>
<property>
    <name>hive.server2.thrift.bind.host</name>
    <value>0.0.0.0</value>
</property>

0x01 Start the hiveserver2 service

${HIVE_HOME}/bin/hive --service metastore &
${HIVE_HOME}/bin/hive --service hiveserver2 &

netstat -lnt | grep 10000
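
The same check that netstat performs above can also be done from Java before an application tries to open a JDBC connection. The following is only a minimal sketch: the class name PortCheck and the method isListening are hypothetical helpers, not part of Hive, and a successful TCP connect only shows that something is listening on the port, not that it is HiveServer2.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hive): probes a TCP port with a timeout.
final class PortCheck {

    private PortCheck() {
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        // try-with-resources closes the socket whether or not the connect succeeds
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unresolved host: treat all as "not listening"
            return false;
        }
    }
}
```

For the setup in this post, a call such as PortCheck.isListening("localhost", 10000, 2000) would be expected to return true once hiveserver2 has finished starting, which can take some seconds after the command returns.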

0x02 Connect with beeline

shell>> ${HIVE_HOME}/bin/beeline
Beeline version 1.2.2 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/default
Connecting to jdbc:hive2://localhost:10000/default
Enter username for jdbc:hive2://localhost:10000/default: hadoop
Enter password for jdbc:hive2://localhost:10000/default: ******
Connected to: Apache Hive (version 1.2.2)
Driver: Hive JDBC (version 1.2.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000/default> show tables;
OK
+-------------------+--+
|     tab_name      |
+-------------------+--+
| managed_user      |
| ods_user          |
| partitioned_user  |
+-------------------+--+
3 rows selected (0.358 seconds)
0: jdbc:hive2://localhost:10000/default> !quit
Closing: 0: jdbc:hive2://localhost:10000/default
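
The URL passed to !connect has the shape jdbc:hive2://host:port/database, where the port is the value of hive.server2.thrift.port. As an illustration only, the sketch below assembles such a URL from its parts; HiveUrls and jdbcUrl are hypothetical names, and the sketch ignores the optional session and configuration parameters that a HiveServer2 URL may carry.

```java
// Hypothetical helper (not part of hive-jdbc): builds a basic HiveServer2 JDBC URL.
final class HiveUrls {

    private HiveUrls() {
    }

    /**
     * Builds "jdbc:hive2://host:port/database".
     * A null or empty database falls back to "default", the database used in this post.
     */
    static String jdbcUrl(String host, int port, String database) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        String db = (database == null || database.isEmpty()) ? "default" : database;
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }
}
```

Note that the bind host 0.0.0.0 from hive-site.xml only tells the server to listen on all interfaces; a client passes the server's reachable hostname or IP address instead, for example HiveUrls.jdbcUrl("localhost", 10000, "default").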

0x03 Connect via JDBC

IntelliJ IDEA is the recommended development tool. Create a Maven project and add the following Java code:

package com.data.hive;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJDBC {

    // HiveServer2 JDBC driver class, provided by the hive-jdbc artifact
    private static final String DRIVER_NAME = "org.apache.hive.jdbc.HiveDriver";

    // Connection settings; adjust host, port, database and credentials to your environment
    private static final String SERVER_URL = "jdbc:hive2://localhost:10000/default";
    private static final String SERVER_USER = "hadoop";
    private static final String SERVER_PWD = "hadoop";

    public static void main(String[] args) throws SQLException {

        // Load the driver explicitly and exit if it is missing from the classpath
        try {
            Class.forName(DRIVER_NAME);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }

        // try-with-resources closes the result set, statement and connection,
        // even when the query throws
        try (Connection hiveConn = DriverManager.getConnection(SERVER_URL, SERVER_USER, SERVER_PWD);
             Statement hiveStmt = hiveConn.createStatement();
             ResultSet hqlOutput = hiveStmt.executeQuery("show tables")) {
            // Each row carries one table name in its first column
            while (hqlOutput.next()) {
                System.out.println(hqlOutput.getString(1));
            }
        }
    }
}

Add the Maven dependencies. The relevant part of pom.xml is as follows:

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <hadoop.version>2.7.7</hadoop.version>
    <hive.version>1.2.2</hive.version>
</properties>

<repositories>
    <repository>
        <id>Apache Hadoop</id>
        <name>Apache Hadoop</name>
        <url>https://repo1.maven.org/maven2/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>${hive.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-metastore</artifactId>
        <version>${hive.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>${hive.version}</version>
    </dependency>
</dependencies>
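
Once the dependencies have been resolved, one quick way to confirm that the driver actually ended up on the classpath is to try to locate its class. The sketch below is an assumption-laden illustration: DriverCheck and isOnClasspath are hypothetical names, and finding the class only shows that the jar is present, not that a connection will succeed.

```java
// Hypothetical helper: reports whether a class can be found on the classpath.
final class DriverCheck {

    private DriverCheck() {
    }

    /** Returns true if className can be located without running its static initializers. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only look the class up, do not register or initialize it
            Class.forName(className, false, DriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies cannot be linked
            return false;
        }
    }
}
```

With the pom.xml fragment above in place, DriverCheck.isOnClasspath("org.apache.hive.jdbc.HiveDriver") would be expected to return true; a false result usually points at a missing or unresolved hive-jdbc dependency.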

That completes the setup!
