The Hadoop community does not ship pre-built 64-bit binaries (the official release bundles 32-bit native libraries), so a 64-bit version has to be compiled from source yourself. Learning a technology starts with installing it; learning Hadoop starts with building it.

0x00 Preface

This document builds Hadoop from hadoop-2.6.4-src.tar.gz.

Important: the official documentation for building from source is the BUILDING.txt file in the root directory of the source tarball, titled:

Build instructions for Hadoop
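If you want to skim BUILDING.txt before setting anything up, you can print it straight out of the tarball without fully extracting it. A small sketch, assuming the tarball's top-level directory is hadoop-2.6.4-src (the Apache default) and that you run it from the directory holding the tarball, e.g. /usr/local/src:

tar -xOzf hadoop-2.6.4-src.tar.gz hadoop-2.6.4-src/BUILDING.txt | less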


0x01 Build environment

Operating system: Red Hat Enterprise Linux Server release 6.5 (Santiago)
Kernel: 2.6.32-431.el6.x86_64 on an x86_64

0x02 Install system support packages

yum -y install autoconf automake libtool cmake make
yum -y install ncurses-devel openssl-devel
yum -y install lzo-devel zlib-devel
yum -y install gcc gcc-c++
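A quick sanity check that the toolchain actually landed; these commands only print version strings and change nothing:

gcc --version
g++ --version
cmake --version
openssl version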

0x03 Install build components

Upload all of the component packages to /usr/local/src/ on the host.

(1) Install the JDK

Note: use JDK 1.7 only; other versions will break the build (JDK 8's stricter javadoc, in particular, fails the docs profile).

tar zxvf jdk-7u80-linux-x64.tar.gz -C /opt/modules/
vi /etc/profile
export JAVA_HOME=/opt/modules/jdk1.7.0_80
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
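After editing /etc/profile, reload it in the current shell and confirm the JDK that Maven will see:

source /etc/profile
java -version    # should report java version "1.7.0_80"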

(2) Install Maven

tar zxvf apache-maven-3.3.1-bin.tar.gz -C /opt/modules/
vi /etc/profile
export MAVEN_HOME=/opt/modules/apache-maven-3.3.1
export PATH=$PATH:$MAVEN_HOME/bin
vi /opt/modules/apache-maven-3.3.1/conf/settings.xml

Point Maven at a mirror repository by adding the following inside the <mirrors> element of settings.xml:

<mirror>
    <id>nexus-osc</id>
    <mirrorOf>*</mirrorOf>
    <name>Nexus osc</name>
    <url>http://maven.oschina.net/content/groups/public/</url>
</mirror>
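The oschina mirror has since been retired and may no longer resolve. If it is unreachable, any other public mirror of Maven Central can be substituted in the same way; as one assumed alternative (not part of the original write-up), the Aliyun public mirror:

<mirror>
    <id>aliyun</id>
    <mirrorOf>*</mirrorOf>
    <name>Aliyun public mirror</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
</mirror>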

Then add a new profile inside the <profiles></profiles> element:

<profile>
    <id>jdk-1.7</id>
    <activation>
        <jdk>1.7</jdk>
    </activation>
    <repositories>
        <repository>
            <id>nexus</id>
            <name>local private nexus</name>
            <url>http://maven.oschina.net/content/groups/public/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>
    <pluginRepositories>
        <pluginRepository>
            <id>nexus</id>
            <name>local private nexus</name>
            <url>http://maven.oschina.net/content/groups/public/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </pluginRepository>
    </pluginRepositories>
</profile>

If this is not your first build, you can also point Maven at an existing local repository to reuse downloaded artifacts:
<localRepository>/path/to/local/repo</localRepository>
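To double-check that the mirror, the jdk-1.7 profile, and any local repository path are really being read, Maven can print the merged settings it will use:

mvn help:effective-settings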

(3) Install FindBugs

tar zxvf findbugs-3.0.1.tar.gz -C /opt/modules/
vi /etc/profile
export FINDBUGS_HOME=/opt/modules/findbugs-3.0.1
export PATH=$PATH:$FINDBUGS_HOME/bin

(4) Install Protocol Buffers (protobuf)

tar xvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure --prefix=/opt/modules/protobuf
make
make install
ldconfig
protoc --version
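Because protobuf was configured with a custom --prefix, protoc does not land on the default PATH, so the bare `protoc --version` above will fail until the environment is extended. A minimal sketch following the same /etc/profile pattern as the other components (PROTOC_HOME is just an illustrative variable name):

vi /etc/profile
export PROTOC_HOME=/opt/modules/protobuf
export PATH=$PATH:$PROTOC_HOME/bin

source /etc/profile
protoc --version    # should print: libprotoc 2.5.0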

(5) Internet access

During the Hadoop build, Maven downloads a large number of dependencies, so the host must have internet access.
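If the host can only reach the internet through an HTTP proxy, Maven has to be told about it separately. A sketch of a <proxy> entry for the same settings.xml (proxy.example.com and 8080 are placeholders to replace with your own values):

<proxies>
    <proxy>
        <id>build-proxy</id>
        <active>true</active>
        <protocol>http</protocol>
        <host>proxy.example.com</host>
        <port>8080</port>
    </proxy>
</proxies>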

(6) Install Snappy (optional)
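Snappy compression support is not required for the build to succeed. If you do want it, a sketch of the extra steps (the package names assume the RHEL/CentOS repositories; the Maven options are the ones described in BUILDING.txt):

yum -y install snappy snappy-devel
# then append -Drequire.snappy to the mvn command in 0x04 so the build fails
# fast if libsnappy.so is missing; BUILDING.txt also documents -Dbundle.snappy
# and -Dsnappy.lib for bundling the library into the final tarball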

0x04 Build from source

java -version
mvn -version
findbugs -version
protoc --version
ping www.baidu.com

tar zxvf hadoop-2.6.4-src.tar.gz -C /opt/
cd hadoop-2.6.4-src/
export MAVEN_OPTS="-Xms256m -Xmx512m"
mvn clean package -Pdist,native,docs -DskipTests -Dtar
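The docs profile accounts for a large share of the build time. If the generated documentation is not needed, it can simply be dropped from the profile list, which is noticeably faster:

mvn clean package -Pdist,native -DskipTests -Dtar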

Now hand the rest over to the machine and go get some exercise. When the build finishes, the reactor summary should look something like this:

[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [06:32 min]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [03:46 min]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [01:29 min]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.329 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [07:17 min]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 57.156 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [06:10 min]
[INFO] Apache Hadoop Auth ................................. SUCCESS [05:18 min]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 27.119 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [09:30 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 6.677 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [04:45 min]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.040 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [13:03 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [04:10 min]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [01:44 min]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 5.085 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.033 s]
[INFO] hadoop-yarn ........................................ SUCCESS [ 0.059 s]
[INFO] hadoop-yarn-api .................................... SUCCESS [01:25 min]
[INFO] hadoop-yarn-common ................................. SUCCESS [01:29 min]
[INFO] hadoop-yarn-server ................................. SUCCESS [ 0.095 s]
[INFO] hadoop-yarn-server-common .......................... SUCCESS [ 42.211 s]
[INFO] hadoop-yarn-server-nodemanager ..................... SUCCESS [01:51 min]
[INFO] hadoop-yarn-server-web-proxy ....................... SUCCESS [ 2.956 s]
[INFO] hadoop-yarn-server-applicationhistoryservice ....... SUCCESS [ 7.023 s]
[INFO] hadoop-yarn-server-resourcemanager ................. SUCCESS [ 23.905 s]
[INFO] hadoop-yarn-server-tests ........................... SUCCESS [ 45.162 s]
[INFO] hadoop-yarn-client ................................. SUCCESS [ 8.784 s]
[INFO] hadoop-yarn-applications ........................... SUCCESS [ 0.047 s]
[INFO] hadoop-yarn-applications-distributedshell .......... SUCCESS [ 2.790 s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher ..... SUCCESS [ 2.169 s]
[INFO] hadoop-yarn-site ................................... SUCCESS [ 0.052 s]
[INFO] hadoop-yarn-registry ............................... SUCCESS [ 5.526 s]
[INFO] hadoop-yarn-project ................................ SUCCESS [ 5.919 s]
[INFO] hadoop-mapreduce-client ............................ SUCCESS [ 0.083 s]
[INFO] hadoop-mapreduce-client-core ....................... SUCCESS [ 25.201 s]
[INFO] hadoop-mapreduce-client-common ..................... SUCCESS [ 19.914 s]
[INFO] hadoop-mapreduce-client-shuffle .................... SUCCESS [ 3.998 s]
[INFO] hadoop-mapreduce-client-app ........................ SUCCESS [ 11.686 s]
[INFO] hadoop-mapreduce-client-hs ......................... SUCCESS [ 8.481 s]
[INFO] hadoop-mapreduce-client-jobclient .................. SUCCESS [ 28.587 s]
[INFO] hadoop-mapreduce-client-hs-plugins ................. SUCCESS [ 1.978 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 6.412 s]
[INFO] hadoop-mapreduce ................................... SUCCESS [ 4.931 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 44.811 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 8.613 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 2.769 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 6.654 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 5.080 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 3.253 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [ 2.646 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 4.990 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 8.460 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 5.232 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [06:09 min]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 8.045 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 0.145 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 7.135 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 12.856 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.027 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [02:45 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:27 h
[INFO] Finished at: 2016-11-05T15:02:45+08:00
[INFO] Final Memory: 112M/369M
[INFO] ------------------------------------------------------------------------

After a successful build, the packaged distribution is placed under hadoop-dist/target.


hadoop-2.6.4.tar.gz is the freshly built binary distribution. Done!
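Since the whole point was a 64-bit build, it is worth confirming the native libraries before celebrating. The path below assumes the default layout of the 2.6.4 dist output under the source tree extracted to /opt:

cd /opt/hadoop-2.6.4-src/hadoop-dist/target/hadoop-2.6.4
file lib/native/libhadoop.so.1.0.0
# expected output mentions: ELF 64-bit LSB shared object, x86-64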
