
Hadoop and Hive Installation and Configuration Tutorial

2013-06-08

Hive is a data warehouse tool built on top of Hadoop. It maps structured data files onto database tables and provides full SQL query support, translating SQL statements into MapReduce jobs for execution. Its main advantage is a low learning curve: simple MapReduce statistics can be produced quickly with SQL-like statements, without writing dedicated MapReduce programs, which makes it well suited to statistical analysis in a data warehouse.
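As a toy illustration of the map/reduce model that Hive compiles SQL into (this is a hypothetical sketch, not Hive's actual implementation; the data and function names are made up):

```python
from collections import defaultdict

# Sample "city,amount" lines standing in for rows of a text-backed table.
rows = ["beijing,3", "shanghai,5", "beijing,2"]

def map_phase(line):
    """Emit (key, value) pairs from one input line, like a Hadoop Mapper."""
    city, amount = line.split(",")
    yield city, int(amount)

def reduce_phase(pairs):
    """Sum values per key, like a Hadoop Reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Roughly what Hive would do for: SELECT city, SUM(amount) FROM t GROUP BY city;
pairs = [kv for line in rows for kv in map_phase(line)]
print(reduce_phase(pairs))  # {'beijing': 5, 'shanghai': 5}
```

A real Hive query plan shuffles and sorts the mapper output by key between the two phases; that step is elided here.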

[Network Setup]
vim /etc/hosts

192.168.100.52 hadoop1
192.168.99.34 hadoop2
192.168.103.135 hadoop3

Then run the matching command on each machine:

hostname hadoop1
hostname hadoop2
hostname hadoop3

[Passwordless SSH Between Machines]

hadoop1# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
hadoop1# scp ~/.ssh/id_dsa.pub hadoop2:/root/
hadoop1# scp ~/.ssh/id_dsa.pub hadoop3:/root/
hadoop2# cat id_dsa.pub >> ~/.ssh/authorized_keys
hadoop3# cat id_dsa.pub >> ~/.ssh/authorized_keys

Verify: logging in from hadoop1 to hadoop2 and hadoop3 no longer requires a password.

[Installing Hadoop]
Make sure ssh, rsync, and a JDK are available on all machines.
Make sure the following is set:
export JAVA_HOME=/opt/soft/jdk

Hive has been tested extensively against Hadoop 0.20.x, so we use 0.20.

cd /opt/soft/
wget http://mirror.bjtu.edu.cn/apache/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
tar -zxvf hadoop-0.20.2.tar.gz
cd hadoop-0.20.2/
vim ~/.bashrc
export HADOOP_HOME=/opt/soft/hadoop-0.20.2

(Repeat the above steps on the other two machines.)

[Configuring Hadoop]
vim conf/core-site.xml
Edit:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop1:9000</value>
  </property>
</configuration>

vim conf/hdfs-site.xml
Edit:

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/opt/hadoop/data/dfs.name.dir</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/hadoop/data/dfs.data.dir</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
vim conf/mapred-site.xml
Edit:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop1:9001</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/opt/hadoop/system/mapred.system.dir</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/opt/hadoop/data/mapred.local.dir</value>
  </property>
</configuration>
vim conf/masters

hadoop1

vim conf/slaves

hadoop2
hadoop3

scp conf/* hadoop2:/opt/soft/hadoop-0.20.2/conf/
scp conf/* hadoop3:/opt/soft/hadoop-0.20.2/conf/


[Initialization]

cd $HADOOP_HOME/bin
./hadoop namenode -format

Start:
./start-all.sh

[Verification]
$HADOOP_HOME/bin/hadoop dfs -ls /
Open http://192.168.100.52:50030

http://192.168.100.52:50070


[Setting Up the Hive Cluster]

Download
Hive only needs to be installed on the hadoop1 machine.

cd /opt/soft/hadoop-0.20.2
wget http://mirror.bjtu.edu.cn/apache/hive/hive-0.7.0/hive-0.7.0.tar.gz
tar zxvf hive-0.7.0.tar.gz
cd hive-0.7.0
vim ~/.bashrc
export HIVE_HOME=/opt/soft/hadoop-0.20.2/hive-0.7.0

$HIVE_HOME/bin/hive
>create table tt(id int,name string) row format delimited fields terminated by ',' collection items terminated by '\n' stored as textfile;
>select * from tt;
>drop table tt;
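The `row format delimited` clause above tells Hive how to split each line of a plain text file into columns. A minimal Python sketch of that parsing for the `(id int, name string)` schema of `tt`, with made-up sample lines (not Hive code):

```python
# Hypothetical sketch of how "fields terminated by ','" maps one
# textfile line onto the (id int, name string) columns of table tt.
def parse_row(line):
    id_str, name = line.rstrip("\n").split(",", 1)
    return int(id_str), name

print([parse_row(l) for l in ["1,alice\n", "2,bob\n"]])
# [(1, 'alice'), (2, 'bob')]
```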

End of the quick test.

[Configuring Hive]
Prepare MySQL on hadoop1 (user: hadoop, password: hadoop):

>create database hive;
>GRANT ALL ON hive.* TO 'hadoop'@'%' IDENTIFIED BY 'hadoop';
>FLUSH PRIVILEGES;

vim $HIVE_HOME/conf/hive-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop1:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hadoop</value>
  </property>
</configuration>
[Startup]
After copying mysql-connector-java-5.1.10.jar into hive/lib:

$HIVE_HOME/bin/hive
>create table tt(id int,name string) row format delimited fields terminated by ',' collection items terminated by '\n' stored as textfile;

If you get an error like the following:

FAILED: Error in metadata: javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : Binary logging not possible. Message: Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEMENT'

Exit hive, then log into MySQL as root and run:

>set global binlog_format='MIXED';

This is a MySQL issue: as the error message says, statement-based binary logging is not safe under InnoDB's READ-COMMITTED isolation level, and the MIXED binlog format avoids the conflict.

Installation complete.

Another article, 《hive的安装和配置》 (Installing and Configuring Hive):

http://blog.chinaunix.net/uid-451-id-3143781.html

Category: Technical Articles | Tags: hadoop hive
