@hadoopMan · 2017-05-15

Hive Installation, Deployment, and Testing

1) Configure Hive to work with Hadoop; create a table and load data as a test
2) Install the MySQL database on Linux
3) Configure Hive to store its metadata in MySQL, and inspect the metadata tables
4) Get familiar with basic DML and DDL statements (creating databases and tables, loading data, and basic SELECT queries)



1. Related documents and environment requirements

1.1 Related documents

1.2 Environment requirements

2. Installing Hive

2.1 Extract apache-hive-0.13.1-bin.tar.gz

  $ tar -zxvf apache-hive-0.13.1-bin.tar.gz

2.2 Set HIVE_HOME to the Hive installation directory

  $ cd hive-0.13.1
  $ export HIVE_HOME=$(pwd)

2.3 Add HIVE_HOME to the system PATH

  $ export PATH=$HIVE_HOME/bin:$PATH
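Exports entered at the prompt disappear when the shell exits; appending the same two lines to ~/.bashrc makes them survive new logins. A minimal sketch (the install path is the example location used elsewhere in this guide):

```shell
# Sketch: set HIVE_HOME and PATH in the current shell; adding these two
# lines to ~/.bashrc makes them permanent. The path below is the example
# install location used elsewhere in this guide.
export HIVE_HOME=/opt/modules/hive-0.13.1
export PATH="$HIVE_HOME/bin:$PATH"
echo "$PATH"
```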

3. Running Hive

3.1 Create the required HDFS directories

  $ bin/hdfs dfs -mkdir -p /tmp
  $ bin/hdfs dfs -mkdir -p /user/hive/warehouse
  $ bin/hdfs dfs -chmod g+w /tmp
  $ bin/hdfs dfs -chmod g+w /user/hive/warehouse

3.2 Edit the configuration file

Add the following to hive-env.sh (created from hive-env.sh.template):

  # Set HADOOP_HOME to point to a specific hadoop install directory
  HADOOP_HOME=/opt/modules/hadoop-2.5.0/
  # Hive Configuration Directory can be controlled by:
  export HIVE_CONF_DIR=/opt/modules/hive-0.13.1/conf

3.3 Running the Hive CLI

(1). Enter the Hive shell

  # from the Hive installation directory
  $ bin/hive

(2). Create a table named student

  > use default;
  > create table student(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
  > show tables;

After execution:
[Screenshot: creating a table]
(3). Load data

  $ vi stu.txt
  # contents (fields separated by a tab character, matching the table's delimiter):
  1001	sean
  1002	jike
  1003	tony
  # load the data into the table
  > load data local inpath '/opt/datas/stu.txt' into table student;
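Because the table was declared with FIELDS TERMINATED BY '\t', the fields in stu.txt must be separated by real tab characters, otherwise every column loads as NULL. A sketch that generates the file non-interactively and checks the delimiter:

```shell
# Create the sample file with explicit tab delimiters (printf '\t' emits a real tab).
printf '1001\tsean\n1002\tjike\n1003\ttony\n' > stu.txt
# Sanity check: every line must contain exactly two tab-separated fields.
awk -F'\t' 'NF != 2 { exit 1 }' stu.txt && echo "stu.txt OK"
```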

(4). Query the table

  > select * from student;

After execution:
[Screenshot: querying the table]
(5). Query a single column

  > select id from student;

After execution:
[Screenshot: querying the id column]

3.4 Running HiveServer2 and Beeline

  # from the Hive installation directory
  $ bin/hiveserver2
  $ bin/beeline -u jdbc:hive2://$HS2_HOST:$HS2_PORT
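HS2_HOST and HS2_PORT above are placeholders to fill in; HiveServer2 listens on port 10000 by default. A sketch of building the JDBC URL (localhost is an assumption for a single-node setup):

```shell
# Placeholders: adjust HS2_HOST to your HiveServer2 host; 10000 is the default port.
HS2_HOST=localhost
HS2_PORT=10000
JDBC_URL="jdbc:hive2://${HS2_HOST}:${HS2_PORT}"
echo "$JDBC_URL"
# With HiveServer2 running:  bin/beeline -u "$JDBC_URL"
```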

4. Installing MySQL

4.1 Install online with yum

Configure the machine to obtain an IP address automatically (network access is required for yum).

4.2 Replace the system yum mirror

  $ cd /etc/yum.repos.d
  $ sudo mv CentOS-Base.repo CentOS-Base.repo.bak
  $ sudo touch CentOS6-Base-163.repo
  $ sudo vi CentOS6-Base-163.repo
  # after pasting in the 163 mirror configuration, run:
  $ sudo yum clean all

4.3 Install MySQL

  $ sudo yum list | grep mysql
  $ sudo yum install mysql-server -y

4.4 Start MySQL

  $ sudo service mysqld status
  $ sudo service mysqld start

4.5 Set the root password

  # /usr/bin/mysqladmin -u root password '123456'

4.6 Enable mysqld at boot

  $ sudo chkconfig mysqld on
  $ sudo chkconfig --list | grep mysqld

After execution:
[Screenshot: mysqld enabled at boot]

4.7 Test

  $ mysql -uroot -p123456
  > show databases;
  > use test;
  > show tables;

5. Common Hive configuration

5.1 hive-site.xml

First create conf/hive-site.xml (e.g. with vi hive-site.xml) and add the following:

  <configuration>
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://hadoop-miao.host.com/metastore_db?createDatabaseIfNotExist=true</value>
      <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>username to use against metastore database</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>123456</value>
      <description>password to use against metastore database</description>
    </property>
  </configuration>

5.2 Add the JDBC driver

  $ tar -zxf mysql-connector-java-5.1.27.tar.gz
  $ cd mysql-connector-java-5.1.27
  $ cp mysql-connector-java-5.1.27-bin.jar /opt/modules/hive-0.13.1/lib/

5.3 Configure user connection privileges

  $ mysql -uroot -p123456
  > use mysql;
  > select User,Host,Password from user;
  > update user set Host='%' where User='root' and Host='localhost';
  > delete from user where user='root' and host='127.0.0.1';
  > delete from user where user='root' and host='miaodonghua.host';
  > delete from user where host='localhost';
  > delete from user where host='hadoop-miao.host.com';
  > flush privileges;
  # check the result
  > select User,Host,Password from user;

After execution:
[Screenshot: updated connection privileges]
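The row-by-row edits of mysql.user above can usually be replaced by a single GRANT statement (MySQL 5.x syntax). A sketch that only builds and prints the statement; applying it requires a running MySQL server, and the credentials mirror the hive-site.xml example in this guide (change them in production):

```shell
# Sketch (MySQL 5.x): grant the metastore user access from any host ('%').
# Database name and credentials follow the hive-site.xml example above.
SQL="GRANT ALL PRIVILEGES ON metastore_db.* TO 'root'@'%' IDENTIFIED BY '123456'; FLUSH PRIVILEGES;"
echo "$SQL"
# To apply it:  mysql -uroot -p123456 -e "$SQL"
```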

5.4 Hive warehouse location

The default is /user/hive/warehouse; it can be changed as follows:

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hadoop/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>

Mind the permissions on the new directory:

  $ bin/hdfs dfs -mkdir -p /user/hadoop/warehouse
  $ bin/hdfs dfs -chmod g+w /user/hadoop/warehouse

5.5 Hive log location

Create hive-log4j.properties from the template hive-log4j.properties.template, then edit it:

  $ vi $HIVE_HOME/conf/hive-log4j.properties
  # defaults (the system /tmp directory):
  hive.log.dir=${java.io.tmpdir}/${user.name}
  hive.log.file=hive.log
  # can be changed to:
  hive.log.dir=/opt/modules/hive-0.13.1/logs
  hive.log.file=hive.log

5.6 Set the log level Hive prints at run time

In $HIVE_HOME/conf/hive-log4j.properties:

  # default:
  hive.root.logger=INFO,DRFA
  # for debugging:
  hive.root.logger=DEBUG,DRFA

5.7 Show the current database and column headers in the CLI

Add the following to $HIVE_HOME/conf/hive-site.xml:

  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>

5.8 Set configuration properties when starting Hive

  $ bin/hive --hiveconf <property=value>
  # e.g.:
  $ bin/hive --hiveconf hive.cli.print.current.db=false

Note: properties set this way apply only to the current session.

5.9 Configuration precedence:

  --hiveconf > hive-site.xml > hive-default.xml

5.10 View the current configuration

  > set;
  > set hive.cli.print.header;          -- read a property
  > set hive.cli.print.header=false;    -- set a property

In total there are four ways to set a property, in order of precedence:

  set > --hiveconf > hive-site.xml > hive-default.xml
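The precedence chain means "the highest-priority source that defines the property wins". A small sketch simulating that lookup order; the property values here are illustrative only, not real Hive defaults:

```shell
# Simulate Hive's property resolution: walk the sources from highest
# precedence (set) to lowest (hive-default.xml) and return the first
# non-empty value. Values are illustrative only.
resolve() {
  for v in "$SET_CMD" "$HIVECONF" "$SITE_XML" "$DEFAULT_XML"; do
    if [ -n "$v" ]; then echo "$v"; return; fi
  done
}
DEFAULT_XML=true SITE_XML="" HIVECONF=false SET_CMD=""
resolve    # --hiveconf overrides hive-default.xml, so this prints "false"
```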

6. Common Hive shell operations

6.1 Common SQL-like operations

(1). Create a table and query it

  > show databases;
  > create database db_hive;
  > use db_hive;
  > create table student(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
  > load data inpath '/user/hive/warehouse/student/stu.txt' into table student;
  > select * from student;
  > select id from student;

After the commands succeed, SecureCRT shows:
[Screenshot: table info in Hive]
The jobs can also be viewed in a browser at miaodonghua1.host:8088:
[Screenshot: port 8088]

(2). View a table's description

  > desc student;
  > desc extended student;
  > desc formatted student;

After execution:
[Screenshot: desc output]

(3). Using built-in functions

  > show functions;
  > desc function upper;
  > desc function extended upper;
  > select name, upper(name) upper_name from student;

After execution:
[Screenshot: using the upper function]

6.2 Inspect Hive metadata in MySQL

The Hive metadata layout:
[Diagram: Hive metadata tables]

  > use metastore_db;
  > select * from TBLS;
  > select * from COLUMNS_V2;
  > select * from DBS;

6.3 Common Hive CLI options

(1). View command help

  $ bin/hive -help
  usage: hive
   -d,--define <key=value>          Variable subsitution to apply to hive
                                    commands. e.g. -d A=B or --define A=B
      --database <databasename>     Specify the database to use
   -e <quoted-query-string>         SQL from command line
   -f <filename>                    SQL from files
   -H,--help                        Print help information
   -h <hostname>                    connecting to Hive Server on remote host
      --hiveconf <property=value>   Use value for given property
      --hivevar <key=value>         Variable subsitution to apply to hive
                                    commands. e.g. --hivevar A=B
   -i <filename>                    Initialization SQL file
   -p <port>                        connecting to Hive Server on port number
   -S,--silent                      Silent mode in interactive shell
   -v,--verbose                     Verbose mode (echo executed SQL to the
                                    console)

(2). Open a specific database

  $ bin/hive --database db_hive

[Screenshot: opening a specific database]

(3). Query a table in a specific database directly

  $ bin/hive -e "select * from db_hive.student ;"

[Screenshot: querying the table]

(4). Run a Hive script

  $ bin/hive -f stu.sql
  -- contents of stu.sql:
  use db_hive ;
  select * from student ;

[Screenshot: running the SQL script]
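The script file can also be generated non-interactively with a heredoc. A sketch (running the script still requires a Hive installation):

```shell
# Write the HiveQL script used above without opening an editor.
cat > stu.sql <<'EOF'
use db_hive ;
select * from student ;
EOF
cat stu.sql
# Run it with:  bin/hive -f stu.sql
```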

(5). Redirect the output

  $ bin/hive -f stu.sql > /opt/datas/hivef-res.txt

[Screenshot: output redirected to a file]

6.4 Common interactive commands

  > quit; / exit;       -- leave the shell
  > set key=value;      -- set a variable (note: a misspelled variable name does not raise an error)
  > set; / set -v;      -- show variables and configuration
  > ! <command>;        -- run a local shell command from the Hive shell
  > dfs <dfs command>;  -- run an HDFS command from the Hive shell
  > <query string>;     -- run a Hive query and print the result to stdout

Fixing a metastore connection error

If hive-site.xml contains:

  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://miaodonghua1.host:9083</value>
  </property>

then the metastore service must be started:

  $ nohup hive --service metastore > /data/hive.out 2> /data/hive.log &

With this property configured but the metastore service not running, Hive fails with the following error:

  Logging initialized using configuration in file:/opt/cdh2.3.6/hive-0.13.1-cdh5.3.6/conf/hive-log4j.properties
  Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:371)
  at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
  at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
  Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
  at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1426)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:63)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:73)
  at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2625)
  at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2644)
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:365)
  ... 7 more
  Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1424)
  ... 12 more
  Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
  at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:351)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:219)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1424)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:63)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:73)
  at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2625)
  at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2644)
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:365)
  at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
  at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
  Caused by: java.net.ConnectException: Connection refused
  at java.net.PlainSocketImpl.socketConnect(Native Method)
  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
  at java.net.Socket.connect(Socket.java:579)
  at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
  ... 19 more
  )
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:398)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:219)
  ... 17 more

The three Hive deployment modes
