睿抗 (RAICOM) Big Data Study Notes
Hive 3 Installation (Linstars)
Manually specifying a repository mirror
After installing CentOS, configure the repository mirror first.
If the network is working but the official mirror site is unavailable, you can temporarily switch to another mirror source (e.g., Aliyun or Tsinghua):
Back up the existing repo configuration
```shell
mkdir /etc/yum.repos.d/backup
mv /etc/yum.repos.d/CentOS-*.repo /etc/yum.repos.d/backup/
```
Download the Aliyun mirror configuration
```shell
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
```
Clean and rebuild the YUM cache
```shell
yum clean all && yum makecache
```
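As a quick, optional sanity check, you can list the enabled repos to confirm the Aliyun mirror is active:

```shell
yum repolist enabled
```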
MySQL Installation
Remove the MariaDB that ships with CentOS 7
```shell
# Check whether MariaDB is installed
[root@node01 ~]# rpm -qa | grep mariadb
mariadb-libs-5.5.68-1.el7.x86_64

# Remove it, ignoring dependencies
[root@node01 ~]# rpm -e mariadb-libs-5.5.68-1.el7.x86_64 --nodeps

# Confirm it has been removed
[root@node01 ~]# rpm -qa | grep mariadb
[root@node01 ~]#    # no output means the removal is complete
```
Install MySQL
```shell
mkdir -p /export/software/mysql
# Upload the MySQL bundle into the directory above (the mysql folder), then extract it
[root@node01 mysql]# tar -xvf mysql-5.7.29-1.el7.x86_64.rpm-bundle.tar

# Install the libaio dependency
[root@node01 mysql]# yum -y install libaio
已加载插件:fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
软件包 libaio-0.3.109-13.el7.x86_64 已安装并且是最新版本
无须任何处理

# Install the four MySQL community packages
[root@node01 mysql]# rpm -ivh mysql-community-common-5.7.29-1.el7.x86_64.rpm \
    mysql-community-libs-5.7.29-1.el7.x86_64.rpm \
    mysql-community-client-5.7.29-1.el7.x86_64.rpm \
    mysql-community-server-5.7.29-1.el7.x86_64.rpm
警告:mysql-community-common-5.7.29-1.el7.x86_64.rpm: 头V3 DSA/SHA1 Signature, 密钥 ID 5072e1f5: NOKEY
准备中...                          ################################# [100%]
正在升级/安装...
   1:mysql-community-common-5.7.29-1.e################################# [ 25%]
   2:mysql-community-libs-5.7.29-1.el7################################# [ 50%]
   3:mysql-community-client-5.7.29-1.e################################# [ 75%]
   4:mysql-community-server-5.7.29-1.e################################# [100%]
```
Only the four MySQL packages above need to be installed.
MySQL initialization
```shell
# Initialize the data directory
mysqld --initialize

# Change ownership to the mysql user and group
chown mysql:mysql /var/lib/mysql -R

# Start MySQL
systemctl start mysqld.service

# Look up the generated temporary root password
cat /var/log/mysqld.log
```
Change the root password, allow remote access, and enable start on boot
```sql
# Update the root password, setting it to 123456
ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
```
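The heading above also mentions remote access and start on boot, but the original notes only show the password change. A minimal sketch of the remaining two steps, assuming the 123456 password set above (MySQL 5.7 syntax; adjust to your own credentials):

```shell
# Allow root to connect from any host (run from the shell via the mysql client)
mysql -uroot -p123456 -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION; FLUSH PRIVILEGES;"

# Enable the MySQL service to start on boot
systemctl enable mysqld
```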
Hive Installation
Upload the Apache Hive tarball and extract it (directory /export/server)
```shell
[root@node01 server]# tar -zxvf apache-hive-3.1.2-bin.tar.gz
[root@node01 server]# ls
apache-hive-3.1.2-bin  apache-hive-3.1.2-bin.tar.gz
[root@node01 server]#
```
Upload the hadoop-3.1.3 tarball and extract it (directory /export/server)
```shell
tar -zxvf hadoop-3.1.3.tar.gz
```
After both tarballs are extracted, list the current files; there should be two directories:
```shell
[root@node01 server]# ls
apache-hive-3.1.2-bin  apache-hive-3.1.2-bin.tar.gz  hadoop-3.1.3  hadoop-3.1.3.tar.gz
```
Resolve the guava version mismatch between Hive and Hadoop
```shell
[root@node01 server]# cd /export/server/apache-hive-3.1.2-bin/
[root@node01 apache-hive-3.1.2-bin]# rm -rf lib/guava-19.0.jar    # remove Hive's bundled guava jar
[root@node01 apache-hive-3.1.2-bin]# cp /export/server/hadoop-3.1.3/share/hadoop/common/lib/guava-27.0-jre.jar ./lib/
```
Edit the configuration files
vim hive-env.sh
```shell
[root@node01 apache-hive-3.1.2-bin]# cd /export/server/apache-hive-3.1.2-bin/conf/
[root@node01 conf]# mv hive-env.sh.template hive-env.sh
[root@node01 conf]# vim hive-env.sh

# Add the following to hive-env.sh:
export HADOOP_HOME=/export/server/hadoop-3.1.3
export HIVE_CONF_DIR=/export/server/apache-hive-3.1.2-bin/conf
export HIVE_AUX_JARS_PATH=/export/server/apache-hive-3.1.2-bin/lib
```
hive-site.xml
```xml
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node01:3306/hive3?createDatabaseIfNotExist=true&amp;useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>

  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>node01</value>
  </property>

  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://node01:9083</value>
  </property>

  <property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
</configuration>
```
- Upload the MySQL JDBC driver into the Hive installation's lib directory (apache-hive-3.1.2-bin/lib):

```shell
mysql-connector-java-5.1.32.jar
```
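A minimal sketch of copying the driver into place, assuming the jar was uploaded to /export/software/mysql (adjust the source path to wherever you actually uploaded it):

```shell
cp /export/software/mysql/mysql-connector-java-5.1.32.jar /export/server/apache-hive-3.1.2-bin/lib/
```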
Upload the JDK to /export/server
Extract jdk-8u211-linux-x64.tar.gz into the /usr/local/src directory
```shell
tar -zxvf ./jdk-8u211-linux-x64.tar.gz -C /usr/local/src
mv /usr/local/src/jdk1.8.0_211 /usr/local/src/jdk
```
Edit the /etc/profile configuration file
```shell
vim /etc/profile

# Add the following:
export JAVA_HOME=/usr/local/src/jdk
export PATH=$JAVA_HOME/bin:$PATH

# Reload the configuration
source /etc/profile
```
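A quick check that the JDK is now on the PATH (the reported version should match the 1.8.0_211 package installed above):

```shell
java -version
```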
Initialize the metastore schema
```shell
cd /export/server/apache-hive-3.1.2-bin/
# Run the schema initialization below; a working JDK environment is required
bin/schematool -initSchema -dbType mysql -verbose
# If the JDK is configured but still not detected, force it with:
JAVA_HOME=/usr/local/src/jdk /export/server/apache-hive-3.1.2-bin/bin/schematool -initSchema -dbType mysql -verbose
# When initialization finishes, 74 tables are created in MySQL
```
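To confirm the initialization worked, you can count the tables in the hive3 database (a quick check, assuming the root/123456 credentials configured earlier; expect 74):

```shell
mysql -uroot -p123456 -N -e "USE hive3; SHOW TABLES;" | wc -l
```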
PS: Only when I reached the next step, which runs hadoop commands, did I notice that the video tutorial never covered installing Hadoop. Without it the later steps cannot proceed, so I am inserting the installation steps here. I am working forward in the order I learned things; the video is fairly disorganized, so I go look things up as I need them.
First, upload the Hadoop package to /export/server
```shell
# Extract
tar -zxvf hadoop-3.1.3.tar.gz -C /usr/local/src
# Configure the environment
vim /etc/profile
# Add the following:
export HADOOP_HOME=/usr/local/src/hadoop-3.1.3
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# Reload the configuration
source /etc/profile
```
Configure hadoop-env.sh
```shell
# Add the following at the end of the file
export JAVA_HOME=/usr/local/src/jdk
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root

export HADOOP_PID_DIR=/data/hadoop/pids
export HADOOP_LOG_DIR=/data/hadoop/logs
```
Edit core-site.xml with vim
Insert the following inside the <configuration> tag:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://node01:8020</value>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/export/data/hadoop-3.3.0</value>
</property>

<property>
  <name>hadoop.http.staticuser.user</name>
  <value>root</value>
</property>

<!-- allow the root proxy user from all hosts/groups; * is the usual value here -->
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>

<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>

<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
```
vim hdfs-site.xml
```xml
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>node02:9868</value>
</property>
```
vim mapred-site.xml
```xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<property>
  <name>mapreduce.jobhistory.address</name>
  <value>node01:10020</value>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>node01:19888</value>
</property>

<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>

<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>

<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
```
vim yarn-site.xml
```xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>node01</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

<property>
  <name>yarn.log.server.url</name>
  <value>http://node01:19888/jobhistory/logs</value>
</property>

<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
```
Edit the workers file with vim
```shell
node01
node02
node03
```
Distribute the Hadoop installation to the other nodes
```shell
cd /export/server

scp -r hadoop-3.1.3 root@node02:$PWD
scp -r hadoop-3.1.3 root@node03:$PWD
```
Configure /etc/profile; node02 and node03 need this as well
```shell
# Add the following:
export HADOOP_HOME=/usr/local/src/hadoop-3.1.3
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# Reload the configuration file
source /etc/profile
```
Starting the Hadoop cluster (node01)
(First start only) Format the NameNode
One-command startup via the scripts (see the sketch below)
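The notes name these two steps without showing the commands; a minimal sketch using the standard Hadoop scripts, run on node01 (format the NameNode only once, since it wipes existing HDFS data):

```shell
# First start only: format the NameNode
hdfs namenode -format

# One-command startup of HDFS and YARN with the bundled scripts
start-dfs.sh
start-yarn.sh

# Check that the daemons are running
jps
```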
Create Hive's storage directories in HDFS (skip if they already exist)
```shell
hadoop fs -mkdir /tmp
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
```
1. Start the metastore service
```shell
# Foreground start (stop with Ctrl+C)
/export/server/apache-hive-3.1.2-bin/bin/hive --service metastore

# Foreground start with DEBUG logging enabled
/export/server/apache-hive-3.1.2-bin/bin/hive --service metastore --hiveconf hive.root.logger=DEBUG,console

# Background start (process detaches; stop it later with jps + kill -9)
nohup /export/server/apache-hive-3.1.2-bin/bin/hive --service metastore &
```
2. Start the hiveserver2 service
```shell
nohup /export/server/apache-hive-3.1.2-bin/bin/hive --service hiveserver2 &
```
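To check that HiveServer2 is accepting connections, a quick Beeline test (it can take a minute or two after startup before port 10000 is ready; connecting as root with no password is an assumption based on the setup above):

```shell
/export/server/apache-hive-3.1.2-bin/bin/beeline -u jdbc:hive2://node01:10000 -n root
```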
node01
After installing CentOS, the network card configuration needs to be changed from DHCP to static
```shell
vim /etc/sysconfig/network-scripts/ifcfg-ens33
```
Change BOOTPROTO="dhcp" to BOOTPROTO="static" and configure the IP manually.

The IP settings depend on your own virtual network adapter; check it under Edit → Virtual Network Editor in the top-left menu.


Configure the IP according to the VM's network adapter; I changed mine to 192.168.122.7, then rebooted with the command init 6 (a sample ifcfg file is sketched below).
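A minimal sketch of the static entries in ifcfg-ens33, assuming the 192.168.122.0/24 network used above; the gateway and DNS values are assumptions, so take the real ones from your Virtual Network Editor:

```shell
BOOTPROTO="static"
ONBOOT="yes"
IPADDR="192.168.122.7"       # this host's static address
NETMASK="255.255.255.0"
GATEWAY="192.168.122.2"      # assumed NAT gateway; check your Virtual Network Editor
DNS1="192.168.122.2"         # assumed DNS; adjust as needed
```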
Change the hostname
```shell
# Change the hostname
hostnamectl set-hostname node01 --static

# Hosts mapping
vim /etc/hosts
# Delete the existing contents and replace them with the following, adjusting to your actual IPs
192.168.122.7 node01
192.168.122.8 node02
192.168.122.9 node03
```
Passwordless SSH from node01 (only node01 needs to be configured)
```shell
# Generate a key pair on node01 (press Enter through all prompts)
ssh-keygen

# Configure passwordless login from node01 to node01, node02, and node03
ssh-copy-id node01
ssh-copy-id node02
ssh-copy-id node03

# Verify passwordless SSH
ssh node02

# Example output:
# [root@node01 ~]# ssh node03
# Last login: Wed Oct 15 20:00:24 2025 from 192.168.122.1
```
Permanently disable the firewall (firewalld)
```shell
sudo systemctl stop firewalld

sudo systemctl disable firewalld

# Confirm it is stopped and will not start on boot
sudo systemctl status firewalld | grep "Active"
sudo systemctl is-enabled firewalld
```