This article summarizes the 尚硅谷 Hadoop 3.x video course and is intended for study and discussion only. The video it follows is: 30_尚硅谷_Hadoop_入门_集群配置_哔哩哔哩_bilibili
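Overall approach to cluster configuration
1. Change to /opt/module/hadoop-3.3.4/etc/hadoop, configure core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml, then distribute the hadoop folder
Part 1. Cluster configuration
1. Change to the configuration directory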
[atguigu@hadoop102 hadoop]$ pwd
/opt/module/hadoop-3.3.4/etc/hadoop
2. Edit the core configuration files
Configure core-site.xml
[atguigu@hadoop102 hadoop]$ vim core-site.xml
<configuration>
<!-- Address of the NameNode -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop102:8020</value>
</property>
<!-- Directory where Hadoop stores its data -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-3.3.4/data</value>
</property>
</configuration>
Configure hdfs-site.xml
[atguigu@hadoop102 hadoop]$ vim hdfs-site.xml
<configuration>
<!-- NameNode web UI address -->
<property>
<name>dfs.namenode.http-address</name>
<value>hadoop102:9870</value>
</property>
<!-- SecondaryNameNode web UI address -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop104:9868</value>
</property>
</configuration>
Configure yarn-site.xml
[atguigu@hadoop102 hadoop]$ vim yarn-site.xml
<configuration>
<!-- Have MapReduce use the shuffle auxiliary service -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Host of the ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop103</value>
</property>
<!-- Environment variables that containers inherit -->
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
Configure mapred-site.xml
[atguigu@hadoop102 hadoop]$ vim mapred-site.xml
<configuration>
<!-- Run MapReduce jobs on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
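Before distributing the files, it can help to read the values back and confirm that the edits were saved as intended. The following is only a minimal sketch: get_prop is a hypothetical helper, not a Hadoop command, and it assumes each <name> element is immediately followed by its <value> element, as in the files above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): print the <value> of one property
# from a Hadoop *-site.xml file.
# Assumption: <name> is immediately followed by <value>, as in the files above.
get_prop() {
  local file="$1" name
  # Escape dots so that e.g. "fs.defaultFS" is matched literally by grep.
  name=$(printf '%s' "$2" | sed 's/\./\\./g')
  # Join all lines first so name/value pairs split across lines still match.
  tr -d '\n' < "$file" \
    | grep -o "<name>${name}</name>[[:space:]]*<value>[^<]*</value>" \
    | sed -e 's/.*<value>//' -e 's/<\/value>.*//'
}

# Example (run inside /opt/module/hadoop-3.3.4/etc/hadoop):
#   get_prop core-site.xml fs.defaultFS
#   get_prop yarn-site.xml yarn.resourcemanager.hostname
```

The helper prints nothing when the property is absent, which makes a missing or misspelled name easy to spot.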
3. Distribute the configuration files
[atguigu@hadoop102 hadoop]$ cd ..
[atguigu@hadoop102 etc]$ xsync hadoop/
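xsync is not a Hadoop command. It appears to be a user-written distribution script, and its contents are not shown in this article. As a rough idea of what such a script might do, here is a sketch built on rsync. The function name sync_dir, the HOSTS list, and the DRY_RUN switch are all assumptions made for illustration, and the sketch presumes passwordless SSH and identical paths on every node.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for xsync: print (or run) one rsync per target host.
# Assumes passwordless SSH and the same absolute path on every node.
HOSTS="${HOSTS:-hadoop103 hadoop104}"   # assumption: hadoop102 is the source node
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands

sync_dir() {
  local src parent host
  # Resolve the directory to an absolute path; fail if it does not exist.
  src=$(cd "$1" && pwd) || return 1
  parent=$(dirname "$src")
  for host in $HOSTS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "rsync -av $src $host:$parent"
    else
      # Create the parent directory remotely, then copy the directory into it.
      ssh "$host" "mkdir -p '$parent'" && rsync -av "$src" "$host:$parent"
    fi
  done
}

# Example (from /opt/module/hadoop-3.3.4/etc):
#   sync_dir hadoop/              # prints the rsync commands
#   DRY_RUN=0 sync_dir hadoop/    # actually copies
```

Defaulting to a dry run is a deliberate choice here: the printed commands can be reviewed before anything is overwritten on the other nodes.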
Part 2. Start the whole cluster and test it
1. Configure workers
Change to the configuration directory
[atguigu@hadoop102 hadoop]$ pwd
/opt/module/hadoop-3.3.4/etc/hadoop
Configure workers
[atguigu@hadoop102 hadoop]$ vim workers
hadoop102
hadoop103
hadoop104
Note: entries in this file must not have trailing spaces, and the file must not contain blank lines.
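Trailing spaces and blank lines are hard to see in an editor, so a quick mechanical check can be useful. The function below is a hypothetical helper, not part of Hadoop, sketched with plain grep.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the file has no blank lines and no
# whitespace at the end of any line. Offending lines are printed with numbers.
check_workers() {
  local file="$1"
  if grep -n '[[:space:]]$' "$file"; then
    echo "trailing whitespace found in $file" >&2
    return 1
  fi
  if grep -n '^$' "$file"; then
    echo "blank line found in $file" >&2
    return 1
  fi
  return 0
}

# Example (run inside /opt/module/hadoop-3.3.4/etc/hadoop):
#   check_workers workers && echo "workers file looks clean"
```

Lines that contain only spaces are reported by the first check, since they also end in whitespace.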
Distribute the workers file
[atguigu@hadoop102 hadoop]$ xsync workers
2. Start the cluster
Return to the Hadoop root directory
[atguigu@hadoop102 hadoop-3.3.4]$ pwd
/opt/module/hadoop-3.3.4
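Format the NameNode (only before the first start of the cluster; formatting again generates a new cluster ID that no longer matches the existing DataNodes)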
[atguigu@hadoop102 hadoop-3.3.4]$ hdfs namenode -format
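Because formatting an already formatted NameNode discards its metadata, one might guard the command with a small check. The sketch below is an illustration only: needs_format and HADOOP_DATA_DIR are hypothetical names, and it assumes the NameNode directory sits at its default location under hadoop.tmp.dir (dfs/name), that is, dfs.namenode.name.dir has not been overridden.

```shell
#!/usr/bin/env bash
# Hypothetical guard: report whether the NameNode directory still needs formatting.
# Assumption: default layout <hadoop.tmp.dir>/dfs/name (dfs.namenode.name.dir not overridden).
HADOOP_DATA_DIR="${HADOOP_DATA_DIR:-/opt/module/hadoop-3.3.4/data}"

needs_format() {
  # A formatted NameNode has a VERSION file under dfs/name/current.
  [ ! -f "$1/dfs/name/current/VERSION" ]
}

# Example (on hadoop102):
#   needs_format "$HADOOP_DATA_DIR" && hdfs namenode -format
```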
Start HDFS
[atguigu@hadoop102 hadoop-3.3.4]$ sbin/start-dfs.sh
Starting namenodes on [hadoop102]
Starting datanodes
hadoop104: WARNING: /opt/module/hadoop-3.3.4/logs does not exist. Creating.
hadoop103: WARNING: /opt/module/hadoop-3.3.4/logs does not exist. Creating.
Starting secondary namenodes [hadoop104]
# List the running services
[atguigu@hadoop102 hadoop-3.3.4]$ jps
3990 DataNode
3832 NameNode
4219 Jps
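With the configuration above, hadoop102 should show NameNode and DataNode, hadoop103 should show DataNode, and hadoop104 should show SecondaryNameNode and DataNode. Comparing jps output by eye gets tedious across three nodes, so here is a small sketch that reports missing daemons. missing_daemons is a hypothetical helper, not a Hadoop or JDK tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `jps` output on stdin and report which of the
# expected daemon names (given as arguments) are missing.
missing_daemons() {
  local out want missing=""
  out=$(cat)
  for want in "$@"; do
    # jps prints "<pid> <ClassName>"; compare the whole second field so that
    # "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$out" | awk -v w="$want" '$2 == w { found = 1 } END { exit !found }'; then
      missing="$missing $want"
    fi
  done
  # Print nothing when every expected daemon is running.
  [ -z "$missing" ] || echo "missing:$missing"
}

# Example, run locally on each node:
#   jps | missing_daemons NameNode DataNode            # on hadoop102
#   jps | missing_daemons DataNode                     # on hadoop103
#   jps | missing_daemons SecondaryNameNode DataNode   # on hadoop104
```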
# View the HDFS NameNode in the web UI
(a) Enter in the browser: http://hadoop102:9870
(b) Browse the data stored on HDFS
Start YARN on the node configured as the ResourceManager (hadoop103)
[atguigu@hadoop103 hadoop-3.3.4]$ sbin/start-yarn.sh
# View the YARN ResourceManager in the web UI
(a) Enter in the browser: http://hadoop103:8088
(b) View the jobs running on YARN