wbj0110

Verifying ResourceManager HA in Cloudera CDH5


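Introduction: The latest Cloudera CDH5.0.0 beta release already supports ResourceManager (RM) HA. I ran a simple functional verification of it; follow-up posts will look at how this HA implementation works and how it differs from the community RM HA.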

 

 

Cluster deployment and functional verification of RM failover

 

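  1. Hardware preparation

    Four machines, bj1, bj3, bj4, and bj5, each with the required environment in place (SSH access between all hosts and a Java runtime).

    Roles: bj1 is rm1, bj3 is rm2, and bj4 and bj5 are slaves.

    ZooKeeper is deployed on bj1.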

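  2. Hadoop release: download the matching CDH5 tarball, hadoop-2.2.0-cdh5.0.0-beta-1.tar.gz (it contains both the deployable binaries and the source code), from http://archive.cloudera.com/cdh5/cdh/5/ and deploy it to every machine in the cluster.
  3. Install ZooKeeper on bj1: take the latest ZooKeeper release (3.4.3 here), unpack it, create conf/zoo.cfg, and start the server.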

    [yuling.sh@v125050024 ~]$ cd zookeeper-3.4.3/

    [yuling.sh@v125050024 zookeeper-3.4.3]$ cp conf/zoo_sample.cfg conf/zoo.cfg

    [yuling.sh@v125050024 zookeeper-3.4.3]$ bin/zkServer.sh start
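Since the RM configuration later points at bj1:2181, it can help to confirm that ZooKeeper really answers on the port set in conf/zoo.cfg before going further. The following is a minimal sketch, not part of the original procedure: the helper names zk_client_port and zk_is_ok are hypothetical, and it assumes `nc` is installed on the host.

```shell
#!/bin/sh
# Sketch only: confirm the ZooKeeper server is answering on its client port.
# zk_client_port and zk_is_ok are hypothetical helper names; `nc` must be installed.
set -eu

# Read clientPort from a zoo.cfg file (zoo_sample.cfg ships with clientPort=2181).
zk_client_port() {
  sed -n 's/^clientPort=//p' "$1"
}

# Send the four-letter command "ruok"; a healthy server replies "imok".
zk_is_ok() {
  [ "$(echo ruok | nc "$1" "$2")" = "imok" ]
}

# On bj1, from the zookeeper-3.4.3 directory:
#   zk_is_ok bj1 "$(zk_client_port conf/zoo.cfg)" && echo "ZooKeeper is up"
```

`bin/zkServer.sh status` gives similar information locally; the `ruok` probe has the advantage that it can also be run from the RM hosts.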

  4. Prepare the configuration files, following https://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-High-Availability-Guide/cdh5hag_cfg_RM_HA.html.
    1. etc/hadoop/slaves 

      bj4 

      bj5 

    2. etc/hadoop/hdfs-site.xml (fs.default.name is the deprecated name of fs.defaultFS and is more commonly set in core-site.xml)

    <property> 

    <name>fs.default.name</name> 

    <value>hdfs://bj1:9000</value> 

    </property> 

    3. etc/hadoop/mapred-site.xml

      <property> 

      <name>mapreduce.framework.name</name> 

      <value>yarn</value> 

      </property> 

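    4. etc/hadoop/yarn-site.xml, configured as follows

    Apart from yarn.resourcemanager.ha.id, which must be adjusted on each RM host, all other settings can be identical on every machine.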

    <!-- Resource Manager Configs --> 

    <property> 

    <name>yarn.resourcemanager.connect.retry-interval.ms</name> 

    <value>2000</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.ha.enabled</name> 

    <value>true</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name> 

    <value>true</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.ha.rm-ids</name> 

    <value>rm1,rm2</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.ha.id</name> 

    <value>rm2</value> <!-- Note: set to rm1 on rm1 and to rm2 on rm2 -->

    </property> 

    <property> 

    <name>yarn.resourcemanager.store.class</name> 

    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.zk.state-store.address</name> 

    <value>bj1:2181</value> 

    </property> 



    <property> 

    <name>ha.zookeeper.quorum</name> 

    <value>bj1:2181</value> 

    </property> 



    <property> 

    <name>yarn.resourcemanager.recovery.enabled</name> 

    <value>true</value> 

    </property> 

    <property> 

    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name> 

    <value>5000</value> 

    </property> 

    <!-- RM1 configs --> 

    <property> 

    <name>yarn.resourcemanager.address.rm1</name> 

    <value>bj1:23140</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.scheduler.address.rm1</name> 

    <value>bj1:23130</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.webapp.address.rm1</name> 

    <value>bj1:23188</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.resource-tracker.address.rm1</name> 

    <value>bj1:23125</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.admin.address.rm1</name> 

    <value>bj1:23141</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.ha.admin.address.rm1</name> 

    <value>bj1:23142</value> 

    </property> 

    <!-- RM2 configs --> 

    <property> 

    <name>yarn.resourcemanager.address.rm2</name> 

    <value>bj3:23140</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.scheduler.address.rm2</name> 

    <value>bj3:23130</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.webapp.address.rm2</name> 

    <value>bj3:23188</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.resource-tracker.address.rm2</name> 

    <value>bj3:23125</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.admin.address.rm2</name> 

    <value>bj3:23141</value> 

    </property> 

    <property> 

    <name>yarn.resourcemanager.ha.admin.address.rm2</name> 

    <value>bj3:23142</value> 

    </property> 

    <!-- Node Manager Configs --> 

    <property> 

    <description>Address where the localizer IPC is.</description> 

    <name>yarn.nodemanager.localizer.address</name> 

    <value>0.0.0.0:23344</value> 

    </property> 

    <property> 

    <description>NM Webapp address.</description> 

    <name>yarn.nodemanager.webapp.address</name> 

    <value>0.0.0.0:23999</value> 

    </property> 

    <property> 

    <name>yarn.nodemanager.aux-services</name> 

    <value>mapreduce_shuffle</value> 

    </property> 

    <property> 

    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

    <value>org.apache.hadoop.mapred.ShuffleHandler</value> 

    </property> 

    <property> 

    <name>yarn.nodemanager.local-dirs</name> 

    <value>/tmp/pseudo-dist/yarn/local</value> 

    </property> 

    <property> 

    <name>yarn.nodemanager.log-dirs</name> 

    <value>/tmp/pseudo-dist/yarn/log</value> 

    </property> 

    <property> 

    <name>mapreduce.shuffle.port</name> 

    <value>23080</value> 

    </property> 
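Because only yarn.resourcemanager.ha.id differs between the two RM hosts, one way to avoid hand-editing is to keep a single template and stamp the id in per host. The following is a minimal sketch under stated assumptions: the @RM_ID@ placeholder and the render_yarn_site helper are hypothetical and not part of CDH, and the tiny template only stands in for the full yarn-site.xml shown above.

```shell
#!/bin/sh
# Sketch only: render a per-host yarn-site.xml from one shared template.
# The @RM_ID@ placeholder and the function name are illustrative assumptions.
set -eu

# render_yarn_site TEMPLATE RM_ID OUTPUT
render_yarn_site() {
  # Replace every occurrence of the placeholder with the given RM id.
  sed "s/@RM_ID@/$2/g" "$1" > "$3"
}

# Demo with a tiny stand-in template; a real one would hold the full configuration.
workdir=$(mktemp -d)
cat > "$workdir/yarn-site.xml.tmpl" <<'EOF'
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>@RM_ID@</value>
</property>
EOF

render_yarn_site "$workdir/yarn-site.xml.tmpl" rm1 "$workdir/yarn-site.rm1.xml"
render_yarn_site "$workdir/yarn-site.xml.tmpl" rm2 "$workdir/yarn-site.rm2.xml"
grep '<value>' "$workdir/yarn-site.rm1.xml"
grep '<value>' "$workdir/yarn-site.rm2.xml"
```

The rm1 file would then be installed as etc/hadoop/yarn-site.xml on bj1 and the rm2 file on bj3.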

  5. Start HDFS first

    bin/hadoop namenode -format

    sbin/start-dfs.sh

    Check the NameNode web UI: http://bj1:50070/dfshealth.jsp

 

  6. Start YARN

    Start the ResourceManager on rm1:

    sbin/yarn-daemon.sh start resourcemanager

    Start the ResourceManager on rm2:

    sbin/yarn-daemon.sh start resourcemanager

    Start the NodeManagers on the slaves:

    sbin/yarn-daemons.sh start nodemanager

    Open the web pages of rm1 and rm2, http://bj1:23188/cluster and http://bj3:23188/cluster. The active RM's page is viewable; the standby RM's page is not.

    Note: if yarn.resourcemanager.ha.automatic-failover.enabled is set to false, one of the RMs must be switched to active manually; otherwise both RMs stay in standby.
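For the manual case, the switch is normally done through the RM admin CLI. The following is a rough sketch, assuming that `bin/yarn rmadmin` in this build supports the HA subcommands -transitionToActive and -getServiceState (worth confirming with `bin/yarn rmadmin -help`). The DRY_RUN switch and the helper names are hypothetical additions so that the script prints the commands by default instead of touching a cluster.

```shell
#!/bin/sh
# Sketch only: manually promote one RM when automatic failover is disabled.
# Assumes `bin/yarn rmadmin` supports -transitionToActive and -getServiceState
# in this build; DRY_RUN, run, and promote_rm are illustrative names.
set -eu

# DRY_RUN=1 (the default here) only prints the commands; use DRY_RUN=0 on a real cluster.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# promote_rm RM_ID: make the given RM active, then report its state.
promote_rm() {
  run bin/yarn rmadmin -transitionToActive "$1"
  run bin/yarn rmadmin -getServiceState "$1"
}

promote_rm rm1
```

Run from the Hadoop installation directory on one of the RM hosts; the ids rm1 and rm2 are the ones listed in yarn.resourcemanager.ha.rm-ids.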

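  7. Submit a sleep job as a test

    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-cdh5.0.0-beta-1.jar sleep -m 1000

    The job's progress can then be followed on the web page.

  8. While the job is running, kill the active RM process. Opening the standby RM's web page at that point shows that the job submitted earlier keeps running.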

    [yuling.sh@v125050024 hadoop-2.2.0-cdh5.0.0-beta-1]$ jps

    31333 ResourceManager 

    31671 Jps 

    29502 NameNode 

    25375 QuorumPeerMain 

    [yuling.sh@v125050024 hadoop-2.2.0-cdh5.0.0-beta-1]$ kill 31333
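To observe the takeover without refreshing a browser, the former standby's web page can be polled after the kill. The following is a minimal sketch, assuming (as observed in this test) that only the active RM serves /cluster and that `curl` is available. The helper name wait_for_rm_ui, the retry count, and the interval are illustrative choices, not measured failover times.

```shell
#!/bin/sh
# Sketch only: wait until an RM web UI answers, e.g. the former standby after failover.
# The helper name, retry count, and interval are illustrative assumptions.
set -eu

# wait_for_rm_ui URL [ATTEMPTS] [INTERVAL_SECONDS]
wait_for_rm_ui() {
  url=$1
  attempts=${2:-30}
  interval=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o /dev/null: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "no response after $attempts attempts: $url" >&2
  return 1
}

# On the cluster from this post, after killing the RM on bj1:
#   wait_for_rm_ui http://bj3:23188/cluster
```

A zero exit status means the page answered within the retry budget; the sleep job's status can then be checked on that page.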

 
