Cannot start NodeManager after re-adding the host with Ambari

Stack Overflow user
Asked 2014-07-10 11:55:17
Answers 1 · Views 3.7K · Followers 0 · Votes 1

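I removed my host and then tried to add it again. The DataNode works fine, but I cannot get the NodeManager to work. After removing the host, I uninstalled the hadoop-yarn package with yum and then reinstalled it with Ambari. Now, when I try to start the NodeManager from Ambari, I get the following error: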

2014-05-23 19:40:41,507 - Execute['export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop-yarn/sbin/yarn-daemon.sh --config /etc/hadoop/conf start nodemanager'] {'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1', 'user': 'yarn'}
2014-05-23 19:40:42,570 - Execute['ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1'] {'initial_wait': 5, 'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1', 'user': 'yarn'}
2014-05-23 19:40:47,621 - Error while executing command 'start':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 112, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/nodemanager.py", line 42, in start
    action='start'
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/service.py", line 51, in service
    initial_wait=5
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
    raise ex
Fail: Execution of 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid` >/dev/null 2>&1' returned 1.

So I do not really understand what the problem is. If I try to start it manually with the yarn command, I get the following error:

14/07/10 13:44:48 FATAL nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from  r3888, Sending SHUTDOWN signal to the NodeManager.
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:197)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:358)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from  r3888, Sending SHUTDOWN signal to the NodeManager.
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:265)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:190)
        ... 6 more
14/07/10 13:44:48 INFO nodemanager.NodeManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at r3888

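Has anyone run into similar problems when removing and re-adding a namenode on a host with Ambari? I would like to avoid setting up the host completely from scratch.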


1 Answer

Stack Overflow user

Answered 2014-10-06 14:34:03

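We actually ran into the same situation: a hostname that the cluster had known before could not be reused. It turned out that the node was explicitly listed in the exclude list on the YARN ResourceManager, so:

  • Remove the name you want to reuse from /etc/hadoop/conf/yarn.exclude
  • Run yarn rmadmin -refreshNodes so that YARN re-reads this configuration file.

This did the trick in our case: the NodeManagers started fine and re-registered cleanly.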
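The two steps above could be scripted roughly as follows. This is only a sketch, not something the original answer provided: the function names are hypothetical, it assumes the exclude file holds one hostname per line, and it assumes it runs on the ResourceManager host as a user allowed to edit the file and call `yarn rmadmin`. The hostname `r3888` is taken from the log in the question.

```python
import subprocess
from pathlib import Path


def remove_host_from_exclude(text: str, host: str) -> str:
    """Return the exclude-file content without any line naming `host`.

    Assumes one hostname per line (an assumption about the file layout).
    """
    kept = [line for line in text.splitlines() if line.strip() != host]
    return "\n".join(kept) + ("\n" if kept else "")


def readd_node(host: str,
               exclude_file: str = "/etc/hadoop/conf/yarn.exclude") -> None:
    """Drop `host` from the exclude file, then ask YARN to re-read it."""
    path = Path(exclude_file)
    path.write_text(remove_host_from_exclude(path.read_text(), host))
    # Same command as in the answer; raises CalledProcessError on failure.
    subprocess.run(["yarn", "rmadmin", "-refreshNodes"], check=True)


if __name__ == "__main__":
    # Pure text transformation only; nothing on disk is touched here.
    print(remove_host_from_exclude("r1234\nr3888\nr5678\n", "r3888"), end="")
```

Calling `readd_node("r3888")` would perform both steps. It may be worth backing up the exclude file first, since the sketch overwrites it in place.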

Votes 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/24675894
