首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >YARN时间线服务v2无法启动

YARN时间线服务v2无法启动
EN

Stack Overflow用户
提问于 2018-08-11 00:04:59
回答 1查看 6.1K关注 0票数 4

我在AWS上设置了一个测试HDP集群,用于评估项目。Ambari UI报告了一些错误,当我通过它们重启必要的服务时,我遇到了使用YARN的麻烦。启动YARN的时间轴服务阅读器V2时,出现错误

代码语言:javascript
复制
2018-08-10 15:51:06,400 INFO  [main] client.RpcRetryingCallerImpl: Call exception, tries=15, retries=15, started=129034 ms ago, cancelled=false, msg=Call to HOSTNAME/IPADDRESS:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: HOSTNAME/IPADDRESS:17020, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=HOSTNAME,17020,1533827052949, seqNum=-1

这最终导致了

代码语言:javascript
复制
stderr: 
Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 982, in restart
    self.status(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 88, in status
    check_process_status(pid_file)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/check_process_status.py", line 43, in check_process_status
    raise ComponentIsNotRunning()
ComponentIsNotRunning

The above exception was the cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 108, in <module>
    ApplicationTimelineReader().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
    method(env)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 993, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 51, in start
    hbase(action='start')
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 80, in hbase
    createTables()
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 147, in createTables
    logoutput=True)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 308, in _call
    raise ExecuteTimeoutException(err_msg)
resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarn-ats -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.0.0.0-1634/hadoop-yarn/timelineservice/*; /usr/hdp/3.0.0.0-1634/hbase/bin/hbase --config /usr/hdp/3.0.0.0-1634/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds

哪个组件需要重新启动才能使纱线恢复健康状态?将来调试该问题的正确方法是什么?

EN

回答 1

Stack Overflow用户

发布于 2019-02-27 08:51:09

如果您进入“后台操作”( Ambari UI中的齿轮图标),然后转到Timeline Service V2 start链接(您可能必须先单击运行Timeline服务的计算机才能到达),您的右上角应该有链接,上面写着“复制”和“打开”。这些将有望更详细地向您显示错误日志。

在我的例子中,时间轴服务V2无法启动,因为系统没有足够的内存。这是一个很小的VM集群,每台机器上只有2 2GB的RAM。我通过更详细的错误日志发现它给出了内存不足的错误,所以当我将VM内存提升到4 4GB时,它就能够运行了。我最好的猜测是你在运行Ambari UI的主NameNode上的内存不够。根据您在主NameNode上运行的服务的数量,似乎需要一些关于4GB+的东西。

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/51790281

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档