
Ambari unable to run custom hook for modifying user hive

Stack Overflow user
Asked 2019-11-25 23:27:22
1 answer, 1.5K views, 0 followers, 0 votes

I am trying to add a client node to a cluster via Ambari (v2.7.3.0) (HDP 3.1.0.0-78) and am seeing an odd error:

stderr: 
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 38, in <module>
    BeforeAnyHook().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 31, in hook
    setup_users()
  File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/shared_initialization.py", line 51, in setup_users
    fetch_nonlocal_groups = params.fetch_nonlocal_groups,
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/accounts.py", line 90, in action_create
    shell.checked_call(command, sudo=True)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
    self.save_component_version_to_structured_out(self.command_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
    stack_select_package_name = stack_select.get_package_name()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
    package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
    supported_packages = get_supported_packages()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
    raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select



 stdout:
2019-11-25 13:07:57,644 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=None -> 3.1
2019-11-25 13:07:57,651 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2019-11-25 13:07:57,652 - Group['livy'] {}
2019-11-25 13:07:57,654 - Group['spark'] {}
2019-11-25 13:07:57,654 - Group['ranger'] {}
2019-11-25 13:07:57,654 - Group['hdfs'] {}
2019-11-25 13:07:57,654 - Group['zeppelin'] {}
2019-11-25 13:07:57,655 - Group['hadoop'] {}
2019-11-25 13:07:57,655 - Group['users'] {}
2019-11-25 13:07:57,656 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,658 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:57,971 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
    self.save_component_version_to_structured_out(self.command_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
    stack_select_package_name = stack_select.get_package_name()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
    package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
    supported_packages = get_supported_packages()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
    raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select

Command failed after 1 tries

The problem appears to be

resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd

caused by

2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']

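This is further reinforced by the fact that manually adding the ambari 1.1.repo file and yum-installing hdp-select before adding the host to the cluster produces the same error message, only truncated to the portion of the stdout/stderr shown here.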

When running

[root@HW001 .ssh]# /usr/bin/hdp-select versions
3.1.0.0-78

from the Ambari server node, I can see that the command works.

Looking at what the hook script is trying to run or access, I see

[root@client001~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
-rw-r--r-- 1 root root 1.2K Nov 25 10:51 /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
[root@client001~]# ls -lha /var/lib/ambari-agent/data/command-632.json
-rw------- 1 root root 545K Nov 25 13:07 /var/lib/ambari-agent/data/command-632.json
[root@client001~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY
total 0
drwxr-xr-x 4 root root  34 Nov 25 10:51 .
drwxr-xr-x 8 root root 147 Nov 25 10:51 ..
drwxr-xr-x 2 root root  34 Nov 25 10:51 files
drwxr-xr-x 2 root root 188 Nov 25 10:51 scripts
[root@client001~]# ls -lha /var/lib/ambari-agent/data/structured-out-632.json
ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory
[root@client001~]# ls -lha /var/lib/ambari-agent/tmp
total 96K
drwxrwxrwt  3 root root 4.0K Nov 25 13:06 .
drwxr-xr-x 10 root root  267 Nov 25 10:50 ..
drwxr-xr-x  6 root root 4.0K Nov 25 13:06 ambari_commons
-rwx------  1 root root 1.4K Nov 25 13:06 ambari-sudo.sh
-rwxr-xr-x  1 root root 1.6K Nov 25 13:06 create-python-wrap.sh
-rwxr-xr-x  1 root root 1.6K Nov 25 10:50 os_check_type1574715018.py
-rwxr-xr-x  1 root root 1.6K Nov 25 11:12 os_check_type1574716360.py
-rwxr-xr-x  1 root root 1.6K Nov 25 11:29 os_check_type1574717391.py
-rwxr-xr-x  1 root root 1.6K Nov 25 13:06 os_check_type1574723161.py
-rwxr-xr-x  1 root root  16K Nov 25 10:50 setupAgent1574715020.py
-rwxr-xr-x  1 root root  16K Nov 25 11:12 setupAgent1574716361.py
-rwxr-xr-x  1 root root  16K Nov 25 11:29 setupAgent1574717392.py
-rwxr-xr-x  1 root root  16K Nov 25 13:06 setupAgent1574723163.py

Note the line ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory. I am not sure whether this is normal, though.

Does anyone know what could be causing this, or have any debugging tips from this point?

UPDATE 01: After adding some log print lines near the last line of the error trace, i.e. File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages, I print the return code and stdout:

2
ambari-python-wrap: can't open file '/usr/bin/hdp-select': [Errno 2] No such file or directory
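
A return code of 2 is what a Python interpreter itself returns when asked to run a script file that does not exist, which fits the "can't open file" message above. A minimal sketch of capturing an exit code and the combined output in the same spirit follows. The helper name run_and_report and the /nonexistent/hdp-select path are placeholders for illustration only and are not part of Ambari.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and return (exit_code, combined_output).

    stderr is merged into stdout so that interpreter errors such as
    "can't open file" end up in the same captured text.
    """
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # Hypothetical path: stands in for a selector script that is not installed.
    code, output = run_and_report([sys.executable, "/nonexistent/hdp-select", "packages"])
    print(code)
    print(output)
```

Run against a script path that is missing, this should print the interpreter's exit code followed by its error text, which makes it easier to tell "the tool is absent" apart from "the tool ran and failed".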

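So what is going on here? It expects hdp-select to already be there, yet Ambari complains if I install that binary manually beforehand. When I do install it manually (using the same repo file as the other existing cluster nodes), what I see is...
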
0
Packages:
  accumulo-client
  accumulo-gc
  accumulo-master
  accumulo-monitor
  accumulo-tablet
  accumulo-tracer
  atlas-client
  atlas-server
  beacon
  beacon-client
  beacon-server
  druid-broker
  druid-coordinator
  druid-historical
  druid-middlemanager
  druid-overlord
  druid-router
  druid-superset
  falcon-client
  falcon-server
  flume-server
  hadoop-client
  hadoop-hdfs-client
  hadoop-hdfs-datanode
  hadoop-hdfs-journalnode
  hadoop-hdfs-namenode
  hadoop-hdfs-nfs3
  hadoop-hdfs-portmap
  hadoop-hdfs-secondarynamenode
  hadoop-hdfs-zkfc
  hadoop-httpfs
  hadoop-mapreduce-client
  hadoop-mapreduce-historyserver
  hadoop-yarn-client
  hadoop-yarn-nodemanager
  hadoop-yarn-registrydns
  hadoop-yarn-resourcemanager
  hadoop-yarn-timelinereader
  hadoop-yarn-timelineserver
  hbase-client
  hbase-master
  hbase-regionserver
  hive-client
  hive-metastore
  hive-server2
  hive-server2-hive
  hive-server2-hive2
  hive-webhcat
  hive_warehouse_connector
  kafka-broker
  knox-server
  livy-client
  livy-server
  livy2-client
  livy2-server
  mahout-client
  oozie-client
  oozie-server
  phoenix-client
  phoenix-server
  pig-client
  ranger-admin
  ranger-kms
  ranger-tagsync
  ranger-usersync
  shc
  slider-client
  spark-atlas-connector
  spark-client
  spark-historyserver
  spark-schema-registry
  spark-thriftserver
  spark2-client
  spark2-historyserver
  spark2-thriftserver
  spark_llap
  sqoop-client
  sqoop-server
  storm-client
  storm-nimbus
  storm-slider-client
  storm-supervisor
  superset
  tez-client
  zeppelin-server
  zookeeper-client
  zookeeper-server
Aliases:
  accumulo-server
  all
  client
  hadoop-hdfs-server
  hadoop-mapreduce-server
  hadoop-yarn-server
  hive-server

Command failed after 1 tries

UPDATE 02: Printing some custom logging from File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 322 (printing the values of err_msg, code, out, err), i.e.

....
    312   if throw_on_failure and not code in returns:
    313     err_msg = Logger.filter_text("Execution of '{0}' returned {1}. {2}".format(command_alias, code, all_output))
    314
    315     #TODO remove
    316     print("\n----------\nMY LOGS\n----------\n")
    317     print(err_msg)
    318     print(code)
    319     print(out)
    320     print(err)
    321
    322     raise ExecutionFailed(err_msg, code, out, err)
    323
    324   # if separate stderr is enabled (by default it's redirected to out)
    325   if stderr == subprocess32.PIPE:
    326     return code, out, err
    327
    328   return code, out
....

I see

Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
6
usermod: user 'hive' does not exist in /etc/passwd

Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-816.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-816.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-26 10:25:46,928 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed

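So it appears to be failing to create the hive user (even though it seems to have had no problem creating the yarn-ats user just before that).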
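
One possible reading of the "Modifying user hive" log line, sketched under an assumption rather than taken from the Ambari source: if the hook picks between useradd and usermod by asking the system's name service whether the user exists, then a user that resolves through the name service but is absent from /etc/passwd would be routed to usermod, which then fails as shown. The function name choose_account_command and its parameters are hypothetical.

```python
import pwd


def choose_account_command(username, group="hadoop", lookup=pwd.getpwnam):
    """Return the command a create-or-modify step might run for a user.

    ASSUMPTION: the decision is driven by a name-service lookup.
    `lookup` must raise KeyError for unknown users, as pwd.getpwnam does.
    """
    try:
        lookup(username)
    except KeyError:
        # Not visible anywhere: create the account.
        return ["useradd", "-G", group, "-g", group, username]
    # Visible through the lookup: modify it, even if it is not a local account.
    return ["usermod", "-G", group, "-g", group, username]


if __name__ == "__main__":
    print(" ".join(choose_account_command("hive")))
```

Under this assumption, the difference between yarn-ats and hive would come down to where each name resolves from, not to anything specific to Hive.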


1 Answer

Stack Overflow user

Accepted answer

Posted 2019-11-26 21:18:54

After giving up and trying to create the hive user manually myself, I see

[root@airflowetl ~]# useradd -g hadoop -s /bin/bash hive
useradd: user 'hive' already exists
[root@airflowetl ~]# cat /etc/passwd | grep hive
<nothing>
[root@airflowetl ~]# id hive
uid=379022825(hive) gid=379000513(domain users) groups=379000513(domain users)

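The fact that this existing user has a uid like that and is not in the /etc/passwd file made me think there was an existing Active Directory user with the name hive (this client node syncs with AD via the installed SSSD). Checking our AD users, this turned out to be true.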
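
The check done by hand above (grep in /etc/passwd versus id) can be sketched in Python. This is an illustrative helper, not part of Ambari or SSSD, and the names local_user_names and classify_user are made up here. It relies on the standard-library pwd.getpwnam, which resolves users through the system's name service and so also sees directory-provided accounts.

```python
import pwd


def local_user_names(passwd_text):
    """Return the set of user names defined in passwd-format text."""
    names = set()
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The user name is the first colon-separated field.
        names.add(line.split(":", 1)[0])
    return names


def classify_user(name, passwd_text, lookup=pwd.getpwnam):
    """Classify a user as 'local', 'directory-only', or 'missing'.

    `lookup` must raise KeyError for unknown users, as pwd.getpwnam does.
    """
    if name in local_user_names(passwd_text):
        return "local"
    try:
        lookup(name)
    except KeyError:
        return "missing"
    # Resolvable by the name service but not listed in the local file.
    return "directory-only"


if __name__ == "__main__":
    with open("/etc/passwd") as handle:
        print(classify_user("hive", handle.read()))
```

On a host in the state described here, hive would be expected to come back as directory-only, which is the case where usermod cannot find the user in /etc/passwd even though id can resolve it.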

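Temporarily stopping the sync with AD (service sssd stop) before rerunning the client host add in Ambari fixed the problem for me. I stopped it entirely because I was not sure whether a server can be made to ignore AD syncing on an individual-user basis.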

0 votes
Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/59041580
