Integrating Apache Aurora with DC/OS

Stack Overflow user
Asked 2016-11-08 17:33:54
Answers: 1 | Views: 557 | Followers: 0 | Votes: 0

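Only two Mesos frameworks support GPU resources: Marathon and Aurora. I want to launch batch jobs on Mesos agents using GPU resources, and of the two only Aurora supports that kind of job. However, DC/OS does not officially support Aurora yet. I tried to integrate them, without success: the DC/OS Mesos master does not register the Aurora framework, although Exhibitor creates a record for Aurora. I cannot find any record of Aurora in the Mesos logs. Here is my aurora-scheduler configuration: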

 #!/bin/bash

 GLOG_v=0
 LIBPROCESS_PORT=8083
 #LIBPROCESS_IP=127.0.0.1

 JAVA_HOME=/opt/mesosphere/active/java/usr/java

 JAVA_OPTS="-server -Djava.library.path='/opt/mesosphere/lib;/usr/lib;/usr/lib64'"

 PATH=$PATH:/opt/mesosphere/bin

 MESOS_NATIVE_JAVA_LIBRARY=/opt/mesosphere/lib/libmesos.so

 LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/mesosphere/lib

 JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/mesosphere/lib

 # Flags control the behavior of the Aurora scheduler.
 # For a full list of available flags, run /usr/lib/aurora/bin/aurora-scheduler -help
 AURORA_FLAGS=(
    # The name of this cluster.
   -cluster_name='My Cluster'

    # The HTTP port upon which Aurora will listen.
   -http_port=8088

    # The ZooKeeper URL of the ZNode where the Mesos master has registered.
    -mesos_master_address=zk://master_ip1:2181,master_ip2:2181,master_ip3:2181/mesos

    # The ZooKeeper quorum to which Aurora will register itself.
    -zk_endpoints=master_ip1:2181,master_ip1:2181,master_ip1:2181

    # The ZooKeeper ZNode within the specified quorum to which Aurora will register its
    # ServerSet, which keeps track of all live Aurora schedulers.
    -serverset_path='/aurora/scheduler'

    # Allows the scheduling of containers of the provided type.
    -allowed_container_types='DOCKER,MESOS'

    -allow_docker_parameters=true
    -allow_gpu_resource=true
    -executor_user=root
    ### Native Log Settings ###

    # The native log serves as a replicated database which stores the state of the
    # scheduler, allowing for multi-master operation.

    # Size of the quorum of Aurora schedulers which possess a native log.  If running in
    # multi-master mode, consult the following document to determine appropriate values:
    #
    # https://aurora.apache.org/documentation/latest/deploying-aurora-scheduler/#replicated-log-configuration
    -native_log_quorum_size=2
    # The ZooKeeper ZNode to which Aurora will register the locations of its replicated log.
    -native_log_zk_group_path='/aurora/replicated-log'
    # The local directory in which an Aurora scheduler can find Aurora's replicated log.
    -native_log_file_path='/var/lib/aurora/scheduler/db'
    # The local directory in which Aurora schedulers will place state backups.
    -backup_dir='/var/lib/aurora/scheduler/backups'

   ### Thermos Settings ###

   # The local path of the Thermos executor binary.
    -thermos_executor_path='/usr/bin/thermos_executor'
   # Flags to pass to the Thermos executor.
    -thermos_executor_flags='--announcer-ensemble 127.0.0.1:2181')
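
One detail in the configuration above may be worth a second look: `-mesos_master_address` lists three distinct hosts, while `-zk_endpoints` lists `master_ip1:2181` three times. Whether that is intentional cannot be determined from the post. A minimal sketch for spotting repeated entries in such a comma-separated list follows; the function name `find_duplicate_endpoints` is hypothetical and not part of Aurora or DC/OS.

```shell
#!/bin/bash
# Hypothetical helper (not an Aurora or DC/OS tool): prints every host:port
# entry that occurs more than once in a comma-separated endpoint list.
find_duplicate_endpoints() {
  # Split on commas, sort, and keep only the lines that repeat.
  printf '%s\n' "$1" | tr ',' '\n' | sort | uniq -d
}

# The value passed to -zk_endpoints in the configuration above.
find_duplicate_endpoints 'master_ip1:2181,master_ip1:2181,master_ip1:2181'
# prints: master_ip1:2181
```

An empty result means every endpoint in the list is distinct.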

1 Answer

Stack Overflow user

Answered 2016-11-17 19:40:12

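I managed to start the Aurora framework on DC/OS 1.8. Because Mesos and Java are both embedded in DC/OS with custom configuration, paths in particular, I had to isolate Aurora with Docker. You can find Docker images for the Aurora components in my Docker repo: Aurora scheduler (krot/aurora-scheduler) and Aurora executor (krot/aurora-executor). This also allows me or anyone else to create a Universe package.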

Steps to deploy the Aurora scheduler on DC/OS:

  1. Create the folder /var/lib/aurora on every DC/OS agent.
  2. Launch the Aurora executor on all DC/OS agents with the following JSON:

    {
      "id": "/aurora/aurora-executor",
      "env": {
        "MESOS_ROOT": "/var/lib/mesos/slave"
      },
      "instances": 20,
      "cpus": 1,
      "mem": 128,
      "disk": 0,
      "gpus": 0,
      "constraints": [["hostname", "UNIQUE"]],
      "container": {
        "docker": {
          "image": "krot/aurora-executor",
          "forcePullImage": true,
          "privileged": false,
          "network": "HOST"
        },
        "type": "DOCKER",
        "volumes": [
          {
            "containerPath": "/var/lib/mesos/slave",
            "hostPath": "/var/lib/mesos/hostPath",
            "mode": "RW"
          },
          {
            "containerPath": "/var/lib/aurora",
            "hostPath": "/var/lib/aurora",
            "mode": "RW"
          }
        ]
      }
    }

Note: "instances" is set to the number of agents.

  2a. Another way to deploy the Aurora executor (to be performed on every DC/OS agent):

    sudo yum install -y python2 wget
    wget -c https://apache.bintray.com/aurora/centos-7/aurora-executor-0.16.0-1.el7.centos.aurora.x86_64.rpm
    rpm -Uhv --nodeps sudo

Make an edit to add the --mesos-root flag, resulting in something like:

        "EXTRA_SCHEDULER_ARGS": "-allow_gpu_resource=true"
      },
      "instances": 3,
      "cpus": 1,
      "mem": 1024,
      "disk": 0,
      "gpus": 0,
      "constraints": [["hostname", "UNIQUE"]],
      "container": {
        "docker": {
          "image": "krot/aurora-scheduler",
          "forcePullImage": true,
          "privileged": false,
          "network": "HOST"
        },
        "type": "DOCKER",
        "volumes": [
          {
            "containerPath": "/var/lib/aurora",
            "hostPath": "/var/lib/aurora",
            "mode": "RW"
          }
        ]
      }
    }

Note: -allow_gpu_resource=true enables GPU support. The Aurora scheduler can be configured through environment variables. See the documentation for details.
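
Steps 1 and 2 could be scripted roughly as follows. This is only a sketch under stated assumptions: the agent host names in `AGENTS`, the file name `aurora-executor.json`, passwordless SSH to the agents, and the helper functions `run`, `prepare_agents`, and `deploy_app` are all hypothetical and do not come from the answer. `dcos marathon app add` is the DC/OS CLI command for submitting an app definition to Marathon. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/bash
# Sketch only. AGENTS and the JSON file name are placeholders, not values
# taken from the answer. Set DRY_RUN=0 to actually execute the commands.
AGENTS="${AGENTS:-agent1 agent2}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: create /var/lib/aurora on every agent (assumes SSH access and sudo).
prepare_agents() {
  for host in $AGENTS; do
    run ssh "$host" sudo mkdir -p /var/lib/aurora
  done
}

# Step 2: submit an app definition (for example the executor JSON) to Marathon.
deploy_app() {
  run dcos marathon app add "$1"
}

prepare_agents
deploy_app aurora-executor.json
```

Running it with `DRY_RUN=0 AGENTS="host-a host-b" ./deploy.sh` (script name assumed) would execute the commands instead of printing them. The same `deploy_app` call could be reused for the scheduler JSON.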

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/40483367
