
Airflow on Docker Swarm doesn't start unless in debug mode

Stack Overflow user
Asked 2021-06-28 22:47:07
Answers: 1 · Views: 105 · Followers: 0 · Score: 1

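I am deploying Airflow 2.0.1 across multiple EC2 instances using Docker Swarm. On the AWS manager node, the webserver, the scheduler and three workers are running, with Redis as the message broker, the Celery executor, and Flower as the monitoring tool. There are also 2 worker nodes, each running one worker.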

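I have run into a problem with the scheduler. The default health check does not succeed even after 20 minutes, even though the health check is just a small ping to the webserver. The container stays in (health: starting) until the health checker kills the scheduler with SIGTERM (15).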
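
To see what the health check is actually looking at, it can help to read the body of the webserver's /health response rather than only its exit status. Below is a minimal sketch, assuming the JSON shape Airflow 2.x documents for that endpoint (a "metadatabase" and a "scheduler" entry, each with a "status"). The sample payload and the helper name `summarize_health` are made up for illustration; nothing here comes from the asker's logs.

```python
import json
from typing import Tuple

# Sample payload in the shape Airflow's webserver returns from /health.
# The statuses and timestamp are invented for illustration.
SAMPLE_HEALTH = """
{
  "metadatabase": {"status": "healthy"},
  "scheduler": {
    "status": "unhealthy",
    "latest_scheduler_heartbeat": "2021-06-28T20:40:00+00:00"
  }
}
"""


def summarize_health(payload: str) -> Tuple[bool, str]:
    """Return (all_healthy, summary) for a /health JSON body."""
    data = json.loads(payload)
    # A component that is absent from the payload is reported as "missing".
    statuses = {
        component: data.get(component, {}).get("status", "missing")
        for component in ("metadatabase", "scheduler")
    }
    all_healthy = all(status == "healthy" for status in statuses.values())
    summary = ", ".join(f"{name}={status}" for name, status in statuses.items())
    return all_healthy, summary


if __name__ == "__main__":
    ok, summary = summarize_health(SAMPLE_HEALTH)
    print(ok, summary)
```

Feeding it the real response body (for example the output of the curl command used in the webserver health check) would show whether the scheduler entry itself is reported as unhealthy.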

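This means the workers (which depend on the scheduler) fail one after another. All of this happens while the scheduler is actually working properly and doing its job, with tasks and DAGs being executed.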

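Strangely, the health check works if the environment variable AIRFLOW__LOGGING__LOGGING_LEVEL is set to DEBUG, but not if it is set to INFO. I came across this behaviour while trying to debug the problem.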
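
For readers unfamiliar with the naming scheme: Airflow treats environment variables of the form AIRFLOW__{SECTION}__{KEY} as configuration overrides, so the variable above sets logging_level in the [logging] section. The sketch below only illustrates that mapping and the numeric ordering of the two levels. The helper names are hypothetical, and it does not explain why the health check behaves differently.

```python
import logging
from typing import Optional, Tuple


def parse_airflow_env(name: str) -> Optional[Tuple[str, str]]:
    """Split AIRFLOW__SECTION__KEY into (section, key); None if it does not match."""
    parts = name.split("__", 2)
    if len(parts) != 3 or parts[0] != "AIRFLOW" or not parts[1] or not parts[2]:
        return None
    return parts[1].lower(), parts[2].lower()


def level_number(level_name: str) -> int:
    """Numeric value of a standard logging level name such as DEBUG or INFO."""
    value = logging.getLevelName(level_name.upper())
    # getLevelName returns a string like "Level FOO" for names it does not know.
    if not isinstance(value, int):
        raise ValueError(f"unknown log level: {level_name!r}")
    return value


if __name__ == "__main__":
    print(parse_airflow_env("AIRFLOW__LOGGING__LOGGING_LEVEL"))
    print(level_number("DEBUG"), level_number("INFO"))
```

DEBUG has the lower numeric threshold, so it emits everything INFO does plus more, which is why it fills the disk faster.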

This is very annoying, because debug logs take up a lot of disk space, and this is obviously not the behaviour we want.

My setup is as follows. airflow.env:

PYTHONPATH=/opt/airflow/
AIRFLOW_UID=1000
AIRFLOW_GID=0
AIRFLOW_HOME=/opt/airflow/
AIRFLOW__CORE__AIRFLOW_HOME=/opt/airflow/
AIRFLOW__CORE__DAGS_FOLDER=/opt/airflow/dags
AIRFLOW__CORE__ENABLE_XCOM_PICKLING=true
AIRFLOW__CORE__EXECUTOR=CeleryExecutor
AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
AIRFLOW__CORE__FERNET_KEY=################
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=true
AIRFLOW__CORE__LOAD_EXAMPLES=false
AIRFLOW__CORE__PLUGINS_FOLDER=/plugins/
AIRFLOW__CORE__PARALLELISM=128
AIRFLOW__CORE__DAG_CONCURRENCY=32
AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG=1
AIRFLOW__WEBSERVER__DAG_DEFAULT_VIEW=graph
AIRFLOW__WEBSERVER__LOG_FETCH_TIMEOUT_SEC=30
AIRFLOW__WEBSERVER__HIDE_PAUSED_DAGS_BY_DEFAULT=true
AIRFLOW__WEBSERVER__PAGE_SIZE=1000
AIRFLOW__WEBSERVER__NAVBAR_COLOR='#75eade'
AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT=false
AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG
CELERY_ACKS_LATE=true
CELERY_WORKER_MAX_TASKS_PER_CHILD=500
C_FORCE_ROOT=true
AIRFLOW__CORE__REMOTE_LOGGING=true
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER=s3://airflow-logs-docker/production_vm/
AIRFLOW__CORE__REMOTE_LOG_CONN_ID=aws_s3
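
One detail worth checking in a file like this: Airflow 2.0 moved the remote logging options from the [core] section to [logging]. As far as I know the old AIRFLOW__CORE__ names are still accepted in 2.0.x with a deprecation warning, so this is probably not the cause of the problem. The sketch below is a hypothetical helper (not part of Airflow) that lists such variables. It covers only the three options that appear in the file above, not the full list of moved options.

```python
from typing import Dict, List

# Options Airflow 2.0 moved from [core] to [logging]; limited to the ones
# that appear in the env file above, not the complete set.
MOVED_TO_LOGGING = {"remote_logging", "remote_base_log_folder", "remote_log_conn_id"}
CORE_PREFIX = "AIRFLOW__CORE__"


def parse_env_lines(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        result[key.strip()] = value
    return result


def suggest_renames(env: Dict[str, str]) -> List[str]:
    """Return 'OLD -> NEW' strings for [core] variables that moved to [logging]."""
    suggestions = []
    for key in env:
        option = key[len(CORE_PREFIX):]
        if key.startswith(CORE_PREFIX) and option.lower() in MOVED_TO_LOGGING:
            suggestions.append(f"{key} -> AIRFLOW__LOGGING__{option}")
    return suggestions


if __name__ == "__main__":
    sample = """
    AIRFLOW__CORE__EXECUTOR=CeleryExecutor
    AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG
    AIRFLOW__CORE__REMOTE_LOGGING=true
    AIRFLOW__CORE__REMOTE_LOG_CONN_ID=aws_s3
    """
    for line in suggest_renames(parse_env_lines(sample)):
        print(line)
```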

docker-compose.yaml:

version: '3.7'

services:
  postgres:
    image: postgres:13
    env_file:
      - ./config/postgres_prod.env
    ports:
      - 5432:5432
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-d", "postgres", "-U", "airflow"]
      interval: 5s
      retries: 5
    restart: always
    depends_on: []
    deploy:
      placement:
        constraints: [ node.role == manager ]


  redis:
    image: redis:latest
    env_file:
      - ./config/postgres_prod.env
    ports:
      - 6379:6379
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 30s
      retries: 50
    restart: always
    depends_on: []
    deploy:
      placement:
        constraints: [ node.role == manager ]

  airflow-webserver:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - ./:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: webserver
    ports:
      - 8080:8080
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always
    depends_on:
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == manager ]

  airflow-scheduler:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - ./:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: scheduler
    restart: always
    depends_on:
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == manager ]

  airflow-worker1:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - ./:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: celery worker
    restart: always
    ports:
      - 8791:8080
    depends_on:
      - airflow-scheduler
      - airflow-webserver
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == manager ]

  airflow-worker2:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - ./:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: celery worker
    restart: always
    ports:
      - 8792:8080
    depends_on:
      - airflow-scheduler
      - airflow-webserver
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == manager ]


  airflow-worker3:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - ./:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: celery worker
    restart: always
    ports:
      - 8793:8080
    depends_on:
      - airflow-scheduler
      - airflow-webserver
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == manager ]


  airflow-worker4:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - ./:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: celery worker
    restart: always
    ports:
      - 8794:8080
    depends_on:
      - airflow-scheduler
      - airflow-webserver
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == manager ]


  airflow-worker-pt1:
    image: localhost:5000/myadmin/airflow-ommax
    build:
      context: /home/ubuntu/ommax_etl
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - /home/ubuntu/ommax_etl/:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: celery worker -q airflow_pt
    restart: always
    ports:
      - 8795:8080
    depends_on:
      - airflow-scheduler
      - airflow-webserver
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == worker ]

  airflow-worker-pt2:
    image: localhost:5000/myadmin/airflow-ommax
    build:
      context: /home/ubuntu/ommax_etl
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - /home/ubuntu/ommax_etl/:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: celery worker -q watchhawk
    restart: always
    ports:
      - 8796:8080
    depends_on:
      - airflow-scheduler
      - airflow-webserver
      - airflow-init
    deploy:
      placement:
        constraints: [ node.role == worker ]


  airflow-init:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
      - ./config/init.env
    volumes:
      - ./:/opt/airflow
    # user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}"
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: version
    depends_on:
      - postgres
      - redis
    deploy:
      placement:
        constraints: [ node.role == manager ]


  flower:
    image: airflow-ommax
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - ./config/airflow.env
      - ./config/postgres_prod.env
    volumes:
      - ./:/opt/airflow
    user: "${AIRFLOW_UID:-1000}:${AIRFLOW_GID:-0}"
    command: celery flower
    ports:
      - 5555:5555
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:5555/"]
      interval: 10s
      timeout: 10s
      retries: 5
    restart: always
    depends_on: []
    deploy:
      placement:
        constraints: [ node.role == manager ]


  selenium-chrome:
    image: selenium/standalone-chrome:latest
    ports:
      - 4444:4444
    deploy:
      placement:
        constraints: [ node.role == worker ]
    depends_on: []


volumes:
  postgres-db-volume:
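
A side note on the file above, since the stack runs on Swarm: according to the Compose file version 3 reference, `docker stack deploy` ignores some service-level keys, including `build`, `depends_on` and `restart` (Swarm uses `deploy.restart_policy` instead). This may or may not be related to the failing health check. The sketch below is a hypothetical checker that reports which services use those keys. The `services` dict is a hand-copied subset of the file, so no YAML parser is needed.

```python
from typing import Dict, List

# Service-level keys that `docker stack deploy` ignores for version 3 files,
# limited here to the ones that appear in the compose file above.
IGNORED_BY_STACK_DEPLOY = ("build", "depends_on", "restart")


def ignored_keys(services: Dict[str, dict]) -> Dict[str, List[str]]:
    """Map each service name to the ignored keys it declares (if any)."""
    report = {}
    for name, spec in services.items():
        hits = [key for key in IGNORED_BY_STACK_DEPLOY if key in spec]
        if hits:
            report[name] = hits
    return report


# Hand-copied subset of the compose file above, for illustration only.
services = {
    "airflow-scheduler": {
        "image": "airflow-ommax",
        "build": {"context": ".", "dockerfile": "Dockerfile"},
        "command": "scheduler",
        "restart": "always",
        "depends_on": ["airflow-init"],
    },
    "selenium-chrome": {
        "image": "selenium/standalone-chrome:latest",
        "depends_on": [],
    },
}

if __name__ == "__main__":
    for service, keys in ignored_keys(services).items():
        print(f"{service}: {', '.join(keys)}")
```

If the start order of the workers relative to the scheduler matters, it would have to be handled some other way under Swarm, for example with a wait-and-retry step in the container's start command.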

Dockerfile:

FROM apache/airflow:2.0.1-python3.7
COPY config/requirements.txt /tmp/
RUN mkdir -p /home/airflow/.cache/zeep
RUN chmod -R 777 /home/airflow/.cache/zeep
RUN mkdir -p /home/airflow/.wdm
RUN chmod -R 777 /home/airflow/.wdm
RUN pip install -r /tmp/requirements.txt

1 Answer

Stack Overflow user

Answered 2021-06-29 21:12:13

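I did a bit of source-code scanning, and the only real implementation I could find that depends on the log level is inside worker.py.

What log level do you have set when AIRFLOW__LOGGING__LOGGING_LEVEL is not DEBUG?

This is the code snippet I'm looking at. Does a message like this show up anywhere in your logs?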

# Fragment quoted from worker.py; not runnable on its own.
try:
    loglevel = mlevel(loglevel)
except KeyError:  # pragma: no cover
    self.die('Unknown level {0!r}.  Please use one of {1}.'.format(
        loglevel,
        '|'.join(l for l in LOG_LEVELS if isinstance(l, string_t))))
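
For anyone who wants to try the behaviour of that fragment in isolation, here is a self-contained stand-in. It is not the library's code: `LEVELS` is built from the standard `logging` module rather than from the real `LOG_LEVELS` table, and `resolve_level` is a hypothetical name. It only shows the pattern: look the name up, and exit with an "Unknown level" message if the lookup fails.

```python
import logging
from typing import Union

# Stand-in table built from the standard library; the quoted code uses its
# own LOG_LEVELS mapping, which this does not reproduce.
LEVELS = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}


def resolve_level(level: Union[str, int]) -> int:
    """Turn a level name into its number; pass numbers through unchanged."""
    if isinstance(level, int):
        return level
    try:
        return LEVELS[level.upper()]
    except KeyError:
        choices = "|".join(LEVELS)
        # SystemExit stands in for the self.die(...) call in the fragment.
        raise SystemExit(f"Unknown level {level!r}.  Please use one of {choices}.")


if __name__ == "__main__":
    print(resolve_level("info"))
    print(resolve_level(10))
```

If the configured level were misspelled or empty, a message of this kind is what one would expect to see in the worker's output.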
Score: 0
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/68165462
