I am setting up the Datadog agent with Ansible to ship Docker container logs (https://docs.datadoghq.com/agent/basic_agent_usage/ansible/), but I noticed that no logs are being sent.
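Following https://docs.datadoghq.com/logs/guide/log-collection-troubleshooting-guide/, I first tried to send a test message by running openssl s_client -connect intake.logs.datadoghq.com:10516 and then entering <API_KEY> this is a test message. It did not output anything, and I noticed that no log was sent; the connection just showed "closed". If I check /var/log/datadog/agent.log, I don't see any errors that mention port 10516.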
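The same reachability check can also be run from the playbook, on the host where the agent is installed. This is a minimal sketch, assuming Ansible's ansible.builtin.wait_for module; the task names, the intake_check variable, and the 10-second timeout are illustrative choices, and the second endpoint is the HTTPS intake reported in the status output further down. It only confirms that a TCP connection can be opened; it does not test the TLS handshake or the API key.

```yaml
# Sketch only: task names, intake_check and the timeout are assumptions.
- name: Check that the log intake endpoints accept TCP connections
  ansible.builtin.wait_for:
    host: "{{ item.host }}"
    port: "{{ item.port }}"
    timeout: 10
  loop:
    - { host: intake.logs.datadoghq.com, port: 10516 }
    - { host: agent-http-intake.logs.datadoghq.com, port: 443 }
  register: intake_check
  ignore_errors: true   # keep going so both endpoints are reported

- name: Report which endpoints were reachable
  ansible.builtin.debug:
    msg: "{{ item.item.host }}:{{ item.item.port }} reachable: {{ item is succeeded }}"
  loop: "{{ intake_check.results }}"
  loop_control:
    label: "{{ item.item.host }}"
```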
If I check the status with sudo datadog-agent status:
===============
Agent (v7.38.1)
===============
...
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
...
==========
Logs Agent
==========
Reliable: Sending compressed logs in HTTPS to agent-http-intake.logs.datadoghq.com on port 443
BytesSent: 0
EncodedBytesSent: 0
LogsProcessed: 0
LogsSent: 0
container_collect_all
---------------------
- Type: docker
Status: Pending
BytesRead: 0
Average Latency (ms): 0
24h Average Latency (ms): 0
Peak Latency (ms): 0
24h Peak Latency (ms): 0
But I don't see any explanation of why Docker log collection is pending. I also ran chmod 745 on /var/lib/docker/ and chmod 744 on /var/lib/docker/containers.
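One thing that may be worth ruling out for a Pending Docker source on a host-installed agent is whether the agent user can read the Docker socket. Datadog's Docker log collection instructions for the host agent ask for the dd-agent user to be added to the docker group. The tasks below are a sketch of that step, assuming the agent runs as the default dd-agent user and the service is named datadog-agent; whether this is the cause in this particular setup is not confirmed.

```yaml
# Sketch only: assumes the default dd-agent user and a "docker" group
# that owns /var/run/docker.sock on this host.
- name: Add dd-agent to the docker group
  ansible.builtin.user:
    name: dd-agent
    groups: docker
    append: true   # keep the user's existing groups
  become: true

- name: Restart the agent so the new group membership takes effect
  ansible.builtin.service:
    name: datadog-agent
    state: restarted
  become: true
```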
Checking for errors with sudo cat /var/log/datadog/agent.log | grep ERROR:
(pkg/forwarder/worker.go:184 in process) | Error while processing transaction: error while sending transaction, rescheduling it: Post "https://7-38-1-app.agent.datadoghq.com/api/v1/series?api_key=<api_key>": dial tcp [2600:1f18:24e6:b901:1af8:1d45:efec:931d]:443: connect: network is unreachable
However, I can't find an explanation of why the network is unreachable.
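The address in that error is an IPv6 address, so one possible reading is that the host has no working IPv6 route while IPv4 works. A rough way to compare the two address families is sketched below, assuming curl is installed on the host; the hostname is taken from the error message, and the task names and the family_probe variable are illustrative. An exit code of 0 means curl received some HTTP response over that address family.

```yaml
# Sketch only: uses curl's -4 / -6 flags to force the address family.
- name: Probe the Datadog endpoint over IPv4 and IPv6
  ansible.builtin.command:
    argv:
      - curl
      - "{{ item }}"
      - --silent
      - --show-error
      - --output
      - /dev/null
      - --max-time
      - "10"
      - https://7-38-1-app.agent.datadoghq.com
  loop:
    - "-4"
    - "-6"
  register: family_probe
  changed_when: false
  failed_when: false   # record the exit code instead of failing the play

- name: Show the curl exit code per address family
  ansible.builtin.debug:
    msg: "curl {{ item.item }} exit code: {{ item.rc }}"
  loop: "{{ family_probe.results }}"
  loop_control:
    label: "{{ item.item }}"
```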
The API key is valid, because other metrics are being uploaded to the dashboard, but I don't see the Docker container logs.
My Ansible config is:
- role: datadog.datadog
  become: true
  vars:
    datadog_api_key: "{{ logs_datadog_api_key }}"
    #datadog_site: "datadoghq.com"
    datadog_config:
      tags:
        - "env:{{ datadog_environment }}"
        # tags
      log_level: INFO
      logs_config:
        container_collect_all: true
        use_http: true
      apm_config:
        enabled: true
      logs_enabled: true
    network_config:
      enabled: true
I also tried setting use_http to true, but still nothing is sent. Am I missing something, or doing something wrong? Should I run the Datadog agent as a containerized instance instead?
Answered 2022-08-05 07:11:40
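In the end, I could not find out why the port could not be opened or why container log collection stayed pending. I did update my local config to: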
- name: Install docker python package
  ansible.builtin.pip:
    name: docker
- name: Pull Datadog agent image
  community.docker.docker_image:
    name: "datadog/agent"
    tag: "latest"
    source: pull
- name: Get current agent container
  docker_container_info:
    name: "dd_agent"
  register: result
- name: Stop agent container if running
  docker_container:
    name: "dd_agent"
    state: stopped
  when: result is defined and result.exists
- name: Remove agent container if running
  docker_container:
    name: "dd_agent"
    state: absent
  when: result is defined and result.exists
- name: Run Datadog agent container
  docker_container:
    env:
      DD_API_KEY: "{{ api_key }}"
      DD_LOGS_ENABLED: "true"
      DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL: "true"
      DD_HOSTNAME: "{{ host }}"
      DD_APM_ENABLED: "false"
      DD_DOGSTATSD_NON_LOCAL_TRAFFIC: "false"
      DD_PROCESS_AGENT_ENABLED: "true"
      DD_CONTAINER_EXCLUDE: "name:datadog-agent"
    image: "datadog/agent:latest"
    name: "dd_agent"
    restart_policy: "unless-stopped"
    state: started
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
      - /opt/datadog-agent/logs:/opt/datadog-agent/run:rw
https://stackoverflow.com/questions/73230992
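To confirm that the containerized agent is actually picking up container logs, the agent status can be queried inside the dd_agent container started by the tasks above. This is a minimal sketch, assuming the docker CLI is available on the host and the container name is still dd_agent; the dd_status variable and task names are illustrative. The Logs Agent section of the output should list the Docker sources and their byte counters.

```yaml
# Sketch only: dd_agent is the container name from the tasks above.
- name: Query the agent status inside the container
  ansible.builtin.command:
    argv: [docker, exec, dd_agent, agent, status]
  register: dd_status
  changed_when: false   # read-only check

- name: Print the status output
  ansible.builtin.debug:
    var: dd_status.stdout_lines
```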