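On one of my GCP servers, the Google Cloud Ops Agent is misbehaving. Fluent Bit, which the agent uses for logging, is writing far too many error logs. In three days the log grew to 88 GB, and we had already cleaned it up once before that. I don't know what the log messages actually mean. Can anyone help?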
root@***:/var/log/google-cloud-ops-agent/subagents# tail -50 logging-module.log
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
After restarting fluent-bit.service, it went into an endless loop of starting up and crashing, repeating this:
root@***:/var/log/google-cloud-ops-agent/subagents# tail -300 logging-module.log
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.1] metadata_server set to http://metadata.google.internal
[2022/02/15 18:15:46] [ warn] [output:stackdriver:stackdriver.1] client_email is not defined, using a default one
[2022/02/15 18:15:46] [ warn] [output:stackdriver:stackdriver.1] private_key is not defined, fetching it from metadata server
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] worker #7 started.
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238945.234513362.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238950.216326541.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238953.150198939.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238957.150224348.flb
[2022/02/15 18:15:46] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 18:15:46] [error] [engine] could not segregate backlog chunks
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #0 stopping...
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #0 stopped
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #1 stopping...
Restarting google-cloud-ops-agent-opentelemetry-collector.service and google-cloud-ops-agent.service did not help. Any idea why this happens, and what the log messages mean?
Posted on 2022-03-13 01:46:23
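You didn't mention which version you are seeing this on, or whether you upgraded from an earlier version, but Ops Agent versions before 2.7.1 had a bug that corrupted buffer files. In later versions that corruption shows up as the error you quoted ("format check failed"). The fix is to delete the corrupted buffer files until the agent runs normally again. See the public issue tracker for detailed instructions.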
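To decide which buffer files to remove, it can help to pull the distinct chunk names out of the noisy log first. The following is a minimal sketch, not part of the Ops Agent tooling: the function name `corrupted_chunks` is hypothetical, and the regular expression assumes the message layout shown in the question's log excerpt.

```python
import re

# Matches Fluent Bit lines like the ones quoted in the question, e.g.
# [2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
# The pattern is an assumption based on that excerpt, not a documented log format.
FORMAT_CHECK_FAILED = re.compile(
    r"\[error\] \[storage\] format check failed: (\S+\.flb)"
)


def corrupted_chunks(log_lines):
    """Return the distinct chunk paths reported as failing the format check.

    Paths are returned in the order they first appear in the log.
    """
    seen = {}  # dict keeps insertion order, so it doubles as an ordered set
    for line in log_lines:
        match = FORMAT_CHECK_FAILED.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)
```

Because the function accepts any iterable of lines, an open file handle on `logging-module.log` can be passed directly, so even a very large log is read one line at a time. The returned names (such as `tail.1/29458-1644260316.150179737.flb`) are relative to the agent's Fluent Bit buffer directory. That directory is assumed here to be `/var/lib/google-cloud-ops-agent/fluent-bit/buffers`; verify the location on your own machine and follow the issue tracker's procedure before deleting anything.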
https://stackoverflow.com/questions/71131541