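I have a big problem. Some of my Proxmox-based LXC containers stop responding after 2 days unless I restart the node.
It always happens at the same time of night (my guess is that something runs in one of the containers and causes heavy load).
The problem is that top/atop/htop show nothing unusual. The Proxmox node itself handles SSH commands without trouble, but 2 of the 5 nodes are not really responding (I can log in over SSH, but I cannot enter any commands).
I also have to do a "hard" reboot, because a normal reboot does not work (the LXC containers still have not stopped after 40 minutes).
This is my PVE version: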
pveversion -v
proxmox-ve: 4.1-39 (running kernel: 4.2.8-1-pve)
pve-manager: 4.1-15 (running version: 4.1-15/8cd55b52)
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-2.6.32-43-pve: 2.6.32-166
pve-kernel-4.2.8-1-pve: 4.2.8-39
pve-kernel-4.2.2-1-pve: 4.2.2-16
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-4.2.3-2-pve: 4.2.3-22
lvm2: 2.02.116-pve2
corosync-pve: 2.3.5-2
libqb0: 1.0-1
pve-cluster: 4.0-33
qemu-server: 4.0-62
pve-firmware: 1.1-7
libpve-common-perl: 4.0-49
libpve-access-control: 4.0-11
libpve-storage-perl: 4.0-42
pve-libspice-server1: 0.12.5-2
vncterm: 1.2-1
pve-qemu-kvm: 2.5-9
pve-container: 1.0-46
pve-firewall: 2.0-18
pve-ha-manager: 1.0-24
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u1
lxc-pve: 1.1.5-7
lxcfs: 2.0.0-pve1
cgmanager: 0.39-pve1
criu: 1.6.0-1
Unfortunately, the logs don't show anything.
Syslog:
Mar 15 04:32:31 server pvedaemon[4061]: worker exit
Mar 15 04:32:31 server pvedaemon[1192]: worker 4061 finished
Mar 15 04:32:31 server pvedaemon[1192]: starting 1 worker(s)
Mar 15 04:32:31 server pvedaemon[1192]: worker 24675 started
Mar 15 04:33:05 server pvedaemon[6601]: worker exit
Mar 15 04:33:05 server pvedaemon[1192]: worker 6601 finished
Mar 15 04:33:05 server pvedaemon[1192]: starting 1 worker(s)
Mar 15 04:33:05 server pvedaemon[1192]: worker 25112 started
Mar 15 04:34:57 server systemd-timesyncd[559]: interval/delta/delay/jitter/drift 2048s/+0.000s/0.021s/0.001s/+1ppm
Mar 15 04:36:08 server pveproxy[17238]: worker exit
Mar 15 04:36:08 server pveproxy[1212]: worker 17238 finished
Mar 15 04:36:08 server pveproxy[1212]: starting 1 worker(s)
Mar 15 04:36:08 server pveproxy[1212]: worker 28231 started
Mar 15 04:39:48 server pvedaemon[572]: worker exit
Mar 15 04:39:48 server pvedaemon[1192]: worker 572 finished
Mar 15 04:39:48 server pvedaemon[1192]: starting 1 worker(s)
Mar 15 04:39:48 server pvedaemon[1192]: worker 31498 started
Mar 15 04:40:40 server pveproxy[31690]: worker exit
Mar 15 04:40:40 server pveproxy[1212]: worker 31690 finished
Mar 15 04:40:40 server pveproxy[1212]: starting 1 worker(s)
Mar 15 04:40:40 server pveproxy[1212]: worker 32442 started
Mar 15 04:45:02 server pvedaemon[25112]: <root@pam> successful auth for user 'root@pam'
Mar 15 04:46:27 server pveproxy[28231]: worker exit
Mar 15 04:46:27 server pveproxy[1212]: worker 28231 finished
Mar 15 04:46:27 server pveproxy[1212]: starting 1 worker(s)
Mar 15 04:46:27 server pveproxy[1212]: worker 5082 started
Mar 15 04:48:45 server pveproxy[17122]: worker exit
Mar 15 04:48:45 server pveproxy[1212]: worker 17122 finished
Mar 15 04:48:45 server pveproxy[1212]: starting 1 worker(s)
Mar 15 04:48:45 server pveproxy[1212]: worker 6924 started
Mar 15 04:51:28 server pvedaemon[25112]: worker exit
Mar 15 04:51:28 server pvedaemon[1192]: worker 25112 finished
Mar 15 04:51:28 server pvedaemon[1192]: starting 1 worker(s)
Mar 15 04:51:28 server pvedaemon[1192]: worker 9770 started
Mar 15 04:51:38 server pveproxy[32442]: worker exit
Mar 15 04:51:38 server pveproxy[1212]: worker 32442 finished
Mar 15 04:51:38 server pveproxy[1212]: starting 1 worker(s)
Mar 15 04:51:38 server pveproxy[1212]: worker 9911 started
Mar 15 04:52:45 server pvedaemon[31498]: worker exit
Mar 15 04:52:45 server pvedaemon[1192]: worker 31498 finished
Mar 15 04:52:45 server pvedaemon[1192]: starting 1 worker(s)
Mar 15 04:52:45 server pvedaemon[1192]: worker 10794 started
Mar 15 04:55:46 server pvedaemon[24675]: worker exit
Mar 15 04:55:46 server pvedaemon[1192]: worker 24675 finished
Mar 15 04:55:46 server pvedaemon[1192]: starting 1 worker(s)
Mar 15 04:55:46 server pvedaemon[1192]: worker 13187 started
Mar 15 04:57:32 server rrdcached[972]: flushing old values
Mar 15 04:57:32 server rrdcached[972]: rotating journals
Mar 15 04:57:32 server rrdcached[972]: started new journal /var/lib/rrdcached/journal/rrd.journal.1458014252.151024
Mar 15 04:57:32 server rrdcached[972]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1458007052.150971
Mar 15 04:57:40 server puppet-agent[14639]: Finished catalog run in 0.53 seconds
Posted on 2016-03-29 20:51:45
lxcfs: 2.0.0-pve1 has a bug that makes containers hang in the kernel.
I solved the problem by updating to lxcfs: 2.0.0-pve2. See here:
https://forum.proxmox.com/threads/proxmox-4-0-lxc-containers-network-unstable.26353/
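To check which lxcfs version a node is running before and after the update, something like the following could be used. This is only a sketch: `lxcfs_version` is a hypothetical helper name, it simply picks the `lxcfs:` line out of `pveversion -v` style output, and the commented upgrade commands assume the fixed package is available from the repositories already configured on the node.

```shell
#!/bin/sh
# Hypothetical helper: read `pveversion -v` style output on stdin
# and print only the version of the lxcfs package.
lxcfs_version() {
    awk -F': ' '$1 == "lxcfs" { print $2 }'
}

# Usage on a Proxmox node:
#   pveversion -v | lxcfs_version
#
# Upgrading through the normal Debian package tools (assumes the
# configured repository already carries the fixed lxcfs package):
#   apt-get update
#   apt-get install lxcfs
```

With the version list from the question as input, the helper would print `2.0.0-pve1`, the version reported as affected.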
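Posted on 2016-03-24 08:34:31
We are running the same kernel as you, and our LXC containers hang completely. The KVM machines on the same host keep running. What could be causing this, and how can the LXC containers be made responsive again without rebooting the host?
Even running the following command on the host never gets anywhere:
pct enter ID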
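When top and its relatives show nothing, one thing that may be worth checking on the host is whether any processes are stuck in uninterruptible sleep (state `D`), which is how tasks blocked inside the kernel usually show up. The sketch below is not a fix and not something taken from this thread; `filter_d_state` and `list_d_state` are hypothetical helper names, and the `ps` options assume the procps version of `ps` found on Debian-based hosts.

```shell
#!/bin/sh
# Hypothetical helper: keep the header line plus every row whose
# STAT column (second field) starts with "D" (uninterruptible sleep).
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Hypothetical helper: list PID, state, kernel wait channel and
# command line for all processes, filtered down to D-state ones.
list_d_state() {
    ps -eo pid,stat,wchan:32,args | filter_d_state
}

# Usage on the host:
#   list_d_state
```

The WCHAN column shows the kernel function each task is waiting in, which can hint at the subsystem that is blocking.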
https://serverfault.com/questions/763798