Ubuntu 16.10 overheating problem

Ask Ubuntu user

Asked 2016-10-31 14:56:27

2 answers, 2.6K views, 0 followers, 4 votes

I recently installed Ubuntu 16.10, and since then the machine keeps rebooting by itself. The output of last | grep "Oct 31" is:

aegefel  tty7         :0               Mon Oct 31 15:15    gone - no logout
reboot   system boot  4.8.0-26-generic Mon Oct 31 15:14   still running
aegefel  tty7         :0               Mon Oct 31 15:02 - down   (00:04)
reboot   system boot  4.8.0-26-generic Mon Oct 31 15:02 - 15:06  (00:04)
aegefel  tty7         :0               Mon Oct 31 14:33 - crash  (00:28)
reboot   system boot  4.8.0-26-generic Mon Oct 31 14:33 - 15:06  (00:33)
aegefel  tty7         :0               Mon Oct 31 14:12 - crash  (00:20)
reboot   system boot  4.8.0-26-generic Mon Oct 31 14:12 - 15:06  (00:54)
aegefel  tty7         :0               Mon Oct 31 13:08 - crash  (01:04)
reboot   system boot  4.8.0-26-generic Mon Oct 31 13:08 - 15:06  (01:58)

This leads me to believe the reboots are caused by crashes.
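
A minimal sketch for tallying the sessions that `last` marks as ended by a crash. It assumes the column layout shown in the output above; the function name `crashed_sessions` and the sample constant are illustrative only and not part of any tool.

```python
import re

# Matches session lines that `last` marks as ended by a crash, e.g.
# "aegefel  tty7  :0  Mon Oct 31 14:33 - crash  (00:28)"
CRASH_LINE = re.compile(
    r"^(?P<user>\S+)\s+.*-\s+crash\s+\((?P<duration>[^)]+)\)\s*$"
)


def crashed_sessions(last_output: str) -> list[tuple[str, str]]:
    """Return (user, session duration) for every session that ended in a crash."""
    sessions = []
    for line in last_output.splitlines():
        match = CRASH_LINE.match(line)
        # "reboot" pseudo-user lines describe boots, not login sessions.
        if match and match.group("user") != "reboot":
            sessions.append((match.group("user"), match.group("duration")))
    return sessions


# A few lines copied from the output above.
LAST_SAMPLE = """\
aegefel  tty7         :0               Mon Oct 31 15:02 - down   (00:04)
reboot   system boot  4.8.0-26-generic Mon Oct 31 15:02 - 15:06  (00:04)
aegefel  tty7         :0               Mon Oct 31 14:33 - crash  (00:28)
reboot   system boot  4.8.0-26-generic Mon Oct 31 14:33 - 15:06  (00:33)
aegefel  tty7         :0               Mon Oct 31 14:12 - crash  (00:20)
"""

if __name__ == "__main__":
    for user, duration in crashed_sessions(LAST_SAMPLE):
        print(f"{user}: session ended in a crash after {duration}")
```

Feeding it the full `last` output instead of the sample would show how long each session lasted before the machine went down.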

I don't know what the cause is, but it happens when I try to watch a movie or make a backup.

What can I do?

Edit 1

The command more /var/log/syslog* gives me:

Nov  6 18:18:17 aegefel-Akoya-E6424-MD99850 gnome-terminal-[2674]: Allocating size to GtkBox 0x55558d2b47b0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
Nov  6 18:18:17 aegefel-Akoya-E6424-MD99850 gnome-terminal-[2674]: Allocating size to GtkBox 0x55558d2b47b0 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
Nov  6 18:18:31 aegefel-Akoya-E6424-MD99850 gnome-terminal-[2674]: Allocating size to GtkBox 0x55558d2b4120 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
Nov  6 18:18:31 aegefel-Akoya-E6424-MD99850 gnome-terminal-[2674]: Allocating size to GtkBox 0x55558d2b4120 without calling gtk_widget_get_preferred_width/height(). How does the code know the size to allocate?
Nov  6 18:18:36 aegefel-Akoya-E6424-MD99850 systemd[1]: Starting Stop ureadahead data collection...
Nov  6 18:18:36 aegefel-Akoya-E6424-MD99850 systemd[1]: Started Stop ureadahead data collection.

After that, nothing is logged for almost a minute, so I assume the PC rebooted.

For today, ls -alt /var/crash gives me:

total 21672
drwxrwsrwt  2 root     whoopsie     4096 Nov  6 14:26 .
-rwxrwxrwx  1 root     whoopsie        0 Nov  6 14:26 .lock

Edit 2

This only happens when my CPU usage is at 40% - 50% or more (my CPU is an Intel Core i5 6267U 2.9GHz).

Edit 3

The sensors command gives the following:

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +37.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:         +34.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:         +36.0°C  (high = +100.0°C, crit = +100.0°C)

acpitz-virtual-0
Adapter: Virtual device
temp1:        +38.0°C  (crit = +98.0°C)

pch_skylake-virtual-0
Adapter: Virtual device
temp1:        +35.0°C

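The high threshold is equal to the critical threshold. Maybe my laptop overheats and the fan doesn't get enough time to bring the temperature down. I tried to lower the high threshold, but that automatically lowers the critical threshold too (crit has to be equal to high).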
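
To make the headroom in those readings explicit, here is a rough sketch that parses `sensors`-style text and reports how far each reading is from its critical threshold. It assumes the exact line format shown above; `crit_margins` and the sample constant are made-up names for illustration.

```python
import re

# Matches readings such as
# "Core 0:         +34.0°C  (high = +100.0°C, crit = +100.0°C)"
# Readings without a "crit" value are skipped.
READING = re.compile(
    r"^(?P<label>[^:]+):\s+\+?(?P<temp>-?\d+(?:\.\d+)?)°C"
    r".*?crit\s*=\s*\+?(?P<crit>-?\d+(?:\.\d+)?)°C"
)


def crit_margins(sensors_output: str) -> dict[tuple[str, str], float]:
    """Map (chip, label) to critical threshold minus current reading, in °C."""
    margins = {}
    chip = None
    for raw in sensors_output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" not in line:
            chip = line  # chip header, e.g. "coretemp-isa-0000"
            continue
        match = READING.match(line)
        if match:
            margin = float(match.group("crit")) - float(match.group("temp"))
            margins[(chip, match.group("label"))] = margin
    return margins


# Copied from the sensors output above.
SENSORS_SAMPLE = """\
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +37.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:         +34.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:         +36.0°C  (high = +100.0°C, crit = +100.0°C)

acpitz-virtual-0
Adapter: Virtual device
temp1:        +38.0°C  (crit = +98.0°C)

pch_skylake-virtual-0
Adapter: Virtual device
temp1:        +35.0°C
"""

if __name__ == "__main__":
    for (chip, label), margin in crit_margins(SENSORS_SAMPLE).items():
        print(f"{chip} / {label}: {margin:.1f} °C below crit")
```

The readings above were taken at idle, so the margins are large. The interesting numbers are the ones captured under load, just before a reboot.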

Edit 4

Here they are:

/var/log/kern.log
/var/log/kern.log 1
/var/log/kern.log2.gz
/var/log/kern.log3.gz

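Here is the crash from November 20.

Edit 5

After some testing, I think the problem is the GPU overheating. In fact, my laptop only reboots when I try to watch a movie, play some free games on it, or run tests with Unreal Engine 4. The reason my computer does not reboot with Blender is that Blender uses the CPU (not the GPU) by default. I have an Intel Iris Graphics 550 (Skylake GT3e). Any ideas?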

crash

reboot

16.10

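2 Answers

Ask Ubuntu user

Answered 2016-11-27 23:36:30

If you are really concerned about reboots caused by kernel panics (as the title of the post suggests), you can check whether the file /etc/sysctl.conf contains a directive like kernel.panic = n, where n is the number of seconds to wait before rebooting after a kernel panic. From what I have read, the system should not reboot on a panic by default.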
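
As a sketch of that check, the snippet below scans sysctl.conf-style text for a kernel.panic setting. The helper names are illustrative, and it makes several assumptions: only the dotted key spelling is handled, the last assignment in the file wins, and files under /etc/sysctl.d/ (which can also set this value) are not read. The value the running kernel actually uses can be read from /proc/sys/kernel/panic.

```python
from pathlib import Path
from typing import Optional


def kernel_panic_timeout(sysctl_conf_text: str) -> Optional[int]:
    """Return the last kernel.panic value found in sysctl.conf-style text, or None."""
    value = None
    for raw in sysctl_conf_text.splitlines():
        line = raw.strip()
        # Lines starting with '#' or ';' are comments in sysctl.conf.
        if not line or line[0] in "#;":
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "kernel.panic":
            value = int(val.strip())
    return value


def describe(timeout: Optional[int]) -> str:
    """Explain what a kernel.panic value means for reboot behaviour."""
    if timeout is None:
        return "kernel.panic is not set in this file"
    if timeout == 0:
        return "the kernel does not reboot after a panic"
    if timeout > 0:
        return f"the kernel reboots {timeout} s after a panic"
    return "the kernel reboots immediately after a panic"


if __name__ == "__main__":
    conf = Path("/etc/sysctl.conf")
    text = conf.read_text() if conf.exists() else ""
    print(describe(kernel_panic_timeout(text)))
```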

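If instead, as I suspect, you are more interested in finding the root cause of these reboots (my guess is some hardware-related fault), you need to look at the machine check events to determine which hardware is failing. If you don't have the file /var/log/mcelog, you may need to install the mcelog package by enabling the Universe repository (if it is not already enabled in your sources) and running sudo apt install mcelog. These events will then be logged to /var/log/mcelog.

For clarity, here is an excerpt from man mcelog: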

X86 CPUs report errors detected by the CPU as machine check events (MCEs). These can be data corruption detected in the CPU caches, in main memory by an integrated memory controller, data transfer errors on the front side bus or CPU interconnect or other internal errors. Possible causes can be cosmic radiation, instable power supplies, cooling problems, broken hardware, or bad luck.

Most errors can be corrected by the CPU by internal error correction mechanisms. Uncorrected errors cause machine check exceptions which may panic the machine.

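More information about the mcelog log file format can be found at http://mcelog.org/logfile.html.

Since Linux systems generally do not reboot on a kernel panic by default, you may also want to check the /etc/sysctl.conf file mentioned earlier.

Sources: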

http://www.techrepublic.com/blog/linux-and-open-source/auto-reboot-linux-after-a-kernel-panic/

http://packages.ubuntu.com

"[Hardware Error]: Machine check events logged" appears in syslog. What should I do?

http://mcelog.org/logfile.html

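According to your mcelog, CPUs 1 and 3 in your system are overheating. They throttle, cool down, and throttle again (all of this is done to keep the CPU from overheating). The root cause could be improperly applied thermal compound between the CPU and the heatsink, a problem with the heatsink itself, clogged vents, excessive dust, or a failing cooling device (the fan?). Another (less likely) possibility is that the CPU's thermal detection is malfunctioning.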
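
The throttling described above can also be cross-checked without mcelog. The sketch below reads the per-CPU thermal throttle counters that Linux exposes under /sys/devices/system/cpu on Intel systems. Whether these files exist on a given machine is an assumption, and `throttle_counts` is just an illustrative helper name.

```python
from pathlib import Path


def throttle_counts(cpu_root: Path = Path("/sys/devices/system/cpu")) -> dict[str, int]:
    """Return {cpu name: core_throttle_count} for every CPU that exposes the counter."""
    counts = {}
    pattern = "cpu[0-9]*/thermal_throttle/core_throttle_count"
    for counter in sorted(cpu_root.glob(pattern)):
        cpu_name = counter.parent.parent.name  # e.g. "cpu1"
        counts[cpu_name] = int(counter.read_text().strip())
    return counts


if __name__ == "__main__":
    counts = throttle_counts()
    if not counts:
        print("no thermal_throttle counters found on this system")
    for cpu, count in counts.items():
        print(f"{cpu}: throttled {count} time(s) since boot")
```

Counters that climb while a movie or game is running would support the overheating explanation. Counters that stay at zero right up to a reboot would point elsewhere.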

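Votes: 2

Ask Ubuntu user

Answered 2016-11-27 09:16:17

The title of this question is unclear.

Anyway, if you need help investigating your system crashes and none of the previous comments have helped, try the following:

Increase the verbosity of the kernel log.
Stop the kernel from rebooting automatically on a crash/panic.
Try logging in to your system remotely (e.g. over ssh) and checking the logs.
As @user.dz said, test your memory thoroughly with something like memtest86+ from http://www.memtest.org/.
Since you said "This only happens when my CPU usage is at 40% - 50% or more", could it be a PSU problem? I mean that your system may need more power than the PSU can deliver.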
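
When the machine reboots before anything useful reaches syslog, one way to follow the "check the logs" suggestion is to record temperatures yourself and force every sample to disk. The sketch below assumes the kernel exposes thermal zones under /sys/class/thermal with temperatures in millidegrees Celsius. The function names and the log path in the usage note are hypothetical.

```python
import os
import time
from pathlib import Path

THERMAL_ROOT = Path("/sys/class/thermal")


def read_zones(thermal_root: Path = THERMAL_ROOT) -> dict[str, float]:
    """Return {"zone:type": temperature in °C} for every readable thermal zone."""
    readings = {}
    for zone in sorted(thermal_root.glob("thermal_zone[0-9]*")):
        try:
            zone_type = (zone / "type").read_text().strip()
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read
        readings[f"{zone.name}:{zone_type}"] = millidegrees / 1000.0
    return readings


def append_sample(log_path: Path, readings: dict[str, float]) -> None:
    """Append one timestamped line and force it to disk before returning."""
    fields = " ".join(f"{name}={temp:.1f}" for name, temp in readings.items())
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {fields}\n")
        log.flush()
        os.fsync(log.fileno())  # so the line survives an abrupt reboot


def log_temperatures(log_path: Path, samples: int = 5, interval: float = 2.0,
                     thermal_root: Path = THERMAL_ROOT) -> None:
    """Write `samples` readings to log_path, sleeping `interval` seconds between them."""
    for i in range(samples):
        append_sample(log_path, read_zones(thermal_root))
        if i + 1 < samples:
            time.sleep(interval)
```

For example, calling `log_temperatures(Path("thermal.log"), samples=900, interval=2.0)` before starting a movie would leave roughly half an hour of readings, and the last lines in the file after a reboot show how hot the zones were just before the machine went down.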

Votes: 1

Original page content provided by Ask Ubuntu. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://askubuntu.com/questions/843860
