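Background:
To do good work, one must first sharpen one's tools. For a programmer who wants to grow, going straight to Baidu or Google whenever a problem comes up may yield references or answers, but it also weakens one's own ability to think. Using a black-screen problem on Ubuntu as the example, this article aims to present a way of analyzing a problem from scratch without relying on a search engine at all.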
Symptoms
1. The screen goes black during use, and the monitor then reports "VGA no signal".
2. To be completed.
Analysis
1. After a forced reboot, journalctl shows that the problem was caused by the GPU
a. An amdgpu DMA operation timed out
amdgpu: ring sdma0 timeout, signaled seq=102898, emitted seq=102899
b. After the timeout, the GPU reset also failed
[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
amdgpu 0000:07:00.0: amdgpu: GPU Recovery Failed: -110
c. The failed GPU reset triggers an (EE) error, which comes with a further hint
Sep 06 15:29:52 leo /usr/libexec/gdm-x-session[2200]: (EE) Please also check the log file at "/home/leo/.local/share/xorg/Xorg.1.log" for additional information.
Sep 06 15:29:52 leo /usr/libexec/gdm-x-session[2200]: for help.
Sep 06 15:29:52 leo /usr/libexec/gdm-x-session[2200]: at http://wiki.x.org
d. Following the hint from the previous step, check Xorg.1.log
[ 18.719] _XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
[ 18.719] _XSERVTransMakeAllCOTSServerListeners: server already running
e. Based on the description from X.Org, the problem is tentatively located in Xorg or related software
I keep getting the message: "Cannot establish any listening sockets..."
You get an error message like:
_XSERVTransSocketINETCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
Fatal server error:
Cannot establish any listening sockets - Make sure an X server isn't already running
This problem is very similar to the previous one. You will get this message possibly because the lock file was removed somehow or some other program which doesn't create a lock file is already listening on this port. You can check this by doing a netstat -ln. Xservers usually listen at tcp port 6000+, therefore if you have started your Xserver with the command line option :1 it will be listening on port 6001.
Please check the article above for further information.
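Steps a to c above amount to picking two kinds of amdgpu messages out of the journal. The following is a minimal Python sketch of that filtering, not a definitive tool. The regular expressions are modeled only on the log lines quoted above, so other kernel versions may word the messages differently. The function name `scan_journal` is hypothetical. The error code is decoded with the standard `errno` module, assuming the kernel reports a negated errno value (on typical Linux systems 110 is ETIMEDOUT, which fits the timeout seen in step a).

```python
import errno
import re

# Patterns modeled on the journal lines quoted above (an assumption:
# other kernel versions may format these messages differently).
RING_TIMEOUT = re.compile(
    r"amdgpu: ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
RECOVERY_FAILED = re.compile(r"GPU Recovery Failed: (?P<code>-?\d+)")


def scan_journal(lines):
    """Collect amdgpu ring timeouts and failed GPU recoveries from log lines."""
    events = []
    for line in lines:
        m = RING_TIMEOUT.search(line)
        if m:
            events.append({
                "kind": "ring_timeout",
                "ring": m["ring"],
                # Gap between the last emitted and last signaled sequence numbers.
                "pending": int(m["emitted"]) - int(m["signaled"]),
            })
            continue
        m = RECOVERY_FAILED.search(line)
        if m:
            code = int(m["code"])
            events.append({
                "kind": "recovery_failed",
                "code": code,
                # Assumes a negated errno value; unknown codes map to "?".
                "errno_name": errno.errorcode.get(-code, "?"),
            })
    return events
```

To use it, pass in the lines of the kernel log from the boot in which the black screen happened, for example the output of `journalctl -k -b -1` split into lines. This requires a persistent journal.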
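Next troubleshooting steps:
1. When the problem occurs, check whether the keyboard's Caps Lock indicator LED still toggles.
2. Press Ctrl+Alt+F3 to switch to a tty and check whether anything is displayed, in order to tell an Xorg problem apart from a GPU that has stopped working entirely.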
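If the tty in step 2 does come up, the FAQ excerpt quoted earlier suggests one more check: whether an X server for the display still has its lock file and listening socket. The sketch below is an illustration under stated assumptions, not something taken from the original notes. The function name `x_display_status` is hypothetical. The paths `/tmp/.X<n>-lock` and `/tmp/.X11-unix/X<n>` are the conventional locations and are not named in the source. Only the port rule (6000 plus the display number) comes from the FAQ text.

```python
import os
import socket


def x_display_status(display=":1"):
    """Report the usual traces of an X server for a local display such as ':1'."""
    # Accepts ':1' or ':1.0'; remote forms like 'host:1' are not handled.
    number = int(display.lstrip(":").split(".")[0])
    port = 6000 + number  # rule from the FAQ: display :1 -> TCP port 6001
    lock_file = f"/tmp/.X{number}-lock"        # conventional path (assumption)
    unix_socket = f"/tmp/.X11-unix/X{number}"  # conventional path (assumption)

    # Try a TCP connection. Many distributions start X with TCP disabled,
    # so a refused connection here does not prove the server is gone.
    tcp_open = False
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=1):
            tcp_open = True
    except OSError:
        pass

    return {
        "port": port,
        "lock_file_exists": os.path.exists(lock_file),
        "unix_socket_exists": os.path.exists(unix_socket),
        "tcp_listening": tcp_open,
    }
```

Calling `x_display_status(":1")` from the tty returns a small dictionary. A lock file or socket that exists while the screen stays black suggests an X server is still registered for that display, which would be consistent with the "server already running" message in Xorg.1.log.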