Ever since I upgraded to version 22, I’ve been encountering infrequent and mysterious video and mouse lockups on my Fedora box. Things will be running just fine and then, blam, everything freezes and the only way to recover is to lean on the power button for a hard reboot. Sound keeps working, but everything else becomes unresponsive.
My first instinct, which lasted for about 3 weeks, was to ignore the problem and just live with the resets. Problems like GUI hangs are rarely simple: there’s usually some version mismatch in a driver or some strange hardware incompatibility, and they often seem to require deep machine fu to debug and understand. It seemed the only thing I could do was describe the problem, post it somewhere, and maybe get lucky enough that someone would respond. In my experience, helpful responses are rare, so I kept procrastinating.
But then I thought, even if nothing comes of it, maybe someone would see it and maybe it might end up helping someone. Unlikely, for sure, but possible. And then I reflected that part of being in an open source community is trying to identify and help resolve problems as best you can. How many times have you found questions and answers on forums or StackExchange that helped you figure something out? What if those people had all just figured help was unlikely, shrugged their shoulders, and lived with their difficulties?
So, I took a deep breath and headed on over to the Fedora Project forum to see what I could find out. The deep breath was needed because the first time venturing into a forum is usually a bit overwhelming. There are usually a bunch of sticky threads that you are expected to read through to understand the forum structure, the searching expressions, and the posting rules. Every forum seems to be a bit different and there is little to do about it except just get stuck in and spend the time needed.
Then comes the overwhelming scale of many forums. There are hundreds, sometimes thousands, of threads, many of them years old, and many of them similar to your issue but not quite the same. What you have to do is look for keywords and patterns in both the titles and posts of possibly related threads until you begin to get a feel for the right way to describe and frame your particular problem. Then you go back and search again, this time looking for more focused results and adding time filters to keep it to reasonably recent threads. I like a sieve of about 3 months, which is recent enough to weed out solutions for previous software versions but still long enough to cast a reasonably wide net.
So, while I originally went in looking for “gui hang”, after some time I realized that “freeze” was much more prevalent and that a large percentage of freeze-related threads were also about “nvidia” driver problems. A quick check of my PCI devices revealed that I did, indeed, have an nVidia video card in my box.
> lspci -k
...snip...
01:00.0 VGA compatible controller: NVIDIA Corporation NV41 [GeForce 6800] (rev a2)
	Subsystem: NVIDIA Corporation Device 0245
	Kernel driver in use: nouveau
	Kernel modules: nouveau
...snip...
I could see the nVidia card and also that I was using the default Fedora nouveau driver. Many of the threads mentioned that there were, apparently, aspects of nVidia cards that are not accounted for by the nouveau driver, so the recommendation is to use a driver supplied directly by nVidia.
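If you would rather not scan the whole listing by eye, here is a minimal sketch of a shell helper that pulls out just the driver line for the video card. The function name vga_driver is my own invention, and it assumes the usual lspci -k layout, where each device entry starts at the left margin and its details are indented underneath.

```shell
# Hypothetical helper: read `lspci -k` output on stdin and print the
# kernel driver in use for each VGA controller.
# Assumption: device entries start at column 0 with a hex bus address,
# and their detail lines are indented.
vga_driver() {
    awk '
        # A new device entry begins; remember whether it is a VGA controller.
        /^[0-9a-f]/ { in_vga = ($0 ~ /VGA compatible controller/) }
        # Inside a VGA entry, print only the driver name.
        in_vga && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print
        }
    '
}
```

On a real machine you would run it as `lspci -k | vga_driver`. On my box, going by the listing above, that should print nouveau.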
I went to the nVidia site, entered the information for my video card, and was able to figure out that I needed a driver from the 304 series. You can try to install the drivers yourself, but it turns out they are already packaged for dnf in the nonfree section of the RPM Fusion repository, which also provides installation instructions.
> dnf -y update
> dnf install akmod-nvidia-304xx xorg-x11-drv-nvidia-304xx
> reboot
When I tried to reboot, however, the graphics would no longer start up correctly. It turns out one more step is required: the kernel module provided by the akmod package you just installed has to be rebuilt. Since the graphics are now busted, you need to ssh into the machine and force the rebuild.
> akmods --force
> reboot
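To confirm that the rebuild actually produced a module for a given kernel, something like the following sketch may help. The helper name has_module is hypothetical, and it assumes the module lives somewhere under /lib/modules/<kernel version>/ in a file whose name starts with nvidia.ko (possibly with a compression suffix). I have not checked exactly where the RPM Fusion package puts the file, so the helper simply searches the whole tree.

```shell
# Hypothetical helper: succeed if a file named <module>.ko (optionally
# with a suffix such as .xz) exists under <modules_root>/<kernel_version>.
# Usage: has_module MODULE [KERNEL_VERSION] [MODULES_ROOT]
# Defaults: the running kernel and /lib/modules (assumed layout).
has_module() {
    module=$1
    kver=${2:-$(uname -r)}
    root=${3:-/lib/modules}
    # find prints nothing if the directory is missing or no file matches.
    [ -n "$(find "$root/$kver" -name "$module.ko*" -print 2>/dev/null | head -n 1)" ]
}
```

For example, `has_module nvidia && echo "module found"` checks the running kernel. Passing a kernel version as the second argument checks a different installed kernel instead.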
At this point, my graphics came up correctly, although the boot process for the GNOME shell seems glitchy, often crashing the shell. Fortunately, if I just wait a moment, eventually everything refreshes and the shell restarts and everything seems to run properly thereafter. I have not had any more screen freezes since I installed the driver, but only time will tell if the issue is properly fixed.
I am also a bit concerned that the next time I run a dnf update, I will have to go through some or all of this process again to make the drivers work with a newer kernel, but that is material for a future post.