rykarn
Wayland breaks the tools I use to make a living
2025-01-26

This was originally posted on cohost on 2023-12-21.

If you are not familiar with the things discussed in this post - Wayland and X11 are display systems on Linux, where Wayland is a modern replacement for X11. Their purpose is to provide ways to draw windows and graphical UIs. Without them, you are basically confined to operating your computer from the command line.

Please note the disclaimer at the end that my technical understanding of the workings of Wayland and the challenges it presents to Talon Voice was limited at the time of writing. I sort of get it a bit more since then, but I’m still by no means confident in explaining the technical details of how the accessibility problem on Wayland is structured. But I get the gist of it - the common Wayland protocols do not provide a complete interface to all the things an accessibility tool like Talon Voice needs in order to assist the user in using their computer’s UI.


Major Linux distros are soon switching to versions of desktop environments that use Wayland instead of X11. This is a bad state of affairs for accessibility.

I use Talon Voice. It is a program that runs in the background, listens for spoken commands through your microphone, and performs certain actions on your computer based on what you say. I started using it earlier this year when I got RSI in my wrist which made it hard for me to type on a keyboard and use a mouse. My wrist has healed up again and I am once more capable of typing on a computer (this text for instance). But my main source of typing is when I am programming, something I do for a living, and for that I still use Talon and the Cursorless extension. I could probably move back to using emacs with vim keybindings, the way I used to before my wrist issues, but I would prefer not to. Both because I would be worried about injuring my wrist once more, and also because I have grown really fond of programming with Cursorless. I am as fast if not faster with voice coding compared to my ability to program with a keyboard (with a typing speed of about 85 WPM). Still, I am lucky that for me, Talon Voice is an option and not a necessity for interacting with a computer. Many others in the Talon Voice community do not have that privilege.

The Wayland protocol lacks (by design) a whole bunch of features needed for Talon and similar accessibility programs to function, things that are fully possible with X11 on Linux, as well as on OSX and Windows. Things such as querying what windows are available, their titles, what window is focused, emulating keyboard input, querying for the mouse position. At the moment, these APIs would need to be provided by whatever Wayland compositor happens to be in use on a user’s system. Since no common accessibility interface for these tasks exists between the currently available compositors, you then run into the problem of getting the development teams behind the different compositors to agree on a standard. That, or require the single developer behind Talon to write his own compositor in order to support Linux.
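
To make the split a bit more concrete: a tool that depends on the features above first has to figure out which display system the session is running. Here is a minimal sketch in Python of how that check could look, assuming the login session sets the conventional XDG_SESSION_TYPE, WAYLAND_DISPLAY and DISPLAY environment variables. The function name is made up for illustration, and this is not how Talon does it.

```python
import os
from typing import Mapping, Optional


def detect_display_system(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "wayland", "x11", or "unknown" for the given environment.

    Hypothetical helper for illustration; defaults to the process environment.
    """
    if env is None:
        env = os.environ

    # XDG_SESSION_TYPE is normally set by the login session.
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    if session_type in ("wayland", "x11"):
        return session_type

    # Fall back to the display variables if the session type is missing
    # or something else (for example "tty").
    # WAYLAND_DISPLAY is checked first because XWayland also sets DISPLAY.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print(detect_display_system())
```

On an X11 session the tool can go on to use the global window and input queries. On a Wayland session, whether any of them are available depends on the compositor in use.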

This really fucking bums me out. For my home computer I’m good for a while because I use a distro where I can decide whether to use X11 or Wayland. At my current work we use an LTS release of a distro still on X11 with the intention of staying on it for as long as it is supported. So I’m good for now, but I do not look forward to potentially having to choose between ditching Talon Voice or ditching Linux altogether.

I’m sure that Wayland is a necessary step forward to get rid of the technical debt of X11, but moving forward now will leave the state of accessibility on Linux, and thus the people depending on it, behind. There are ongoing efforts to compile the issues faced by Talon Voice and to contact the teams behind the big distros and desktop environments, but any change that might come out of that is still a long way out.

(Note that I have no real insight into the nature of Wayland/X11 and how compositors work and so on, so my description of the technical challenges should be taken with a grain of salt. This is mostly based on what I have pieced together from a high level view of the Wayland architecture and the things mentioned by the Talon developer in the Talon community Slack channel.)


2025-01-26: As far as I can tell, the situation has not really changed much. There are a few experimental or non-core protocol extensions that provide subsets of the features that Talon Voice needs to function, but actual implementations of those protocols are fragmented across the multitude of compositor families that exist today. To me, it really looks like an impossible task to create an application like Talon Voice that can work in a Wayland compositor-agnostic way for the foreseeable future.

