I am very curious and want to help make Linux more accessible.

I talked with some people and got some insights:

  • Anything text-based, like a reader-mode-only browser or a plain terminal, works best with TTS engines.
  • TTS engines involve a trade-off: some are really good but need a lot of resources, others sound worse but are lightweight.
  • In some cases TTS needs to be optimized to be really fast, so it can keep up with the user.
  • Some apps are better and some are worse, but most apps probably don’t really suit blind people, since the whole GUI concept makes no sense for them.

I am really curious: what works best for you? For output, braille or voice? For input, voice, braille or gestures?

Which apps do you find best? How do you browse the web and find media to listen to? How do you use document editors, and what purpose do they serve for you?

Thanks a lot!

  • @[email protected]
    1 year ago

    Hello, a blind daily Linux user of several years here. @fastfinge I would like to respond to some of your points, as I find them somewhat misleading.

    Sound never, ever works: Alsa, pulseaudio, pipewire…some apps require one, some apps require the other, and they don’t work together at all well. Plus sound is considered part of the userland desktop environment. So if there is some problem preventing the GUI from launching and dropping me into a shell, I can forget about having any accessibility what-so-ever. We’re what feels like decades away from having the advanced features that CoreAudio on mac or WASAPI on windows offer (audio ducking, first-class support for multiple soundcards and routing audio from different apps to different outputs easily, per-app volume adjustment, loopback recording, low latency, any kind of spatial anything, etc.).

    My experience is quite different. Regularly tracking the blind Linux mailing lists and guiding newcomers on getting their systems accessible, problems with audio are actually very rare these days. Mainstream distros like Ubuntu come prepackaged with a bunch of drivers, ensuring great compatibility. From time to time someone shows up with sound problems, but most of the time it turns out to be a corrupted installation image rather than an actual audio issue. I don’t remember anyone who was unable to install the system because the audio wasn’t working. ALSA is the kernel sound subsystem responsible for playing sound; PulseAudio is a sound server built on top of ALSA that provides a more civilised audio interface for applications; PipeWire is a drop-in replacement for PulseAudio that brings a breath of fresh air to the field, mostly improving compatibility with Bluetooth headphones and speakers, but also improving latency etc. I don’t see how these could come into conflict. PW and PA shouldn’t both run on a single installation, and both are designed to work with ALSA. For applications, it doesn’t really matter: most of them were developed in the days of PA, and PW is backward compatible with it. I’ve fiddled with many applications over the years and haven’t seen a single one with audio problems. As for audio effects, there are very popular PulseAudio and PipeWire modules you can use for various effects like noise cancellation, audio filtering etc. I didn’t play with them much, because I never saw any added value in Windows audio effects either. I once applied a noise-cancellation tweak to try to improve the quality of my laptop’s internal mic, but that thing sounds so bad that I eventually decided to stream the microphone from my smartphone to PulseAudio through ADB, and it worked really well! It kept working after I switched from PA to PW, without needing any changes. So much for audio flexibility and compatibility.
    Even more interesting effects can be achieved with systems like JACK, but as just an average user in audio terms, I’ve never found it worth setting up. The things you mention, like working with the streams of specific apps, are perfectly possible. Perhaps a more straight-forward interface wouldn’t hurt, but other than that, it works, whatever. :D There have indeed been some controversial moments in the past, like when the ALSA devs decided it would be funny to set the audio volume to 0% by default and mute it as well, which the blind community didn’t really appreciate. :D But then, on Windows screenreaders are applying workarounds for years to prevent soundcards from auto-sleeping, otherwise the effects are really crazy, and the funniest part is that they don’t even work 100% of the time; I had computers where I needed silenzio anyway. So these things are by no means unique to Linux; I see them as just the general kind of thing we’re used to dealing with.

    meaning I have absolutely no recourse other than “get a sighted person to come to my house and fix it” because I did something as simple as swap a drive.

    As a blind student using Windows, I was regularly calling my classmates to read me the screen, because I did something as simple as trying to turn my computer on. Actually, Windows only gained the ability to start the screen reader during system installation a few years ago, with Windows 10; Linux could do this long before. And while on Windows I was pretty much stuck, Linux seems far more flexible to me in this regard: at worst I can boot a live distribution and fix things from there. I don’t remember the last time I booted a Windows live image. :) That’s not to negate your point; the situation could indeed be better, and there are distributions like Slint trying to achieve this. Still, I think it’s worth mentioning that this is a wider topic than Linux: just think how long we have been discussing UEFI accessibility, GRUB accessibility etc. These problems affect all blind people, and of all the systems we have today, Linux gives you the most power to do something about them if necessary.

    Linux has no comprehensive, standard, accessibility API that will work cross-window manager, cross-desktop, and cross-platform. In Windows or mac, there are clear guidelines that developers need to follow to create accessible apps. If an app is inaccessible to me, I can refer a developer to first-class, well supported and understood, API documentation telling them what they need to do in order to better interact with my screen reader. On Linux, that’s not so easy, meaning apps just never get fixed, because developers working for free just can’t be expected to figure it all out themselves.

    Linux has AT-SPI, the universal accessibility standard equivalent to MSUIA on Windows. It’s pretty well developed and works on any setup that has D-Bus, another standard component you find basically everywhere. Even sandboxing software like Flatpak or Snap respects the AT-SPI interfaces, meaning it’s not some random thing supported by a few demo apps; it’s a full-fledged accessibility standard used by GTK, Qt, web engines, Java, etc. It does have documentation, and even software that helps developers inspect the accessibility trees of their apps. The argument that apps don’t get fixed because devs don’t understand AT-SPI is… not really the cause of the problems. Just compare with Windows. When was the last time you discovered an inaccessible app, you wrote the developer “Hey, your app is inaccessible, but there is this great, well-documented MSUIA thing you can implement and get it working”, and the developer was like “Awesome, thanks for your feedback!” and made the app accessible? I don’t want this to sound rude, because it’s not meant that way; I just want to point out that this is very low-level stuff that no one but accessibility experts understands, and making applications accessible using these APIs is beyond one person or even a small team without the relevant expertise. Not even highly popular GUI libraries have managed to implement it properly. Consider Java and Qt; GTK never even bothered. Yet that is exactly how app developers should support accessibility: by using their GUI frameworks correctly, with the frameworks responsible for handling the MSUIA stuff. Many of them never really did. That’s why Matt Campbell started the AccessKit project, which is really promising in terms of cross-platform accessibility; he has already improved AT-SPI quite a bit and other changes are planned, see your other linked post. That’s why I would say Linux not only has a universal accessibility API, it’s also one of the most open APIs we have these days.
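
    To make the “accessibility tree” idea concrete, here is a minimal Python sketch of the kind of tree dump that such inspection tools perform. It is only an illustration under assumptions: it presumes the pyatspi2 bindings are installed, and the helper names (`walk`, `format_tree`, `dump_desktop`) are made up for this example. The walker relies only on `name`, `getRoleName()`, `childCount` and `getChildAtIndex()`, so it can be tried on any object that offers those.

```python
"""Sketch: print the accessibility tree that AT-SPI exposes for running apps."""


def walk(node, depth=0, max_depth=3):
    """Yield (depth, role, name) for node and its descendants, depth-first."""
    yield depth, node.getRoleName(), node.name
    if depth >= max_depth:
        return
    for i in range(node.childCount):
        child = node.getChildAtIndex(i)
        if child is not None:  # a child can disappear while we are walking
            yield from walk(child, depth + 1, max_depth)


def format_tree(root, max_depth=3):
    """Render the tree as indented 'role: name' lines."""
    return "\n".join(
        f"{'  ' * depth}{role}: {name or '(no name)'}"
        for depth, role, name in walk(root, max_depth=max_depth)
    )


def dump_desktop(max_depth=3):
    """Print the tree of every application registered on the AT-SPI bus."""
    import pyatspi  # assumption: the pyatspi2 bindings are available

    desktop = pyatspi.Registry.getDesktop(0)
    for i in range(desktop.childCount):
        app = desktop.getChildAtIndex(i)
        if app is not None:
            print(format_tree(app, max_depth))
```

    Calling `dump_desktop()` inside a running desktop session prints one indented block per application. A control that shows up as “(no name)” is typically the kind of thing a screen reader user would report to the app’s developer.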

    Btw, quite a lot of the Linux GUI environment is already accessible. As I mentioned, I’m a daily user and things are at a really decent level, but things are just starting to move. It’s not so long ago that KDE selected accessibility as one of its main goals, and a lot of work is going into making that commitment real. COSMIC desktop, the new environment of Pop!_OS written in Rust, builds on GUI frameworks that are working on implementing AccessKit, and System76 has committed to supporting the accessibility of the environment. A lot of development is happening in Orca: we’re getting a lot of new features, refactors and bugfixes, improving the experience really significantly. Red Hat has hired two people that I’m aware of specifically for accessibility positions, to work on improving GTK 4 and GNOME accessibility, and the work done by the accessibility teams has already shown up in both the environment and the framework. Wayland is a common bogeyman across the whole Linux world, not just the accessibility one. But the reality is not that bad. There are a few annoying minor things we’re working around right now, like the clipboard being available only to applications that have a window, which breaks flat review copy (Orca came up with a feature similar to JAWS’ virtualisation to fix this). On one hand, yes, it is annoying. On the other, when I realise that any Windows program can freely read the clipboard without any special privileges, and that this has already been abused in the past to steal money by swapping crypto addresses, I’m not so sure which is actually the inconvenience.

    1/2

    • @[email protected]
      1 year ago

      Ok, so much for this post, I guess. :D Sorry if it got somewhat exhaustive. Linux is, like many environments unfamiliar to people, subject to a lot of myths and false information, so I thought it might be a good idea to clear them up before discussing where to move next. I shall do that in another post, actually getting to the OP’s question. :D Sorry again. Perhaps one last thing to address: people write that Linux has not been developed for desktop use. Well, it’s true; Linux has been developed as a general-purpose operating system, for any use one can imagine. We have really fancy desktops, we have awesome web interfaces, and folks running servers can enjoy the power of containers. Actually not just them: Vanilla OS has brought many interesting and innovative concepts, from the immutable desktop (which is a somewhat misleading name in a certain sense, but whatever) to the principles of its apx manager, which leverages containers to unify software from different Linux distributions and integrate it into the OS. That flexibility is something other systems can hardly compete with, so yes. We’re not just desktop folks, we have much more in the stack!

      P.S. I almost managed the character limit! 300 characters, what a pain, I hate these restrictions. :D

      2/2

    • Samuel Proulx
      1 year ago

      To respond to some of your points:

      Regularly tracking the blind Linux mailing lists and guiding newcomers on getting their systems accessible, problems with audio are actually very rare these days.

      And this, right here, is the issue. I don’t need guidance on either Windows or mac. I turn it on, and it works. I can install either Windows or Mac, by default, with a screen reader. The recovery partitions created by both Windows and Mac support screen readers, so I don’t need to keep a thumb drive with a live image around. Windows and Mac updates don’t break audio. They don’t require me to hack around with environment variables to get audio working, or to set accessibility options. I have a machine right beside me that I use as a home server, running the latest Ubuntu LTS. It has standard intel onboard audio, and when I tested just now with the TTY, even though the audio drivers seem to be installed, audio isn’t working. The only way I can use the TTY at all is with an ancient dectalk hooked up with a USB to serial port adapter, and even that took an hour of messing around with modprobe and other nonsense. I have no idea why; as this is a server, I don’t really care, but if it was my desktop (meaning I probably wouldn’t have SSHD going) I’d have no way to fix anything. I can only even test it because I happen to be old enough that I have dusty ancient DecTalk hardware in the cupboard. I can say that of the seven or eight Linux systems I’ve worked on, I have never one single time encountered a system with working audio. There’s probably some special thing I have to do to enable it, but without another computer I can use to look up what that might be, and the awareness of where and how to look, I’ll never know what it is. I’m sure you can tell me. But that’s not the point. The point is that by default, Linux systems are unusable by screen readers, unless you know exactly what to do and how to do it. How many thousands of hours did you spend figuring out how to get Linux systems working for you?

      Perhaps a more straight-forward interface wouldn’t hurt, but other than that, it works, whatever. :D

      And this kind of dismissive response to getting easier interfaces is another reason why Linux won’t be used by most people. I do complex presentations for work, that involve anywhere from three to five audio sources, and I need to change how they’re routed at least once or twice mid-presentation. This needs to be frictionless, as I need to do it while speaking, in less than 15 seconds. My use case isn’t that unusual, especially for blind people with screen readers presenting on Zoom or similar.

      screenreaders are applying workarounds for years to prevent soundcards from auto-sleeping,

      And these can be configured easily, usually work, and are fully supported and understood. None of that seems to be so on Linux, though as I’ve never found a Linux system with working audio, I can’t speak from direct experience.

      As a blind student using Windows, I was regularly calling my classmates to read me the screen, because I did something as simple as trying to turn my computer on.

      On Windows 11? Press control+windows+enter and narrator will launch. Even if you’re at the system recovery prompt. As long as you’re out of the BIOS, narrator will run.

      Just compare with Windows. When was the last time you discovered an inaccessible app, you wrote the developer “Hey, your app is inaccessible, but there is this great, well-documented MSUIA thing you can implement and get it working”, and the developer was like “Awesome, thanks for your feedback!”

      On mac? Last week. On Windows? A couple months ago. I was able to explain to them how the interfaces in Objective C or C# or whatever they’re using support accessibility, and point them at resources to work with it. On Linux, most things still are QT or GTK, and neither system properly supports accessibility without a bunch of hackery.

      Yes, improvements are happening. But they’re at least 10 years behind mac and Windows, and they’re going to require similar types of redesigns of basic building blocks that Windows and Mac required.

      • @[email protected]
        1 year ago

        And this, right here, is the issue. I don’t need guidance on either Windows or mac. I turn it on, and it works.

        So it does on Linux. Perhaps I phrased my point poorly. By writing that I follow the Linux mailing lists and help new users get onto Linux, I meant to point out that I have experience with people of different backgrounds, skill sets and setups, in different environments. In other words, I’m not claiming that all setups work based on just my own laptop; I’m saying that, in my work with various people, audio usually works out of the box, and when it doesn’t, it usually turns out to be a corrupted image or something along those lines.

        As for general guidance for newcomers, this is something you simply need for anything you want to learn in life. You once had to learn that computers have a keyboard and speakers and how to use them; someone probably told you how the internet works; you likely figured out a lot on your own; and you likewise needed to learn how to operate smartphones, etc. Each platform and each device is different, and when you come to a new environment, it takes a while to get to know it. That’s completely natural and okay. To make this process easier, the community is here to answer any questions. It’s the same on blind-tech mailing lists, where I answer questions daily from Windows users about their systems, because something doesn’t work for them.

        I do complex presentations for work, that involve anywhere from three to five audio sources, and I need to change how they’re routed at least once or twice mid-presentation. This needs to be frictionless, as I need to do it while speaking, in less than 15 seconds.

        I don’t. And no matter how pretty an interface Windows may offer, if you told me “Hey, we have a presentation in 30 minutes, I need you to set up these things”, I would have absolutely no idea what to do. Actually, I don’t even think Windows supports this on its own. I do remember people who work with sound using Virtual Audio Cable for this routing magic, but that’s specialized software you need to get familiar with, study, and properly install and configure. If you are already dedicating that time and those resources to achieving something, what’s the difference between studying VAC and JACK? Actually, you may not even need something as complex as JACK; I think it’s used mostly by musicians and advanced audio engineers who need super-low latency and changes to the way the system routes audio chunks. For just routing streams, perhaps PulseAudio already offers what you need.
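
        As a rough illustration of what “just routing streams” can look like, here is a small Python sketch that wraps the `pactl` command-line tool, which works against PulseAudio and against PipeWire’s PulseAudio-compatible server. It assumes that `pactl list short sink-inputs` prints one tab-separated line per playing stream (stream index, sink, client, then further fields); the function names and the sink name in the usage note are hypothetical. This is a sketch, not a polished routing tool.

```python
"""Sketch: move playing audio streams to another output using pactl."""
import subprocess


def parse_sink_inputs(text):
    """Parse `pactl list short sink-inputs` output into a list of dicts.

    Assumption: each line is tab-separated and starts with the stream
    index, followed by the sink and the client it belongs to.
    """
    streams = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip blank or unexpected lines
        streams.append(
            {"index": int(fields[0]), "sink": fields[1], "client": fields[2]}
        )
    return streams


def move_command(stream_index, sink):
    """Build the pactl call that reroutes one stream to the given sink."""
    return ["pactl", "move-sink-input", str(stream_index), str(sink)]


def move_all_streams(sink):
    """Reroute every currently playing stream to `sink` (name or index)."""
    listing = subprocess.run(
        ["pactl", "list", "short", "sink-inputs"],
        capture_output=True, text=True, check=True,
    ).stdout
    for stream in parse_sink_inputs(listing):
        subprocess.run(move_command(stream["index"], sink), check=True)
```

        For example, `move_all_streams("headset")` would send everything currently playing to a sink called `headset` (a made-up name; `pactl list short sinks` shows the real ones). Binding a call like this to a keyboard shortcut is one way to get a routing change that does not need any GUI at all.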

        And these can be configured easily, usually work, and are fully supported and understood.

        Most things can be configured easily if you understand them. A learning curve is a natural part of anything one is new to. Well, Linux is unfortunately not taught in elementary schools, not preinstalled on 99% of laptops, and not the subject of 99% of tech discussions on non-tech forums. If it were, it would be generally well understood how to set up an accessible Linux installation, and a few adventurers trying out “the mythical Windows” would be complaining about how many hours they spent debugging until they found out that the high latency of their screen readers was caused not by poor system responsiveness but by sleeping sound cards. Perhaps they wouldn’t even figure it out; they would just conclude that the Windows accessibility API is bad and return to the well-understood and established Linux. So much depends on background and social awareness that some skills seem completely natural and obvious, while they actually had to be acquired. It’s just that when everyone does something, the path appears more straightforward and natural, because you don’t need to ask how to do things: you see everyone else doing and talking about them, and in the process of following along, you don’t even realize how much you learn.

        Does Linux do everything it can to present itself to blind people? Certainly not! There is already a lot of material for sighted folks. Blind people can find some pointers, but they’re usually minimal, easily out of date, or even outright wrong; we could certainly do much better in this regard as a community. Currently, the best and most up-to-date support is on our community mailing lists, where new people come to ask things like “What is Linux?” or “Which distribution should I choose?”, and we help them as best we can.

        But I think it’s important to recognize this is a communication problem, not a technical problem. If we wanted to make Linux work exactly like Windows, so that people wouldn’t need to learn anything new, then why would they want to try a different system? They could just stay with Windows. Linux is awesome, Linux is cool, but it’s Linux, not Windows. That’s a feature, not a bug.

        On Windows 11? Press control+windows+enter and narrator will launch. Even if you’re at the system recovery prompt.

        Awesome. Now what’s the point, if all the prompt lets me do is run a recovery that fails in the end, or restart my PC? I mean yes, if the recovery is successful, fair enough. But then, in my 4 years of using Linux, the system has never, ever failed to boot. And if it did, it would likely be for a reason that auto-recovery couldn’t handle. I regularly get upset when using the Windows terminal even in a standard environment, because NVDA’s flat review never works as expected overall. Fixing a broken installation through a TTY using Narrator sounds like a nightmare, even if it were actually possible.

        On mac? Last week. On Windows? A couple months ago. I was able to explain to them how the interfaces in Objective C or C# or whatever they’re using support accessibility, and point them at resources to work with it.

        Well, you could likewise point them to AT-SPI resources and documentation. There are resources, Python libraries, Rust libraries and C libraries. I reported several accessibility bugs to the Flutter devs and they implemented the necessary interfaces, so those are fixed now. As long as someone has the expertise and the people to work on these things, it doesn’t matter whether they use MSUIA or AT-SPI.

        On Linux, most things still are QT or GTK, and neither system properly supports accessibility without a bunch of hackery.

        That’s not really true. In the case of GTK, as long as you use the proper components, accessibility works just fine, and that’s the situation with all accessibility-aware GUI frameworks, on all platforms. Qt is a little more complicated, since it’s notorious for its inaccessibility even with proper usage, but there are Qt apps that actually work really well, like KeePassXC, so again, it comes down to individual apps. Many things these days are Electron or Tauri anyway, and those work very nicely with Orca, so the field is rather diverse. And even if something does not work, there is usually a TUI for it, which is 100% accessible.

        • Samuel Proulx
          1 year ago

          there is usually a TUI for it,

          Average users just aren’t going to use a TUI. Heck, I work at a tech start-up, and I avoid TUIs when I can. I have work to get done, and that work doesn’t involve fiddling with my machine. Both Windows and mac let me get the job done with a minimum of messing around. I work in tech all day; my objective is to get everything done with as little effort as possible not directly related to my actual job. Linux does that in the case of servers, not in the case of desktops. This instance is hosted on an Ubuntu machine, but it’s dockerized and automated. I need to log into it maybe once a month. I have nothing against Linux, but I still maintain that the basic decisions behind it make it totally unsuitable for an accessible desktop for the vast majority of users, without major infrastructural changes.