Hacker News
by olalonde 3520 days ago
I'm not blind but here's an (unsolicited) project idea for you.

To be candid, I have no idea what it feels like to be blind and have never paid much attention to accessibility other than reading a tutorial or two and making sure I use alt tags on my images. The main reason for that is that I'm lazy and based on my experience, most developers are in the same boat.

Now, if there was a service which would spin up a remote VM session inside my browser (a bit like BrowserStack or SauceLabs do) with all the screen reader software set up and no screen (only audio), it'd make it a lot easier for me to experience my software as a blind user. There should probably also be a walkthrough to help new users get started with the screen reader. If you're lucky, you could turn this into a business and it could indirectly help you achieve your goal of making better software for the blind by exposing more of us to your issues.

Anyways, I know you probably have more pressing issues to solve and I hope I didn't come across as arrogant, just throwing the idea out there.

9 comments

As a partially sighted developer specializing in assistive technology, with several friends who are totally blind, I think this is an excellent project idea.

The cheapest way to do it would probably be using the Orca screen reader for GNU/Linux, probably combined with the MATE desktop (forked from GNOME 2) so one doesn't have to worry about 3D acceleration in the VM, which will presumably be hosted remotely on a cloud provider somewhere. The main technical challenge that springs to mind will be capturing all keyboard events in a browser window. This is particularly important because screen readers tend to rely on esoteric keyboard commands, which repurpose keys like CapsLock and Insert as modifiers. I don't know if this can actually be done in a normal web browser.
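
A minimal sketch of why the modifier keys matter, assuming the browser can deliver raw key-down and key-up events at all (which is the open question here). The key names and the command table are hypothetical, chosen for illustration rather than taken from Orca or any other screen reader. It shows that the remote side has to see the CapsLock or Insert press as its own event, and not only the final character, to resolve a chord:

```python
from dataclasses import dataclass, field

# Hypothetical: keys treated as the screen reader modifier.
SCREEN_READER_MODIFIERS = {"CapsLock", "Insert"}

# Hypothetical command table, for illustration only.
COMMANDS = {
    "T": "read window title",
    "ArrowDown": "read from cursor",
}


@dataclass
class ChordTracker:
    """Tracks held keys and resolves modifier+key chords."""

    held: set = field(default_factory=set)

    def key_down(self, key: str):
        """Return a command name if this key press completes a chord."""
        self.held.add(key)
        modifier_held = bool(self.held & SCREEN_READER_MODIFIERS)
        if modifier_held and key not in SCREEN_READER_MODIFIERS:
            return COMMANDS.get(key)
        return None

    def key_up(self, key: str) -> None:
        """Forget a key once it is released."""
        self.held.discard(key)
```

If the browser or OS swallows the Insert key-down before the page sees it, `key_down("T")` arrives with no modifier held and is treated as ordinary typing.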

Anyway, just throwing out my quick thoughts on this. I don't currently have time to pursue it further myself.

> I don't currently have time to pursue it further myself.

I work for a non-profit where we tackle accessibility issues related to the web, documents, and tech in general. We have a few Vagrant boxes that we use for development and testing, one of them is a Fedora box (GNOME 3 though) that comes with Orca configured [1] so that it doesn't prompt you for setup options. Chrome and Firefox are installed as well. If you have Vagrant and VirtualBox installed you can make use of it like so:

    vagrant init inclusivedesign/fedora24 && vagrant up

The box is ~2 GB. This is the repository for the box in question:

* https://github.com/idi-ops/packer-fedora

* https://atlas.hashicorp.com/inclusivedesign/boxes/fedora24

We track Fedora releases and update boxes fairly regularly so there should be a Fedora 25 one with Orca once there's an official release upstream.

I hope it can be of use to anyone here. If you have any questions we hang out in #fluid-work on Freenode.

[1] https://github.com/gpii-ops/ansible-gpii-framework/blob/mast...

Do you think Fedora is a usable option as a totally blind person? I'm a totally blind software developer and looked at Linux several years ago. It didn't "just work", and since I already have JAWS for my job, which requires Windows, I never bothered trying to use the Linux GUI for an extended length of time. I'll have to look at this.

I'm blind and prefer Fedora because it tracks GNOME more closely, and GNOME seems to reliably get accessibility right whereas Ubuntu/Unity doesn't.

So admittedly my development workflow isn't super high-tech. I do lots of JavaScript, some Rust, and a few others. All my languages have reliable command line tooling, which of course works well under Linux.

Some blind folks advised me to try Windows because it was supposed to make me more productive. I tried it for about a year and a half. I've used Linux since Slackware96, and whenever something failed under Windows I was stuck googling error codes and tracking down system logs. I can launch a Linux system upgrade from the command line. If it fails, it fails for an obvious/searchable reason, and prints its failure cause in the terminal. I don't have to track down logs in non-standard locations, google odd hex codes, etc.

Under Windows, the best I could find for accessible JS/Rust dev was Notepad++. That's just an enhanced text editor. At that rate I might as well use Gedit/Vim under Linux for development, which I do and it works well.

If you're developing heavily in Windows specific tech, then Linux wouldn't be a good fit. But as a technical user I'm quite happy with Linux generally, and Fedora specifically. About the only accessible Windows things I miss are audio games and Netflix, and my VM satisfies most of those. There are corner cases where Orca/Firefox act up, but under Windows there were lots of cases where I fought the OS, so there's just no perfect solution. I'd take a stronger foundation over slightly less accessibility any day.

Windows 10 just works and now we have WSL (i.e. bash, i.e. run all your favorite CLI tooling). In general, Microsoft has gotten a lot better about Windows just working. Perhaps Linux has too, but you still had to make sure you had fully Linux-compatible hardware and then do extra steps to get Wi-Fi, last time I played with it.

Also, the web browsing experience on Windows is so much better, and the audio stack doesn't fall down at the drop of a hat because you edited a config file wrong (hope you have someone sighted who knows how to unedit it for you). I'm not sure I'd call Linux a stronger foundation; this was not at all my experience with it. OS X is, but then desktop VoiceOver sucks to the point where you can't really program with it (basic things like the terminal do odd things, never mind the 10 or so keystrokes needed to navigate from code to the project explorer in Xcode, and we have to mention the speech latency). Then they just killed the function keys, which is an additional problem knocking OS X off the list.

But I think the biggest thing about Windows for me is that it's got synths which are capable of being intelligible upwards of 800 words a minute. Linux didn't even let you get at these settings via Orca last I tried it, and you can't set the inflection either, so it never emphasized punctuation. When your interface is linear and top-to-bottom, the biggest bottleneck in the general case is how fast you can go with the synth, and any platform which significantly cuts this down is therefore not a winner in my book.

But whether or not you agree with my points, I consider it pretty clear-cut that only a blind programmer even has the option of trying Linux in the first place, and certainly not a new one at that. You need too much knowledge to have even a halfway decent experience. In terms of making things accessible and having them matter, you've got to hit Windows first.

I didn't know that Fedora's accessibility was better than Ubuntu's. I had pretty much stopped using Linux on the desktop because Orca in Ubuntu just left me wanting more. But I'll definitely give Fedora a try now. Thanks.
Do you know if Eclipse works for Java programming with Orca? That's what I use for my job so could not use Linux as my primary OS if it does not work.
This isn't going to answer your question, but the reasons we support this Fedora VM image are that our projects make use of it in CI environments and that one of our team members worked on the GNOME screen magnifier. I saw your question regarding whether Eclipse paired with Orca is a viable, accessible option on Fedora for Java development. I don't use Eclipse, but I can ask around and get back to you. I would suggest that, if possible, you try Eclipse in the Fedora environment I mentioned with a project you're familiar with; I can help with at least installing and configuring it, but ultimately customizing something like Eclipse is a personal and soul searching experience :P

BTW, we provide a Windows 10 Vagrant box [1] as well. I just didn't mention it because it doesn't come with NVDA or the evaluation version of JAWS yet. That will happen soon though.

[1] https://github.com/idi-ops/packer-windows

> I work for a non-profit where we tackle accessibility issues related to the web, documents, and tech in general.

What's the non-profit? Expose UX [1] is preparing an episode focused on accessibility for Global Accessibility Awareness Day: startups will get judged by UX judges based on the accessibility of their products. Would love to connect with your org.

[1] http://ExposeUX.com

It's the Inclusive Design Research Centre at OCAD University. My email is in my profile, feel free to reach out :)

http://inclusivedesign.ca/research/ocadu/

This is really cool, thanks for sharing. Will try it out.
Very nice resource, thanks for putting this together and sharing it.
I feel like you and I have had a vaguely related conversation before, but do any of the remote desktop access protocols support piping sound in addition to graphics? That might be a quick-and-dirty way to prototype something. I've been looking for a reason to play with Elixir/Phoenix, and if there's interest in this then I may try some sort of one-click Orca VM that pipes everything back to the browser. Interesting idea.

Also, I feel like there was an early version/prototype of NVDA Remote that ran in the browser. I remember going to a page, turning on forms mode or whatever NVDA calls it (I've been out of the NVDA loop for a while) and I could send keys/get audio from the remote machine. I think that was before the addon was available so I'm pretty sure it was web native, but I could be misremembering. I don't think there's anything preventing transmission of the Insert key, at least. Capslock or other esoteric modifiers may be trickier.

Hmm, I don't remember any such prototype/demo. However, the NVDA remote protocol is pretty simple: JSON messages over a TCP socket. Putting those messages on a websocket should be fairly straightforward. With the web speech API landing in Firefox as well as Chrome[0] you could even synthesize the speech at the client. There would still be the problem of sending special key sequences. Not just ins/capslock, but also shortcuts that are normally captured by the browser/OS might get tricky.
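
A rough sketch of the relay idea, not the actual NVDA Remote wire format. It assumes one JSON object per line (newline-delimited framing), and the `type`, `name`, `pressed` and `text` fields are hypothetical names made up for this example. The point it illustrates is that TCP delivers arbitrary byte chunks, so a bridge has to reassemble whole messages before forwarding each one as a websocket frame:

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one message as a single line of UTF-8 JSON."""
    return json.dumps(message).encode("utf-8") + b"\n"


class MessageBuffer:
    """Reassembles complete JSON messages from arbitrary byte chunks."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> list:
        """Add received bytes and return every message completed so far."""
        self._pending += chunk
        # Everything before the last newline is complete; the tail is kept
        # until the rest of that message arrives.
        *lines, self._pending = self._pending.split(b"\n")
        return [json.loads(line.decode("utf-8")) for line in lines if line.strip()]
```

Each dict returned by `feed` could then be sent as one websocket text message, and a speech message could be handed to the client's speech synthesis instead of streaming audio.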

Feel free to contact me if you want to develop an NVDA remote server in Elixir. I need a real project in Elixir to do more work in the language. I did some small Elixir projects and like it a lot.

[0]: https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...

I'm pretty sure the NVDA Remote that runs in the browser still exists. You have to have the add-on on the target machine, however.
RDP can carry audio.
It can, but not with a suitable latency for actual use. What you have to do is send the speech and resynthesize on the far end. There's an NVDA issue for supporting RDP channels [0], but NVDA Remote made this very low priority and it's potentially blocking on another one to refactor how NVDA handles speech.

[0] https://github.com/nvaccess/nvda/issues/3564

Even better would be to get something that can run in a docker container that you can just download and press go!
But that's a lot harder to monetize!
There's a guy who has a book on computer vision, and he sells more expensive versions that come with test images, so it might actually be worth it.

See: https://www.pyimagesearch.com/practical-python-opencv/

I've been working on a search engine for lectures (https://www.findlectures.com) and accessibility is an area I'd like to explore, but it's tough to know how to test something as an end user would see it.

This is really interesting. I've often thought about a browser extension that could describe a web page. I don't mean text, as screen readers are (usually) more than capable there. I mean a visual description like:

"The web site has an off-white background with black text. There is a horizontal menu at the top that fills the screen 95% horizontally and 10% vertically. The horizontal menu has a navy blue background with white text. There is a logo on the left of the horizontal menu filling 25% of the screen horizontally and 5% vertically. There are five menu items to the right of the logo image in the horizontal menu."

That's probably a little more verbose than it needs to be. But it could be a combination of CV, tapping into the renderer and traversing the DOM in order to best describe a page. Then if my screen reader isn't cutting it and I'm feeling lost I can just have the current view described to me.
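
The DOM-traversal part of this could be sketched roughly as below. This is only an illustration under assumptions: the `Box` structure is hypothetical, standing in for whatever an extension would collect from the renderer (for example, per-element bounding rectangles), and the wording template is made up. It shows how pixel geometry turns into the "fills 25% horizontally" style of sentence:

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A rendered element's bounding box in pixels (hypothetical structure)."""

    label: str
    x: float
    y: float
    width: float
    height: float


def horizontal_position(box: Box, viewport_width: float) -> str:
    """Classify the box's center into the left, middle, or right third."""
    center = (box.x + box.width / 2) / viewport_width
    if center < 1 / 3:
        return "on the left"
    if center < 2 / 3:
        return "in the middle"
    return "on the right"


def describe(box: Box, viewport_width: float, viewport_height: float) -> str:
    """Describe where a box sits and how much of the viewport it fills."""
    pct_w = round(100 * box.width / viewport_width)
    pct_h = round(100 * box.height / viewport_height)
    where = horizontal_position(box, viewport_width)
    return (
        f"The {box.label} is {where}, filling {pct_w}% of the screen "
        f"horizontally and {pct_h}% vertically."
    )
```

For a 1280x800 viewport, `describe(Box("logo", 0, 0, 320, 40), 1280, 800)` gives "The logo is on the left, filling 25% of the screen horizontally and 5% vertically." Colors and CV-based descriptions would need more than geometry.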

> The web site has an off-white background with black text

I'm genuinely curious now: how would the descriptions of color help?

That's a really fascinating idea!
just close your eyes !?
I know you can do that for blindness, but the challenge I see is gaining a good understanding of how people would actually navigate a site. For instance, perhaps someone who is legally blind might use a screen reader, or just magnification, depending on their preference.

From what I've read, screen readers are typically played at a very high speed (once you get used to it) - I don't know how you'd know things like that without advice from the people using them.

The second question is how you're supposed to navigate if you can't see the text well or at all - this might require adding features to the HTML to integrate with the screen reader.

I can visualize what my website looks like already, so if I shut my eyes I'd just be navigating it in my head.

Orca is a bad idea. Very few blind people use Linux because Linux desktop sucks, and Orca is always at least somewhat behind all the other screen readers. I have met a few blind people over the years who have used it for an ethical reason or because they can't afford Windows, but no one really likes it. Linux is indeed the cheapest option, but it's also almost completely ineffective in terms of making it actually work well for your users.

For starters, even getting low latency out of the Linux audio stack is a major headache, and the synth situation is abysmal. You can't even touch the config files for these yourself because if you break either--even temporarily--you now can't use the computer. Then you get into how all the graphical desktops have accessibility issues to one degree or another and how you have to use a separate screen reader for anything outside them.

What you want if you want a testing VM that actually has value is Windows and NVDA. NVDA is free in both senses and kind of the industry standard for sighted testing now. JAWS is still more popular, but this is slowly shifting in NVDA's favor, because JAWS costs roughly $1000 per user. The advantage of NVDA is that you can be sure all users can have it, and if it works with NVDA then it's very likely to work with JAWS without too much more work.

But sadly you can't just test with one; in the end you have to test with all of them. Things like ARIA are nice if used correctly, but the ARIA spec doesn't say much about what screen readers have to do, and no one implements 100% of it. It's very close to the situation with needing to test on multiple browsers.

You aren't coming off as arrogant at all. I've thought a lot about this, largely because I'm faced with it every day. There are probably two ways to approach the accessibility problem:

1. Make the inaccessible accessible through clever tools, relying on CV or similar.
2. Make tools that reveal to sighted devs how accessible their software really is, much like you've described.

These two approaches tackle the same problem from opposite ends and would hopefully meet somewhere in the middle. I view #1 as empowerment, getting back abilities that one has lost (or never had to begin with). I view #2 as awareness, giving the sighted visibility into where their software falls short in terms of accessibility.

I'm not sure which would have the biggest impact. But in my mind, I'd rather be empowered by technology. I'd be curious what other blind devs think, though.

I agree that it's good to make our own accessibility where possible. However, computer vision seems like gross overkill for making computer software itself accessible. After all, the information you need is already in the computer; it's just not yet being exposed to the screen reader in a standard way. So processing a screenshot or image from a camera would be very wasteful, though it would work as a last resort. But other hacks to make software accessible? Absolutely! For example, one can imagine writing bits of JavaScript code to fill in gaps in the accessibility of specific websites and applications.
That is indeed a very good idea.

I have been blind in the past for months due to an accident when I was a kid. Fortunately I was lucky enough that a brilliant professor was able to restore partial eye sight. Enough for me to be independent and to be a software developer by profession and traveling the world whenever I get a chance.

One of the things I found out was that it is very hard to explain to people what it is like to not see. One of the popular questions was "So what could you see?" I don't get that question often these days, but I generally asked them to think about how much they can see with their hands.

When you can't even imagine what it is like, going one step further and imagining how blind people navigate your application/web site is harder still. Right now you only have things like web accessibility standards and tests for that. It helps, but it is not the same as "navigating the app like a blind person".

If there's an easy way to test and experience your app/website then it will also be easier to get a requirement like that past a C-level exec.

Sorry that I don't have much to add at this stage as I'm in the middle of starting up a new product myself, but you are always welcome to contact me (contact info is in the profile)

The iPhone includes a VoiceOver[1] feature called Screen Curtain which keeps the digitizer on, but turns off the screen. This helps blind users save a lot of battery and has helped me, as a developer, experience using my app as if I was blind.

[1] http://www.apple.com/accessibility/iphone/vision/

For web apps you can install the tota11y [1] browser extension and use its experimental Screen Reader Wand feature to get an idea of how a screen reader will interpret elements.

tota11y uses Chrome's Accessibility Developer Tools. Deque maintains browser extensions that use their own open source engine [2]:

http://www.deque.com/products/axe/#aXeExtensions

Both engines could also be used in CI environments to perform a11y audits. That should help web developers target at least low hanging fruit.
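
To give a feel for what "low hanging fruit" in a CI audit means, here is a tiny stand-in check. It is not axe-core or the Chrome tools and does not use their APIs; it is a plain standard-library sketch with made-up class and function names that flags only one issue, `<img>` tags with no `alt` attribute at all:

```python
from html.parser import HTMLParser


class MissingAltAudit(HTMLParser):
    """Collects <img> tags that have no alt attribute at all."""

    def __init__(self) -> None:
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An empty alt="" is allowed: it marks an image as decorative.
        if tag == "img" and "alt" not in attributes:
            line, _column = self.getpos()
            src = attributes.get("src", "(no src)")
            self.problems.append(
                f"line {line}: <img src={src!r}> has no alt attribute"
            )


def audit(html: str) -> list:
    """Return a list of human-readable problems found in the markup."""
    parser = MissingAltAudit()
    parser.feed(html)
    parser.close()
    return parser.problems
```

A CI job could fail the build whenever `audit(...)` returns a non-empty list. The real engines cover far more rules than this, and passing them still says nothing about how the page feels with an actual screen reader.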

[1] http://khan.github.io/tota11y/

[2] https://github.com/dequelabs/axe-core

> For web apps you can install the tota11y [1] browser extension and use its experimental Screen Reader Wand feature to get an idea of how a screen reader will interpret elements.

An idea, perhaps, but that's all it would be rather than real-world data. Blind people don't actually use these tools.

More useful perhaps to spin up a blind person to remotely use your app or site? A sighted developer who only has experience using screen readers with one app is going to be like a chef with no taste buds.
Maybe I'm an idiot and completely missing something here, but couldn't this essentially be accomplished by enabling screen reader software on your own computer and turning off your monitor?
You could easily extend something like this to simulate the various age related sight degenerations, partial sight and the varying stages of blindness. That's as well as the high incidence of sight issues with type 2 diabetes, and other common conditions.

By the time someone reaches 50 there's a good chance that a proportion of the text on their phone, the web, their computer and even groceries is becoming unreadable without glasses.

Most app developers haven't a clue. Most of us 50 somethings hadn't a clue 10 years ago! Ctrl + in a browser is a brute force solution. Android's zoom is even worse: a lot of what you want zoomed simply isn't, while it enlarges the parts you can read just fine.

Compared to many of my age I'm lucky and rarely need glasses, but already it's very annoying!

Something like this could be as helpful as when I first saw colour blind simulators 20 years or more ago.

Fair question. On macOS, you can enable VoiceOver by simply pressing Command+F5, and there's even a built-in tutorial. Windows has a built-in screen reader called Narrator, which you can enable with Windows+Enter, but I don't think anyone seriously uses that for their daily work yet. So to get the real experience on Windows, you'd have to head to http://www.nvaccess.org/ and install NVDA. But that's also pretty straightforward.
The best way to evaluate software as a sighted user is to learn how to use the assistive technology with the screen turned on.

Sure you can play around with the screen turned off to get some sense for the experience, but with the screen turned on you can compare the visual experience with the non-visual experience.

Another issue with testing with the screen turned off is - if an element isn't accessible, how are you going to know that it should have been there...

To add, and perhaps this is a terrible idea due to naivete on my part, but if there are different voices/speakers on a page, make a special tunable stream where you can "listen" to all convos but tune in to the interesting ones, kind of like we do when at a gathering and overhear lots of convos but walk up to the ones which pique us and listen and even pipe in...

The tunable part should be made easy to do from a user perspective via some kind of "dial" "mechanism"