CMOS #5 Lucy Greco - DictationBridge - helping the visually impaired
Interview with Lucy Greco about DictationBridge an Open Source tool to connect screen readers and speech-input software to help the blind and visually impaired.
We also talked a lot about accesibility in general and accessibility of web sites in specific.
Lucy Greco
Links
Transcript
It doesn't just read text, it reads things like buttons, controls, menus. It gives me the ability to say Bring me to the menubar and read the items on the menu.
It gives me the ability to move through a screen and find and discover things like buttons, controls, edit-fields, and hear text as I'm typing it, or be able to navigate an interface on a computer as effectively as a person who can just look at it and point at it with their mouse, I can actually give one of many, many commands to do any of the things I need to do, to be as effective as a sighted person on a computer.
This is a much more powerful thing and the one I'm using is actually an open source screen reader called NVDA, which stands for Non-Visual Display Access, and it's been around for about ten years and it's a really fantastic screen reader. I'd say right now it's probably one of the best ones on the market if not the best one on the market.
And the way to do this is by following the W3C Standards. W3C web-accessibility guidelines 2.0AA is an international standard that, if you follow the rules within there, and if you follow the guidelines and best practices, your website will be much more effective for everyone. Not only for people with disabilities, but we find that when a website actually is accessible to persons with disabilities, it's actually more usable for everyone. When you start getting to fancy widgets and fancy divs, that you're using custom code that's not really good, pure HTML5, you end up making something that might be a little hard for the person with the disability and also we find that the average human being has problems with them, as well. That's my push for standards.
There's one called the WebAIM toolbar and WebAIM, which stands for Web Accessibility In Mind, is a fantastic tool. It also runs in Chrome, I think currently the FireFox version is not there yet, I think they have some problems with the FireFox version but it's supposed to be back any day now.
There's just hundreds of tools out there on the internet. All you have to do is look for web accessibility and you will find tons of pages, start with the W3C and work your way through, they have actually a list of tools. A majority of these tools are free, so they're things that you can incorporate into your workflow for free.
One of my favorites is a tool called Tenon, it's actually a very, very powerful tool and incorporates into many, many, I would almost say all, development environments and it's an API that you can actually push, that will, as you're writing your code and testing your code, it will test your code on the fly. It's probably one of the most-effective ones for checking for accessibility of your code. It's not free, it allows, I think, a certain number of pages per month for free, and then you can subscribe to actually get more.
My hands are getting old and getting sore, and can't take that kind of pressure any more, so I've turned to dictation, probably about 2000, and I've tried various different pieces of software to use dictation, ViaVoice when it was first available, Dragon NaturallySpeaking in every single one of its versions, I think I've paid for every single version of Dragon since version two.
And the problem is, is speech-dictation software doesn't communicate well with the screen reader. As the dictation software is placing the text on the screen, the screen-reader software isn't seeing it appear on the screen, so I could dictate and the text was there, but I had no way of knowing what the text it was typing, unless I went back and reviewed it, with my keyboard. So a bit of a contradiction, that you dictate to remove, or at least lessen, the amount of keyboarding that you're doing, by having to go back and review it, and go through it word-by-word or character-by-character, I'm actually typing just as much as I would have if I'd typed it to begin with. Now the problem with that is, dictation software's not perfect, I don't know if you've, you've probably had problems using your phone, dictating something and the wrong words come out.
I live in a houseand it would say
I live in a mouse.And if you don't fix the word
mouseto
house,it would actually get worse and worse, and eventually one out of every five words would be recognized instead of four out of every five words being recognized. It degenerates very, very quickly.
When I used to teach people how to use dictation software, I would tell them it takes a long time, up to three to four weeks, to get a really good voice profile. It's better today, dictation software is much better than that. But it can take you as little as an hour to break a voice profile by not correcting it. I mean it just disintegrates very rapidly. It's really a very serious thing, so you can't be dictating unless you can recognize what the problems are and fix them immediately.
No, I didn't mean a mouse, I meant houseand it would go ahead and fix that, and help me at least maintain a good voice profile.
One of the other things that DictationBridge does is, both Dragon and Microsoft Speech, which we're working with both of those, have terrible interfaces. The screen reader can't actually grasp a lot of the information in those interfaces, for example, the Microsoft Speech dictation correction-window is a floating window that the screen reader couldn't actually grasp and figure out where it is on the screen, because it's not really there in a focusable place. And of course, if you try to focus in that window, the window goes away. So we've got some coding that will actually tell the screen reader where that window is, it will read the information in that window, and it will let us control the window and keep it on the screen. So it's not only giving us the echo back but it's giving us the ability to use the tools in an accessible way.
Wake up.
This is a demonstration.
wake up
wake up
So I'm still using the beta version of the software and we haven't actually fully-implemented all the features we want. I turned the microphone off, so we can talk again.
So the goal of our project was to make a product like this, that has all the features we want, but we're really a grassroots group that believe in free software, so what we want is our product to be fully free for everyone and anyone to use.
And typically, people who are blind and visually-impaired, don't have a lot of money. There's some massive statistics around the world that say things like 70% of the population who are blind and visually-impaired are unemployed. So buying expensive software, which screen readers typically are, they're about $1,100 for the leading screen reader on the market, and buying dictation software, like Dragon NaturallySpeaking, for another, say $200, that's a major investment for someone who might be on Social Security or might have some form of limited income, or no income. They can't be spending that much money to be able to be using a computer and in this day and age, you pretty much need a computer to get a job.
So the goal of our project team was to make sure that a person who is blind or visually-impaired, could have access to dictation with the free screen reader, and using something as basic as Microsoft Speech recognition. So you could go out and buy-
So we wanted people who are poor to be able to use a computer, and learn how to use a computer, and access it, and if they didn't have the use of their hands, then they should be able to dictate. And that's the goal of our project.
So, no it's not limited to that. It is limited that you need to have a fairly high-end, well not actually even in this day and age, you could use a netbook. You could use a fairly low-end netbook. Anything over a 1.9 GHz processor, and that's even a single-core, will work. You do need to have about 4 gigs of memory, maybe 8 is better, but I have used this on a fairly low-end netbook and it works for me.
We have totally blind developers doing some of the coding, our primary lead-developer is low-vision. There's eight of us on the team and we have three developers currently, no sorry, four developers working on the project, out of the eight of us.
And each person has a bit of responsibility for a different part of the code, so we wanted to make sure that we didn't only support the one screen reader, the free screen reader, we wanted to support all screen readers. So each developer has a screen reader that they've been assigned to work with and I'm not a coder myself, but it's really interesting, it has to do with some very low system hooks. Very few coders in the world actually have the skill-set to do system-hooking that will do things like capture the text from dictation, going to, basically they're having to intercept text between three applications and make sure that the right information is captured, doing things like re-mapping keystrokes to voice-commands is also very complex.
So we've got some very good programmers and they are all contributing to this project. We did crowdfunding and the crowdfunding is paying for their time, because we don't believe that blind people should be forced to make things for themselves and do it for free. We wanted to actually pay them for their work. So we used crowdfunding to pay for their work but the product itself will be free. I'm not the best person to talk to about the coding but maybe you could have another episode with one of our developers, once the project is done.
And this is fully, anybody can fork the code, anybody can re-contribute back. This is a full free software but it is also very open to anybody contributing, in fact we are always looking for people who would be interested in helping with it.
The primary code for NVDA, for example, is in Python, but we have some C++ components, we have some system-hooking and DLLs. I've had a couple of people look at the source code and they say it is actually very easy to get in and look at the source code and understand and comprehend it.
One of the things we find critical in the project is that we want anybody to be able to create a patch to this, improve it, and then contribute it back. It could be a person who's blind or vision-impaired, or be it a person who's just a coder who wants to contribute to bettering a product, so that everybody's experience is better.
It turns out that Dragon uses a special plug-in to send text to browsers and we had to figure out how to intercept that plug-in and capture the text from that, and that was a real road-block, it pushed us back quite a few weeks.
So we're not setting any dates for anyone right now. But definitely keep an eye on the DictationBridge website and who knows, maybe we'll do another funding program in the future, we want this thing to go.
Published on 2016-09-08