The Microsoft Seeing AI app aims to be a game changer. Look, I get it. Saying the words “game changer” or using the phrase “you’ve gotta try this Microsoft product” is, for me, the equivalent of a chocolate addict saying something like “you’ve gotta try this avocado pudding.” Doesn’t seem to compute, right? And while I have actually done both in the last week, both to my surprise, I want to focus this post on the Microsoft side of the equation. The avocado pudding will likely show up in a future post, so keep your eyes peeled for that.
This week, Microsoft launched what it calls a research project, but what many in the blind and visually impaired community might very well call a game changer. For a company that, to many people, seems to be at least two steps behind when it comes to accessibility and innovation, the Microsoft Seeing AI app comes from far out of left field, but in my very humble opinion, this is a swing for the fences that results in at least a triple.
I’ve written at length about how Apple products and services have made my travels as a visually impaired filmmaker possible. How the on-board screen reading software, VoiceOver, keeps my work on a laptop humming. How the iOS version of same makes the device about as accessible as it can possibly be for the blind and vision impaired. How the Apple approach to accessibility has always seemed to be built into the recipe – that the product design and engineering teams seem to start with the goal of making a product or piece of software accessible from the ground up, rather than attempting to tack it on at the end (with the notable exception of creativity and design apps like Final Cut Pro and Motion, so listen up, Apple).
Microsoft, on the other hand, has seemed content to let third party developers do most of the heavy lifting. Blind and vision impaired users in a PC world know that their systems are not usable without the aid of software like JAWS and ZoomText, screen reading and magnification software that is built into the operating system in a Mac world in the form of VoiceOver and its own magnification algorithms. As someone who trains other visually impaired computer users to learn how to maximize their efficiency and proficiency so they can increase their chances of getting and keeping a job, this has always rankled me.
That’s why I’ve been so surprised this week to see such a game changing development come out of Redmond. I should add that I don’t use the term “game changer” lightly. Like the words “amazing,” “empowerment” and “gluten free,” it’s a phrase that is often overused and rarely necessary. However, if the Microsoft Seeing AI app is not a game changer, it is at the very least a major new player, and if it signifies Microsoft’s intention to lay a claim to earning the loyalty and the business of the visually impaired market, I’m ready to listen.
What the Microsoft Seeing AI app is all about
The Seeing AI app is designed to bring the world of text, products and facial recognition to you and your iPhone as quickly as possible. Yes, this is an iPhone app. As of this writing, it is not yet available for Android. But this is mostly a solid business decision by Microsoft because they must realize that Apple is so far ahead of the game when it comes to loyalty among the visually impaired mobile market that you’ve just got to swim in the pool where everyone has already put on their trunks.
The app is divided into five major sections and is completely compatible with VoiceOver touches, taps and swipe gestures. You switch from function to function with the single finger swipe up or down, while left or right swipes take you to the task bar actions within each function. That’s really about it. You will very likely begin using the Microsoft Seeing AI app functions without ever referring to the instructions, especially if you’ve used apps like the KNFB reader beforehand.
And make no mistake, this app intends to be a KNFB killer. By far, the biggest and loudest complaint about the remarkable KNFB text recognition app is that it costs $100 USD. That’s a hefty chunk of change for a lot of people. The Microsoft Seeing AI app is free. Yes, it’s free. That’s going to be enough to end the debate right there for many people, but let’s talk about what the Microsoft Seeing AI app does well, and what it does not do as well yet.
Where the Microsoft Seeing AI app is a game changer
The Microsoft Seeing AI app shines very brightly right after installation with the first of the five functions it offers. That first function is called “Short Text” and if you’re used to long load times for text recognition apps, this is, quite simply, going to blow your mind. Point the camera at a piece of text and it immediately begins reading that text to you. And I do mean immediately… like less than half a second from pointing the camera at text to hearing what that text is. You don’t need to snap a picture, send it to the cloud and wait for a server to do its job. The app seems to work without uploading anything to the cloud, relying instead on the iPhone’s onboard processor.
I have been using the “Short Text” function for the past three days on everything I could find, and I can tell you that being able to use my phone to instantaneously read envelopes, supermarket items, street signs and titles of books on shelves without any load times or fussing around with camera buttons is truly remarkable. This product works as advertised. While walking around my neighborhood yesterday, I stopped for the first time in years at a neighborhood bookstore that has a shelf of books outside with featured titles and used selections, and the ability to hold a book in one hand, point the phone at the title or description in the other hand and get instant feedback just by pointing the camera at the book… well, if you’ve never seen a grown man cry… actually, you still haven’t because I held the emotions in check, but it was close.
Yes, you can do something similar with the KNFB app, but KNFB readers know that it is a far bulkier process. You have to take the picture, wait for the transcription, navigate back to the camera page/app home screen, take another picture, listen to the new transcription and do this again and again until you’re done. The Microsoft Seeing AI app is as close to your eyes as technology has gotten yet. It auto refreshes, in real time, each time there is something new to read. It is truly remarkable. For this function alone, you should download this app.
What else is in the Microsoft Seeing AI toolbox?
The Microsoft Seeing AI app also has what we might consider to be a more conventional OCR function, called “Document,” and this will be familiar to users of the KNFB app. It’s designed for longer text documents like book pages, bills, bank statements and menus (although I’ve found the “Short Text” function quite useful for menus as well).
It’s not quite there yet. For larger and longer documents, the KNFB app is still the app to beat. It takes the Microsoft Seeing AI app a long time to frame a document, take a picture of it, upload it to the cloud for processing (yes, for this function, the cloud is definitely involved) and begin reading it to you. While the OCR itself is very good, the lag time is the issue. The KNFB app does a far better job at quick turnaround time for something like a printed page of information. Both apps can use VoiceOver gestures to navigate through document text. On this one, though, KNFB has the edge.
Faces and Products with the Microsoft Seeing AI app
The Microsoft Seeing AI app has a function setting for product recognition and another one for facial recognition. The product recognition function turns your camera into a bar code reader and while it also needs access to the cloud to tell you what product is being scanned, I’ve found it to be accurate and useful. What I’m hoping is that, given the amount of storage space in late model iPhones, the Microsoft Seeing AI app will eventually include an option to download the UPC database, which is usually only somewhere between 4 and 8 gigabytes. Having this information on the phone itself would be incredibly useful in supermarkets, where the back of the store is often impervious to cellular signal reception.
The facial recognition function is very good in a general sense. Once you program names to associate with the pictures of your friends, it does a good job in various lighting situations. Where it is unintentionally hilarious, though, is in describing the faces of people it doesn’t know. For some reason, the Microsoft Seeing AI app developers thought it would be a neat, “oh wow!” trick for the app to guess the age of the person being described. And I can tell you that when it guesses wrong on the young side – say, describing a woman in her forties as being 31 years old – it can be amusing. It is not as amusing, though, when it works in the other direction, or when it describes a woman with blonde hair as having gray hair. Listen up, Microsoft Seeing AI app developers, work on this or ditch the age function because you’re gonna get a lot of people ticked off.
The Microsoft Seeing AI app tries to tell it like it is
The Microsoft Seeing AI app also has a fifth function, called “Scene,” that it calls experimental, and rightfully so. It’s designed so that when you take a picture of what’s around you, it attempts to describe your environment. So for example, if you’re sitting in a coffee shop and take a picture of your surroundings, the app says something like “three people sitting at a counter drinking beverages.” I tried this by walking into a Starbucks and snapping a picture as soon as I walked in, and the result was “a crowded restaurant with what appears to be a line on the left.” You know what? That’s pretty great. As many blind and visually impaired people know, finding where the line is can be one of the most difficult parts of our navigational day. To get this one right even some of the time is huge. This function is described, and rightfully so, as a work in progress, and the instructions make a point of telling you that under no circumstances should you rely on it as a navigational aid. I agree. It’s not going to tell you if a car is coming or if you’re about to walk into a telephone pole. Keep the cane and your orientation and mobility skills. As always, they’re the real tools that get you going and functional.
But all in all, the Microsoft Seeing AI app is an app you should be including in your toolset. The “Short Text” function alone is worth the download. Personally, I’m keeping the KNFB app for the heavy lifting of serious document processing, and it was still, for me, $100 well spent. But I can see a lot of blind and visually impaired people downloading the Microsoft Seeing AI app and using it as their primary OCR tool, and I don’t blame them. After years of watching Microsoft take a back seat, Apple may have good cause to be looking in its rear view mirror.
Want to try it? Here’s the link to the Microsoft Seeing AI app on the Apple App Store.
Have you used the Microsoft Seeing AI app? What are your thoughts?