Interesting article here by Tyler Spaeth for Parthenon Software, comparing image recognition software available from Google, Amazon, and CloudSight. The Robovision software is available here and can be tested in your browser here to compare results. I ran a few images through the online Robovision and even got back some results where CloudSight OCR’d text in the image, so that’s neat! For another sample, Rekognition couldn’t ‘see’ the image at all while CloudSight gave a spot-on description of it; for yet another, the software described the breed of dog (fawn pug) and what it was doing (eating a slice of pizza). Crazy!
(Image of pug eating pizza from WNYC.org’s ‘Dishing on Pizza’ segment.)
Machines have been “seeing” for decades. Early uses of machine vision included a wide range of applications, from medical imaging to checking products for defects. More recent work has focused on improving image recognition: looking at a picture and (somehow) determining what it might be a picture of. Uses range from tagging and cataloging images for retrieval, to determining whether images violate Terms of Service.
For example, most social networks currently employ human “content moderators” to handle this challenging work. Facial recognition algorithms are trained by running them on sets of images and checking the results. Law enforcement users of this software have discovered that a diverse set of training images helps the algorithms recognize faces across a full range of ethnic groups.
To make it easy for you to experience the capabilities of commercially available image recognition, Parthenon’s Tyler Spaeth and Tim Tate have written an application that lets you compare outcomes from Google Vision, Amazon Rekognition and CloudSight.
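For the curious, here is a rough sketch of what a side-by-side comparison like this might look like in code. It is not the Parthenon application: the function name compare_labels and the fake_provider_* functions are hypothetical stand-ins, and the real services return differently shaped responses, so each would need its own adapter that boils its answer down to a list of labels. The sketch only shows the comparison step, assuming such adapters exist.

```python
from typing import Callable, Dict, List

# A provider is any callable that takes image bytes and returns a list of labels.
# (Assumption: real services would each need an adapter to fit this shape.)
Provider = Callable[[bytes], List[str]]


def compare_labels(image: bytes, providers: Dict[str, Provider]) -> Dict[str, object]:
    """Run one image through every provider and report per-provider and shared labels."""
    by_provider: Dict[str, List[str]] = {}
    for name, describe in providers.items():
        try:
            labels = describe(image)
        except Exception:
            # A provider that cannot process the image contributes no labels.
            labels = []
        # Normalize case and whitespace so "Dog" and "dog " count as the same label.
        by_provider[name] = sorted({label.strip().lower() for label in labels})

    label_sets = [set(labels) for labels in by_provider.values()]
    shared = set.intersection(*label_sets) if label_sets else set()
    return {"by_provider": by_provider, "shared": sorted(shared)}


# Hypothetical stand-ins for real services, returning canned labels.
def fake_provider_a(_image: bytes) -> List[str]:
    return ["Dog", "Pug", "Pizza"]


def fake_provider_b(_image: bytes) -> List[str]:
    return ["pug", "food", "dog "]


def failing_provider(_image: bytes) -> List[str]:
    raise RuntimeError("could not process image")


report = compare_labels(b"", {"a": fake_provider_a, "b": fake_provider_b})
print(report["shared"])
```

Treating each service as a plain callable keeps the comparison logic independent of any one vendor, and a service that fails on an image (as Rekognition did on one of my samples) simply shows up with an empty list instead of stopping the run.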