AWS Deepens Object Recognition Technology

Expected to start shipping in mid-June, AWS DeepLens gives developers the chance to expand their machine learning skills.

By now, it's safe to say that most people are familiar with the concept of facial recognition software. It's one of those technologies that has existed for enterprise use for quite some time, but that is also making its way into consumer devices. My Microsoft Surface Book 2, for example, can use facial recognition in place of a password, thanks to the Windows Hello feature. I've also seen prosumer-grade cameras that automatically focus the lens on people the camera recognizes. Given the popularity of facial recognition software, it isn't exactly surprising that Amazon Web Services has gotten into the facial recognition game. However, AWS hasn't stopped at basic facial recognition. It has extended recognition to objects and actions as well.

The AWS solution for visual recognition is called AWS DeepLens. Before I explain what sorts of things AWS DeepLens is capable of, there are a couple of important things that you need to know about the service.

First, unlike AWS services such as Amazon Simple Storage Service (Amazon S3) or Amazon Elastic Compute Cloud (Amazon EC2), AWS DeepLens doesn't operate solely in the cloud. This makes sense when you stop and think about it, because object recognition usually involves the use of a camera. In the case of AWS DeepLens, however, a basic webcam won't do. Using AWS DeepLens requires a special DeepLens camera. This camera has the following hardware specifications:

  • 4 megapixel resolution
  • 8GB of RAM
  • 16GB of internal storage
  • A 32GB SD card
  • Wi-Fi connectivity
  • A micro HDMI port
  • An audio port
  • A USB port
  • An Intel Atom processor

The other thing that you should know about AWS DeepLens is that although AWS intends for developers to use the AWS DeepLens API to build business applications around DeepLens, the company has also made it possible for non-developers to experiment with the technology. Anyone can get started with AWS DeepLens by using the various pre-trained models, and from there you can build your own recognition models and applications.

So with that said, let's talk about what you can actually do with AWS DeepLens. The technology itself is super flexible, so you could conceivably use AWS DeepLens to build applications that are designed to recognize just about anything. However, there are several capabilities that AWS has already exposed through various recognition models.

Facial Recognition
The most obvious use for AWS DeepLens is probably facial recognition, and AWS provides a pre-built facial recognition model. In my opinion, facial recognition technology has nearly endless potential. So far, large-scale facial recognition technology has primarily been used for law enforcement and similar purposes. Police use facial recognition (and license plate recognition) to find criminals. Similarly, some casinos are known to use facial recognition technology to alert security to the presence of known cheats. However, there are also more benign uses for facial recognition technology. If you travel a lot, for example, you might use facial recognition and AWS DeepLens to alert you in real time to the presence of your house sitter or to the presence of someone who is not supposed to be in your house.
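To make the house sitter scenario a bit more concrete, here is a rough Python sketch of the decision that such an application would have to make once a face has been matched. It assumes that some upstream recognition step has already produced a name (or nothing at all when no match is found) along with a confidence score. The function name, the list of expected people and the 0.8 threshold are all hypothetical, and none of this is the DeepLens API.

```python
from typing import Optional

# Hypothetical list of people who are expected to be in the house.
EXPECTED_PEOPLE = {"house_sitter"}


def classify_visitor(name: Optional[str], confidence: float,
                     threshold: float = 0.8) -> str:
    """Return an alert level for one recognized (or unrecognized) face.

    `name` and `confidence` are assumed to come from an upstream
    face-matching step; they are not part of any AWS API.
    """
    # Treat a low-confidence match the same as no match at all.
    if name is None or confidence < threshold:
        return "unknown_person"
    if name in EXPECTED_PEOPLE:
        return "expected_person"
    return "unexpected_person"
```

An application could then send a routine notification for "expected_person" and a more urgent one for the other two results.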

Object Recognition
AWS also provides an object recognition project that demonstrates that AWS DeepLens isn't limited solely to recognizing faces. The Object Recognition project can recognize 20 different types of objects, ranging from people to plants to vehicles. This project is more of a proof of concept than something that you can immediately put to good use. Even so, it clearly demonstrates the ability of AWS DeepLens to recognize common objects.
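One way to turn a proof of concept like this into something useful is to tally what the camera sees. The Python sketch below assumes that a detector hands back simple (label, confidence) pairs, which is a simplification for illustration and not the actual DeepLens output format. The function name and the 0.5 cutoff are likewise hypothetical.

```python
from collections import Counter


def count_objects(detections, min_confidence=0.5):
    """Count detected labels whose confidence meets the cutoff.

    `detections` is a hypothetical list of (label, confidence) pairs.
    """
    return Counter(
        label for label, confidence in detections
        if confidence >= min_confidence
    )
```

Given a person at 0.9, a car at 0.4 and a second person at 0.7, the function would count two people and ignore the low-confidence car.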

Action Recognition
The action recognition project caught my attention because action recognition is one of the capabilities that I think holds the most potential. The demo project can recognize 30 really diverse actions, ranging from riding a bicycle to blow-drying your hair.

In the future, I think that action recognition could be used for quality control in factories. For example, a deep learning application could learn the steps required to correctly assemble a product, and could detect deviations from the correct assembly method. Similarly, fast food restaurants might use action recognition to monitor food preparation and health code compliance.
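The assembly line idea boils down to comparing the actions a camera observes against the steps that are supposed to happen. Here is a minimal Python sketch of that comparison. It assumes that an action recognition model has already turned the video into an ordered list of action labels; the function name and the step labels are made up for illustration.

```python
from typing import List, Optional, Tuple

Deviation = Tuple[int, Optional[str], Optional[str]]


def first_deviation(expected: List[str],
                    observed: List[str]) -> Optional[Deviation]:
    """Return (step index, expected action, observed action) for the
    first mismatch, or None if the observed steps match exactly."""
    for index, (want, got) in enumerate(zip(expected, observed)):
        if want != got:
            return (index, want, got)
    # A step was skipped at the end of the sequence.
    if len(observed) < len(expected):
        return (len(observed), expected[len(observed)], None)
    # An extra, unexpected step was performed at the end.
    if len(observed) > len(expected):
        return (len(expected), None, observed[len(expected)])
    return None
```

A real system would need to tolerate noisy or repeated labels, but the basic check is the same: find the first point where what happened differs from what should have happened.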

Of course, action recognition could also be used for crime prevention. A casino, for instance, might use action recognition to ensure that dealers adhere to the proper protocol when dealing card games. Likewise, retailers might build applications that recognize common shoplifting techniques.

I'll be the first to admit that action recognition isn't totally new. Some consumer-grade drones perform gesture recognition in an effort to recognize hand signals given by the pilot. However, the supported gestures tend to be rather simplistic and must be given in a very distinct way. What AWS is doing with action recognition is making it possible for a machine to recognize various actions without requiring them to be performed in an exaggerated, deliberate way.

About the Author

Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.

