
Google's New Camera "Clips" Uses AI To Automatically Get Great Shots

Designed for parents and pet owners, it's meant to help you capture candid moments.


This is Google's new camera. It's called Google Clips.

Google

Its entire purpose is to automatically take candid photos of hard-to-capture subjects like kids and pets.

It's quite small, sort of cute, and basically a cube with a big lens in the front. There is no display or viewfinder; it's meant to be used hands-free via an attached clip that doubles as a stand. It costs $249 and works with iOS 10 and Android 7 or later. There's no ship date yet.

Wait, but what do you mean it automatically takes candid photos?

Yeah, so, here's where the camera gets weird.

The camera uses artificial intelligence to both evaluate picture quality and see if someone it "knows" is within view. If it decides that something is a good picture and it recognizes the subject (which could be a person or a pet), it takes a short clip — which can be saved as a video, a GIF, or one of Google's newly announced Motion Photos. You can also select still images if moving pictures are not really your thing.

It saves a stream of these photos to its internal memory. Then it connects wirelessly to your phone, and a new app called Clips shows a feed of "suggested clips." You can save or delete these, or set the camera to save all suggested clips automatically. You can also export photos to third-party apps, like email or Instagram.

Where the AI comes in

It is important to stress here that the camera isn't continually shooting and saving pictures, or taking them at set intervals. Rather, it is making value judgments about the shots it selects. It effectively acts as a personalized photo editor.

Google says it wanted to automate the process of both capturing and selecting great images, which means it wanted to alleviate the tedious process of flipping through lots of shots to find a good one, or scrolling through video to find the perfect moment. So it evaluates those photos on the device as they happen to determine what to save to memory. What's more, it's taking more pictures than it shows you in suggested clips. You can toggle a switch to see all the photos it takes. The suggested ones are the clips that the camera has judged to be delightful enough to rise to your attention.

Juston Payne, the product lead for Clips, told BuzzFeed News that the camera looks at many different elements in a clip to make those calls. It wants to see if the shot is stable and well lit. It looks for clips where people are smiling and have their eyes open. It has a bias for jumps and motion that indicate action. And most importantly, it has face detection that looks for a familiar face. (There are dog and cat classifiers too, Google says.)
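To make those criteria concrete, here is a toy sketch of how signals like stability, lighting, smiles, open eyes, motion, and face familiarity might be combined into a single "suggest this clip" decision. Every field name, weight, and threshold here is hypothetical; Google's actual system uses trained neural nets running on a dedicated chip, not a hand-written formula like this.

```python
# Hypothetical scoring heuristic in the spirit of the signals Google describes.
# The weights and threshold are invented for illustration only.
from dataclasses import dataclass

@dataclass
class ClipSignals:
    stability: float      # 0-1, higher = steadier shot
    brightness: float     # 0-1, higher = better lit
    smiling: bool         # subject is smiling
    eyes_open: bool       # subject's eyes are open
    motion: float         # 0-1, higher = more action (jumps, running)
    familiar_face: bool   # face (or pet) matches someone the camera "knows"

SUGGEST_THRESHOLD = 0.6  # invented cutoff for surfacing a clip

def score_clip(s: ClipSignals) -> float:
    """Combine the signals into a single score; higher is more suggestible."""
    score = 0.3 * s.stability + 0.2 * s.brightness + 0.2 * s.motion
    if s.smiling:
        score += 0.15
    if s.eyes_open:
        score += 0.1
    if s.familiar_face:
        score *= 1.5  # familiarity weighted most heavily, per Google
    return score

# A steady, well-lit clip of a familiar, smiling subject clears the bar:
good = ClipSignals(stability=0.9, brightness=0.8, smiling=True,
                   eyes_open=True, motion=0.5, familiar_face=True)
print(score_clip(good) > SUGGEST_THRESHOLD)  # prints True
```

A dark, shaky clip of a stranger would score well below the threshold and stay in the unsuggested pool, which mirrors the "below the waterline" behavior described later in the piece.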

Blaise Aguera y Arcas, a principal scientist with Google's machine intelligence group, says that the camera is powered by neural nets that were trained by human curators. (In essence, people helped the camera's machine learning software understand what makes a good shot.) When it matches the attributes of a good shot with a subject it knows, it shows you that clip.

Aguera y Arcas predicts that, going forward, the Clips cameras will begin to learn what types of photos specific people love. "That's very much our hope, where we can develop modes based on people's tastes."

What's also compelling about this, from both a privacy and performance perspective, is that all this happens in the camera itself.

Traditionally, pulling off this kind of image selection and processing would have had to take place "on a bank of desktops somewhere with powerful GPUs," Aguera y Arcas told BuzzFeed News. "This is the first moment that it could plausibly be done on the device," he said. "It was a process of getting a chip specifically designed to run neural nets at very low power."

And because this happens in the camera, it means that it can get better battery performance than it would if it were processing in the cloud. It doesn't expend resources transferring data to and from a remote server to be processed.

(Google claims three hours; we found it to be better than two but not up to three on a prototype running beta software.)

Also, on-device AI means that if your camera automatically captures an embarrassing moment, you can kill it before anyone else ever sees it. For example, the photo of my kid playing in the sprinkler was cute, true, but you could really see my back fat where I was bending over in the corner of the shot. Deleted.

Speaking of privacy!

There are several things Google did here to address privacy. For starters, it's offline. The photos are only stored on the device, unless you connect it to your phone and move them over (or set it to automatically do that). This means you have the chance to locally review everything it has shot. There's also a pulsing LED light that shows when it is active.

And finally, Clips purposefully looks familiar. Payne says Google wanted it to be instantly recognizable as a camera, and that "we were trying not to make it feel too much like a tech product." If someone else is wearing it clipped on their clothes, for example, you would immediately recognize that this thing is a camera and that it's maybe capturing your picture.


It's aimed at parents and pet owners.

Mat Honan / BuzzFeed News

Basically, because the idea is to make the process of both taking and selecting photos easier, Google is targeting this device at people who take a lot of pictures of difficult-to-capture subjects. Because kids and pets tend to be mobile and unpredictable, and yet people still take lots of pictures of them, it seemed like the right audience.

The company put the camera in the hands of lots of parents and pet owners to study how they used it. It then built functions into the camera to be more pet- or kid-friendly (or -resistant, depending on your viewpoint). For example, it has been trained not to capture pictures when it detects being covered by a hand. "So when a three-year-old grabbed it and ran off," said Google's Eva Snee, who is in charge of UX and research for Clips, "it didn't capture anything because he had his hand over it."

We found the clip to be particularly kid-friendly. It meant we could attach the camera to our shirt or jeans and get shots while still remaining in the moment. (Think: pushing on a swing set, or riding a bike.)

It comes in a little case, with a built-in clip and stand on the back.

Mat Honan / BuzzFeed News

That clip and stand are meant to help it take photos without your having to hold it. You can place it on something and it will stand upright in place, or clip it to something like your clothes or your bike if you're on the go. The case is also how you swap between portrait and landscape modes: just turn the camera 90 degrees in its case to go from one to the other.

Here are some technical specifications.

The camera has a 12 megapixel sensor and shoots video at 15 fps. It has 16 GB of memory and a 130-degree field of view. There is no microphone, no display, no speaker. File transfer to your phone is via Wi-Fi and Bluetooth Low Energy. At 54 x 54 x 36 mm and 55 grams, it is quite small. (We temporarily lost ours in the couch.)

To train the camera on someone new, just take their picture.

Mat Honan / BuzzFeed News

Google says that the camera will begin to recognize familiar faces, human or not, and begin taking pictures of them. But if you want to speed that up and introduce someone new, just press the button in the front to take a picture and make sure they are centered in the shot.

It did a great job (mostly).

With the caveat that this is an early-release device running beta software, Clips was mostly impressive, especially if you think of it as a gee-whiz, rather than must-have, product. (In fact, Aguera y Arcas went so far as to say it was "very much a V1, or even experimental, product" and that he was "not expecting a best-seller.") Image quality was good, but in the era of high-end phone cameras, it's not going to blow you away.

While it is certainly capable of taking beautiful pictures, the magic is not in the image quality as much as its ability to easily capture moments you previously simply could not. You can really see the AI at work when you swap between the raw stream of everything it has captured and the suggested clips. As Aguera y Arcas put it, "there are a broad set of moments that are just below the waterline."

That is, it takes a lot of photos that don't quite rise to the level it sets for suggesting them to you. (You can still go in, look at them, and save the ones you want.) I did end up grabbing a lot of these, but for the most part they were junk, ultimately a waste of time and space.

And that's what it is meant to do: It elevates the interesting so that you don't have to. In some ways, its mission is the same as Google Photos itself, which also tries to find and organize your best images for you. And it mostly pulls it off. Save for the occasional shot that reminds me I need to get to the gym.

CORRECTION

The Clips has 16 GB of storage. An earlier version of this story cited a different number.

Mat Honan is the San Francisco bureau chief for BuzzFeed News. Formerly a senior staff writer at Wired, he has been writing about the technology industry and its impact on society for nearly 20 years.

Contact Mat Honan at mat.honan@buzzfeed.com.
