On Tuesday, the 2024 edition of Google’s annual I/O conference kicked off with a very long keynote at the Shoreline Amphitheatre in Mountain View, California. Those in attendance got to geek out over a presentation that leaned heavily on Google’s latest AI developments, with one of the most notable new features coming to Google Photos. Google is calling the new feature “Ask Photos,” and in practice, it’s a souped-up Google Photos search feature that’s able to intuit context from your photo library.
The example used in the keynote to illustrate how Ask Photos works was a situation where you need your license plate number but can’t remember it off the top of your head. As it was explained, Ask Photos knows which cars appear most often in your photo library and can intuit which car is yours, then pluck the license plate number from those photos. Another example given was that you could ask it to show your child’s swimming progress, and it would return results that discern the context of photos and videos of different types of swimming (in a pool, in an ocean, snorkeling, etc.) and sort those by difficulty, while also pulling up certificates from swim classes and competitions.
How exactly does Ask Photos work?
The official Google Blog posted more details about Ask Photos after the feature was introduced during the Google I/O keynote. The post uses a third example question to illustrate how the feature works: “What themes have we had for Lena’s birthday parties?” According to Google, when presented with this kind of query, Ask Photos can not only narrow down photos of a specific person’s birthday parties, but also understand whatever underlying theme there was to the decorations, cake design, etc. for each year’s party. In the blog post’s illustration of what the results look like, Ask Photos was able to discern that for “Lena’s” last four birthday parties, the themes were “A princess celebration,” “Under the sea with mermaids,” and “Two magical unicorn parties.”
As the blog post puts it, “Ask Photos understands your query, and then forms a plan to find the answer,” during which it can discern people, places, dates, and keywords, as well as “natural language concepts like ‘themed birthday party.’” To respond, Google says, Ask Photos draws on the “multimodal capabilities” of its Gemini AI, which can figure out what’s happening in each photo, including reading any text in it, en route to selecting the right ones and generating its response.
Google does stress, though, that this is an experimental feature that won’t always offer up correct answers. If the AI makes a mistake, you can offer up corrections that will be taken into account going forward. Privacy-wise, Google says that it won’t use Google Photos data for ads. Similarly, it adds that humans will never see your Ask Photos queries “except in rare cases to address abuse or harm,” and that the AI is not trained on your personal data.