The author’s views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

In this week’s episode, MozCon 2022 speaker Crystal Carter talks you through the different optimizations that you can make for visual search, and the kinds of results that you might see for visual search content.

[Image: Whiteboard outlining the process for visual search optimization]


Video Transcription

Howdy, Moz fans, and welcome to my Whiteboard Friday on visual search. Today, I’m going to talk about the different optimizations that you can make for visual search and the different kinds of results that you might see for visual search content.

Visual search optimization

So what happens with visual search is that you make some optimizations on your website. Then the user does a visual search, and they might get one of several different kinds of results.

Image SEO

So the kinds of optimizations that you should consider for visual search, which covers searches made via Google Lens, Pinterest Lens, or Bing’s image search tools, start with image SEO: making sure that your images perform well, with good file formats, titles, alt text, schema, all of that sort of thing.
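As a rough illustration of the image SEO checks mentioned above, here is a minimal sketch that scans a page's HTML for images with missing alt text or an unexpected file format. The list of preferred formats, the class name, and the sample markup are assumptions made for this example, not an official checklist.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: formats treated as web-friendly for this sketch; adjust as needed.
PREFERRED_EXTENSIONS = {".webp", ".avif", ".jpg", ".jpeg", ".png", ".svg"}


class ImageAudit(HTMLParser):
    """Collects basic image SEO issues from <img> tags."""

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        alt = (attributes.get("alt") or "").strip()
        if not alt:
            self.issues.append((src, "missing or empty alt text"))
        # Take the extension from the URL path so query strings are ignored.
        extension = PurePosixPath(urlparse(src).path).suffix.lower()
        if extension not in PREFERRED_EXTENSIONS:
            self.issues.append((src, f"unexpected file format: {extension or 'none'}"))


def audit_images(html):
    """Return a list of (src, problem) pairs for the given HTML string."""
    parser = ImageAudit()
    parser.feed(html)
    return parser.issues


# Hypothetical markup: one well-described image and one with problems.
sample = """
<img src="/img/chocolate-donut.webp" alt="Chocolate donut with sprinkles">
<img src="/img/IMG_0042.bmp">
"""
for src, problem in audit_images(sample):
    print(f"{src}: {problem}")
```

An empty alt attribute can be legitimate for purely decorative images, so a flag from this sketch is a prompt to review the image rather than a definite error.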


Also, you’re going to think about the kinds of entities that are within your photos. Visual search recognition software and tools can understand lots of different kinds of entities. There are a few that they prioritize in particular, though: logos, landmarks, text, and entities that are essentially things found within the knowledge graph, which I’ve called “things” in this particular instance just as a shorthand.
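One way to see which of these entities a machine can pick out of your own photos is to run them through an image recognition service. The sketch below builds the JSON request body for the public Cloud Vision API `images:annotate` method, asking for logo, landmark, text, and label detection. It only constructs the request and makes no network call. Whether Google Lens uses the same models is not something this example assumes, and the mapping of "things" to label detection is a simplification made here.

```python
import base64
import json

# Cloud Vision feature types, mapped to the four entity kinds described above.
# Assumption: "things" is approximated by label detection for this sketch.
ENTITY_FEATURES = {
    "logos": "LOGO_DETECTION",
    "landmarks": "LANDMARK_DETECTION",
    "text": "TEXT_DETECTION",
    "things": "LABEL_DETECTION",
}


def build_annotate_request(image_bytes, max_results=5):
    """Build the JSON body for an images:annotate call (no network access here)."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in ENTITY_FEATURES.values()
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real image file read with open(path, "rb").
body = build_annotate_request(b"placeholder image bytes")
print(json.dumps(body, indent=2))
```

Sending this body to the API requires your own credentials and project setup, which are left out here.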


And then, the other one you want to think about is your composition. So the composition that you have for your image will affect what Google understands the image to be about. 

So, for instance, the way that different elements are positioned within an image can affect how Google understands the image. I did an article for Moz at the beginning of the year where I compared two orientations of a teapot. With the handle here and the spout here, Vision AI understood that to be a teapot.

[Image: How composition impacts Vision AI interpretation of images.]

And then, when the teapot was turned this way, Vision AI understood it to be a kettle, and those are two different things. So the way that you think about composition for your image can affect how it is interpreted.

So make sure that you have clean and clear images and also that you’re thinking about your images being similar to user-generated content, particularly if you’re in a B2C business, and also that you understand the primary focus. So, for instance, if you had a photo of a bicycle and you were trying to emphasize the bicycle part of the image, if you had somebody who was sitting on the bicycle or standing next to the bicycle and they were taking up most of the image, Google would think that that picture was more about that person than it was about the bicycle. So think about where the primary focus is in your image in order to optimize for visual search.
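To make the bicycle example concrete, here is a minimal sketch that compares how much of the frame each detected subject occupies. The bounding boxes are hypothetical values in normalized 0 to 1 coordinates, of the kind an object detection tool might return. Using area share as a proxy for "primary focus" is an assumption for illustration, not a description of how Google weighs an image.

```python
def box_area(box):
    """Area of a (left, top, right, bottom) box in normalized 0-1 coordinates."""
    left, top, right, bottom = box
    return max(0.0, right - left) * max(0.0, bottom - top)


def area_shares(detections):
    """Return (label, share of frame) pairs, largest share first."""
    shares = {label: box_area(box) for label, box in detections.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical detections for a photo of a person standing beside a bicycle.
detections = {
    "person": (0.10, 0.05, 0.70, 0.95),
    "bicycle": (0.60, 0.50, 0.95, 0.95),
}
ranked = area_shares(detections)
dominant_label, dominant_share = ranked[0]
print(f"Largest subject: {dominant_label} ({dominant_share:.0%} of the frame)")
```

If the largest subject is not the one the page is about, that is a hint to reframe or crop the photo so the intended subject takes up more of the image.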

You also want to think about contrast: make sure that whatever is the focus of your image is very clear and easy to decipher, and that the image is not too busy if you need it to be about a single thing.
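Contrast can be approximated in many ways. The sketch below uses one crude proxy: the difference in average brightness between the subject's bounding box and the rest of the image, on a 0 to 1 scale. The function name, the tiny grayscale grid, and the box are all hypothetical, and a real check would read pixel data from an actual image file.

```python
from statistics import mean


def subject_background_contrast(pixels, box):
    """Absolute difference in mean brightness (0-1) between a subject box and the rest.

    pixels: rows of grayscale values in 0-255.
    box: (left, top, right, bottom) in pixel indices, right/bottom exclusive.
    The box must leave at least one pixel outside it and contain at least one.
    """
    left, top, right, bottom = box
    inside, outside = [], []
    for y, row in enumerate(pixels):
        for x, value in enumerate(row):
            in_box = left <= x < right and top <= y < bottom
            (inside if in_box else outside).append(value)
    return abs(mean(inside) - mean(outside)) / 255


# Hypothetical 4x4 image: a bright subject in the middle of a dark background.
pixels = [
    [20, 20, 20, 20],
    [20, 230, 230, 20],
    [20, 230, 230, 20],
    [20, 20, 20, 20],
]
contrast = subject_background_contrast(pixels, (1, 1, 3, 3))
print(f"Subject/background contrast: {contrast:.2f}")
```

A value near 0 suggests the subject blends into its surroundings, while a value closer to 1 suggests it stands out clearly. Brightness alone ignores color and texture, so treat the number as a rough signal only.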

So these different elements are things that you should consider when you’re optimizing your images for visual search, particularly for Google Lens, as users carry out visual searches.

Visual search results

So, for instance, if you use Google Lens and you take a picture of a butterfly or a caterpillar or a flower or a chocolate donut, you’re going to get lots of different types of results. 

Image pack

So, first of all, you may very well get an image pack result, and this will include some of the information that we were talking about before. 

So the difference between visual search and image search SEO is that in an image search, like when you go to the Images tab within Google, you can enter the phrase “chocolate donut.” But let’s say you didn’t know what a chocolate donut was, or let’s say you were somewhere with a different language and you didn’t know the local word for chocolate donut. So what would happen is that the user would make a visual search of the chocolate donut, and Google would use its tools, like Vision AI, for instance, to understand that that’s a chocolate donut, and then it would look through its images to understand which ones had text cues that were talking about chocolate donuts and that sort of thing. So that would return, potentially, some image pack information, and also, in the chocolate donut example in particular, it might return something like multisearch.

So with multisearch, you would add a modification to the search. You might say a donut like this, but with sprinkles, for instance. You might also get a result that’s around Google Shopping.

SERP features

The other one you want to think about is the kinds of results you might get as different SERP features. So Local Pack is something that might come up. Also, knowledge panel. Particularly with the entities, the entities may very well be attached to a specific knowledge panel. So, for instance, logos for businesses or landmarks will have a knowledge panel, and also certain things, like if you were to think about something like Lego, that may very well have a knowledge panel as well. And landmarks, again, could very well be showing in Google Maps.

So think about the kinds of SERP features that you might show there. And that means that, as part of your optimization for visual search, you might also think about the optimizations that you make for these types of SERP results.

Visual match

Finally, the other kind of result that Google might give to someone when they make a visual search is a visual match. So visual matches are images that look really similar to the picture that the person took, and these will sometimes return image packs and sometimes return a Local Pack, and they’ll sometimes just return a general SERP result, including, for example, a featured snippet that might have an image in it. You might also see something for a Google Business Profile. So if there is something local that has a profile, then they may very well get a Google Business Profile visual match, and also just general web content that might come through there.

So there are lots of different opportunities to return a visual match, but this one is particularly good when you’re thinking about the composition of your images. So if you have a lot of footfall, if you have a lot of interaction with customers where they are reviewing your content, where they are visiting your establishment, and they’re creating a lot of user-generated content, then think about how you can create images and add images to your website that satisfy the visual match queries that users might be making.

And I think there are some great opportunities across visual search in the next few years. Google has been investing in this quite a lot, and I think that this is an opportunity for businesses of all sizes, and I hope to see more people getting involved with visual search optimizations.
