Why brand representation and interpretability matter in the era of LLMs

Nik Ranger

Nik Ranger shares why brand representation and interpretability matter in the era of LLMs.

@nikrangerseo  

Nik says: “My additional insight is why brand representation and interpretability matter in the era of large language models.

That is what we're wanting to think about, and start thinking beyond search engines. So, start thinking about how your brand is represented inside a large language model – like ChatGPT, like Gemini – and how you can understand the decisions those models make.”

OK, I want to dive into one of those words that you mentioned there. Was it interpretability that you were saying?

“Yes, so model interpretability is about making machine learning models understandable to humans: being able to explain why a model made a certain decision or prediction.

It's like asking the model, ‘Hey, what made you say that?’ It's crucial when you're wanting to build trust in the model's outputs, understand if your model is biased or broken, make key decisions based on the model's recommendations, improve the model, or be able to debug that. This is really the basis for AI Rank.

For those of you reading, hearing, or watching right now, you may have seen Dan Petrovic. He has built AI Rank, a very useful tool that uses model interpretability to look at brand-to-entity/entity-to-brand representation – to see how your brand is being perceived and what the model understands about you.

The whole concept behind this is so that we can essentially reverse engineer this, to be able to do experiments with this, look at the way that words are being sequenced, and look at the likelihood of association with certain keywords. Essentially, we're wanting to optimize this for the LLM bots, and that's really the core thing that I think is really, really fascinating and interesting in the way that we're looking at this in 2025.

It is a very different shift, I think, from the way that we've traditionally looked at SEO, but I see this as important for everybody. This is because LLMs (large language models) are rapidly changing the search landscape. They are powering conversational search. They are powering AI assistants. They influence traditional search results through features like AI Overviews and, as these things become more prevalent, how you are perceived and represented within these models can directly impact your visibility and, hopefully, help the end user make decisions and perceive you in the light that is most favourable to you.

Particularly in really competitive industries – or when you're very well known – making sure that you're being represented correctly by these LLMs is a burgeoning new area that we can really think about, moving into the future of SEO.”

What are the key things that are different about how we optimize for LLMs versus how we optimize for more conventional search engines like Google?

“Well, I think the thing to keep in mind is that how the content appears on the page – and how visible it is – shapes the associations through which you're seen. To answer your question, I'm going to take you through a little of how we do this assessment and, when I talk about the specific things we do to take this information and do something practical with it, I think a lot of people will really see the differences.

With AI Rank (and this is free, by the way – free for anyone to go in and use; we're not paywalling it or anything like that), it looks predominantly at ChatGPT. If you're a client with us, we also include Gemini, but that's a different matter. We analyse how the brand is currently being represented within ChatGPT, for instance, using targeted prompts to understand the entities and brands that are associated.

We start with, ‘How is your brand associated with this particular entity?’, whether it's a product, a service, or things like that, and we sort of flip the script. Then we say, ‘What brands do you see as being related to that?’

Now you will have two columns of different prompt outputs, and then it will tell you things about what it understands: whether you're associated with your industry, the products, and the services. You can identify any inaccuracies, any potential negative associations, or even just gaps in the model's understanding.

That's where we start to get a base look. Then we do bidirectional probing. Bidirectional means prompting the LLM in both directions to gain a comprehensive view of brand association: ‘List brands associated with the industry or the product’, and then list the entities associated with your brand. That reveals how the model perceives your brand within your competitive landscape.

One of the things that Dan talked about is ‘custom cycling jerseys’. This will spit out a load of different brands, and we really want our brand to be part of that association. If it's not, then that's giving us some ideas as to what the LLM has within its pre-training data to say what it is associated with. When we have all this, then we're like, ‘Okay, we may have some content and maybe some semantic understanding of this, and we wanted to maybe do some optimization with this.’
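The bidirectional probing loop described above can be sketched in a few lines. This is a minimal illustration, not AI Rank's actual implementation: `ask_llm` is a hypothetical stand-in for a real chat-completion call, stubbed here with canned answers that mirror the 'custom cycling jerseys' example.

```python
def ask_llm(prompt: str) -> str:
    # Stub standing in for a real model call; canned answers are invented
    # purely to illustrate the two probe directions.
    canned = {
        "List brands associated with: custom cycling jerseys":
            "Owayo, Jakroo, Champion System",
        "List entities associated with the brand: Owayo":
            "custom soccer jerseys, Germany, sportswear",
    }
    return canned.get(prompt, "")

def probe_brand(brand: str, entity: str) -> dict:
    """Run both probe directions and check whether each side recalls the other."""
    entity_to_brand = ask_llm(f"List brands associated with: {entity}")
    brand_to_entity = ask_llm(f"List entities associated with the brand: {brand}")
    return {
        "entity_to_brand": entity_to_brand,
        "brand_to_entity": brand_to_entity,
        "brand_recalled": brand.lower() in entity_to_brand.lower(),
        "entity_recalled": entity.lower() in brand_to_entity.lower(),
    }

result = probe_brand("Owayo", "custom cycling jerseys")
print(result["brand_recalled"])   # True: brand appears in the entity->brand list
print(result["entity_recalled"])  # False: target entity missing brand->entity
```

A mismatch between the two directions – the brand is listed for the entity, but the target entity is missing from the brand's associations – is exactly the kind of gap the probing is meant to surface.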

Looking at the site, the content on the site, and even the structured data that is marking up and doing all these additional associations, we really want to align this with the desired brand representation. That leverages the insights gained from what we can see from this bidirectional probing: brand-to-entity/entity-to-brand, right? We want to have better associations.

A lot of the time, it has predictive measuring. When we're looking at this, we will have a sentence and ask, ‘Okay, based on what this sentence says, what is the most likely word that will follow it?’ When we're testing this, we can see that, if we change any one of these words, it will produce new and different word associations. Within our tool, we test out different sentence structures, and this is very, very granular.

Even changing things like ‘with’, ‘to’, or any one of these other modifiers within this, the model is giving us a better likelihood of association within just that sentence. When we're wanting to look at the way that things are written – and we're wanting to look at the title tag, the heading tag, the description, even in the above-the-fold key bits of text that are found on the page – we're wanting to really refine that content to be better aligned with sentences that have a better statistical likelihood of showing the words that we want the LLM to pick up, essentially, and be able to associate with our brand.
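The 'which word is most likely to follow' idea can be illustrated with a toy bigram model. A real test would use an actual LLM's token log-probabilities; the tiny corpus and counts here are invented purely to show how the frequency of word sequences in training text drives which continuation the model prefers.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for pre-training text (invented for illustration).
corpus = (
    "owayo makes custom soccer jerseys . "
    "owayo makes custom soccer jerseys . "
    "we design custom cycling jerseys ."
).split()

# Bigram counts: for each word, how often each other word follows it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def p_next(prev: str, nxt: str) -> float:
    """Probability that `nxt` follows `prev` under the toy bigram model."""
    total = sum(following[prev].values())
    return following[prev][nxt] / total

# 'soccer' follows 'custom' twice in the corpus, 'cycling' only once,
# so the model prefers the soccer continuation.
print(round(p_next("custom", "soccer"), 2))   # 0.67
print(round(p_next("custom", "cycling"), 2))  # 0.33
```

Shifting the on-page wording so the desired sequence appears more often is, in miniature, the same lever: you are changing the statistics the model learns its associations from.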

Now, this is very granular in the way that I've explained this, but when we're looking at this in its entirety, we can model this and be able to identify very quickly whether there are content gaps, whether there are good associations or bad associations with relevant entities, and have more of a comprehensive look at how the brand is, hopefully, readily available for these large language models to be able to learn from – and then reverse engineer this to be able to trigger that brand as the answer for this.

We're really, really hoping to influence this with on-page content (and maybe even off-page content with the way that we're doing our PR, we're doing our outreach, etc.), and we're really wanting to have some influence to see whether this is changing the way that it's associating this within the context for an LLM.

That's why I'm saying we really want to influence the way that we're writing to have a better association with this. From explaining this long-winded thing, what I'm trying to say here is that I don't think this is anything like what we have experienced in traditional SEO, because now it's looking so particularly at the sequencing of words and every single word, when we are breaking this down.

Think of Google BERT, for instance: Bidirectional Encoder Representations from Transformers. These models look at text in its entirety. So, this is incredibly different.

If you think about it – you think about passage indexing, you think about featured snippets that get pulled out and are being represented – even within that bit of text, we're being so hyper-specific about the way that things are written because we really want to have the sequence of words that have the better statistical likelihood of having better associations and better representations to be able to cue words within its model, for when it's actually going to pull that in.

To answer the question of how it's different, well, that's incredibly different to the way that we've traditionally done things.”

There are so many follow-up questions that I could ask in relation to that. Let's try and stick to two. In a second, I'll probably come to what we do about improving our content because, obviously, you're saying that any words can significantly impact the way that an LLM perceives what that content is about and whether or not it's actually wanting to feature that piece of content in its own results. So, how do we improve our content? Do we simply ask LLMs for suggestions on how to improve our content?

But firstly, in relation to the LLM model – because you focused on ChatGPT and Gemini to begin with – can we use those as proxies for other LLMs, i.e. other LLMs that maybe operate slightly differently? Is it likely that they're going to be perceiving our brands in the same way? Do we just have to optimize for one LLM?

“Well, without doing testing, I cannot honestly say. From the way that I've seen this, we're wanting to just test the ones that are the most popular and in the cultural zeitgeist at the moment – that are being used. Obviously, we'll pay attention closely to Gemini, as that is what they're using to impact Google Search.

I think, to help explain this, I need to step back and explain the difference between ‘grounded’ and ‘ungrounded’ in the context of a large language model: whether the model's responses are based on external factual information or on its internal knowledge.

Just to explain this, ‘ungrounded’ relies solely on the LLM's training data – the data that ChatGPT or Gemini have used to build their own world knowledge. Now, this in itself can be prone to hallucinations: incorrect or nonsensical information that sounds plausible but isn't factually accurate. It's useful for creative tasks and brainstorming, and it's really, really good for emulating or mimicking the way someone might speak or construct sentences. So, it can be very useful in that way.

When talking about ‘grounded’, by contrast, this is what is most interesting, and what we have spent quite a lot of literal time and money on, because we've used the Google API to build a good list of grounding queries. I think this is the cool point of difference in the way that we've worked on this (I say we – I mean Dan. Dan Petrovic is a real genius in all of this): wanting to be able to predict whether a query will be grounded or ungrounded.

Why I say grounded is so fascinating is because ‘grounded’ means it is connected to, and retrieves information from, live search results. So, it will have the pre-trained data, and then it will have the live search results that it pulls in from Google's index, and it will use RAG (Retrieval Augmented Generation) to pull in these external sources and give more factually accurate and up-to-date responses.

This is incredibly useful for acquiring real-world information, answering questions, and summarising news articles, for instance, and even providing product recommendations, where it needs to check whether they're available or whether they have certain key things about them. Grounded results are also less prone to hallucinations, although the accuracy can still depend on the quality of those external resources.
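A crude sketch of predicting whether a query is likely to trigger grounding. The cue list here is invented for illustration; the prediction described above would be built from real grounding data collected via the Google API, not a hand-written keyword set.

```python
# Heuristic sketch: queries with freshness or factual-lookup cues tend to be
# grounded (answered with live retrieval); open-ended creative ones tend to
# stay ungrounded. The cue list is invented for illustration.
GROUNDING_CUES = {"latest", "today", "current", "price", "news", "2025"}

def likely_grounded(query: str) -> bool:
    """Crude prediction of whether a query would trigger live retrieval."""
    return bool(set(query.lower().split()) & GROUNDING_CUES)

print(likely_grounded("latest custom cycling jersey releases"))  # True
print(likely_grounded("write a slogan for a jersey brand"))      # False
```

The practical point is the split itself: grounded answers can be influenced with ordinary SEO on the indexed content, while ungrounded answers can only be influenced by what makes it into the model's training data.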

This is why, when people talk about AI-generated results, good SEO is still applied. It is still applicable, in that sense, for the grounded result – because it will use Google's index as a way to be able to pull this in, amalgamate that with its pre-trained data, and give something a little bit more helpful within that prompt and within that result.

With that in mind, this is very, very different in the way that we're wanting to do this because those differences are, I think, really crucial for us for several reasons.

We can influence the responses by doing tests. So, if we're trying to optimize the content for the grounded LLM results (which are increasingly influencing search results), we need to really focus on ensuring that the content that we're putting out there and having indexed is not just accurate, but also has really good information and is readily available. It's looking at data accessibility and making sure that that is there, optimizing web content, structuring that data, or even building knowledge graphs and really making sure that that is well optimized.

Those are the traditional SEO things that we all know and understand but, with optimizing for the LLM, this is where we can really have that new paradigm shift. We can tweak the on-page content. We can test this. And, with the measuring of this over time, we can see whether or not this is getting picked up and associated in the way that we are wanting to tool and have influence or not.

I think that this is a very different way of approaching and working with this. As we're starting to roll out with clients and run some initial tests, the aim of the game is first to explain this, but also to be very hyper-specific: when we're looking at the key real estate that matters for search interpretability, we tweak and do experiments, reverse engineering the prompt in the way that it will be understood. Does that help?”

Absolutely.

In terms of a takeaway, how do we go about improving our content if we find out that we're not appearing, and the entity isn't as optimized as we would wish in order to appear in various search results?

If, using the tool that you suggested, we determine why that might be, what is the process that you recommend to enhance the content so it's more likely to appear in future results?

“I've got a great example of this that we're quite literally working on with a client of ours right now.

The custom cycling jersey is their primary product that they really wanted to focus on. For us, it was fascinating to see the brand-to-entity/entity-to-brand results associated with their main entity and their brand. It was picking up that OWAYO (which is the client) were more associated with ‘soccer’ and with ‘Germany’, because they are a German company, and I thought this was fascinating.

As a technical SEO, I look at this, look at their site, and I can see in the way that the site has been set up with their HTML, that ‘soccer’ is at the top as one of the first bits of internal linking across their template of pages. That's literally the first block of information that they can go through. Then, secondarily, there was ‘custom cycling’, but we wanted that to be first. That was interesting.

The second thing that I thought was really interesting was that it was associating that they are a German company. Now, when it's going through and it's crawling this (and we can verify this with log file data and the way that we're seeing the bots go through and crawl this), they're picking up a lot of their product category pages – their PLP pages – and they're finding that, on these main pages, the first thing that they see as part of the on-page content is, ‘We're a German company.’

They've seen the menu, the H1, and some links off to the product pages, but the first big piece of content that they're seeing is this grid, and the first thing that it says there is, ‘Hey, we're a German company.’ Lo and behold (and we've run this test multiple times over the course of many, many months), there's still a strong association with the German company. It’s not a bad association – it’s truthful, it’s accurate – but is it the one that we’re wanting to have as the predominant thing when you’re wanting to say who OWAYO is and what they do?

It's saying that OWAYO is a custom soccer jersey company based in Germany and, while that’s true, it's not the primary goal of what the client wants to be known for.

So, let's do some experiments. Let's maybe look at the on-page template. Let's maybe tweak, with testing, whether we can have ‘custom cycling’ further up in the HTML, as more of the predominant feature within this, and let's just test what this looks like. Also, on the PLP pages, maybe we might shift the fact that they're a German company, and the image and little blurb about them, and put that lower down into the fold, experiment with what that looks like, and be able to tool that.

So, to answer your question, the ‘how’ is really minute in the number of experiments and hypotheses that we are testing. But how we measure success is by tracking the frequency with which the brand appears in response to the targeted prompts within the LLM – and we've developed this tool. Again, it's completely free. It's AI Rank. You can use this. Go find Dan's posts, click away, and set it up. I think you get around 10 entity associations with your brand, just off the bat. You can test this over time and see what it looks like.

Do experiments of your own. Test the placement or the sequencing of the words and the way that they are written. Measure the consistency, measure the prominence, and look at the associations that it's pulling and see whether this is impactful and effective.
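Measuring success as mention frequency over repeated runs can be sketched like this. The run logs are invented sample data; in practice each entry would be one model response to the same targeted prompt, collected on a schedule so the rate can be compared before and after a content change.

```python
# Sketch: track how often the brand appears across repeated runs of the same
# targeted prompt. `runs` is invented sample data standing in for responses
# collected over time.
runs = [
    "Top custom jersey brands: Jakroo, Champion System",
    "Top custom jersey brands: Owayo, Jakroo",
    "Top custom jersey brands: Owayo, Champion System, Jakroo",
    "Top custom jersey brands: Owayo, Jakroo",
]

def mention_rate(brand: str, responses: list[str]) -> float:
    """Fraction of responses in which the brand is mentioned."""
    hits = sum(brand.lower() in r.lower() for r in responses)
    return hits / len(responses)

print(mention_rate("Owayo", runs))  # 0.75
```

A rising rate after an on-page change is the signal that the experiment moved the association; a flat rate says the change didn't register.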

I know that Lily Ray is doing experiments and testing this within her own very well-known brand at the moment, and she's getting some really fascinating results. Go check out Lily Ray and her tweets, and what she's doing over there as well.

Again, a lot of this is really, really interesting in the way that we're doing these experiments and wanting to pull them in, and being very, very hyper-specific on what we're wanting to test and wanting to track the output of this, and how this is hopefully having more of an LLM-driven product discovery and recommendation. That's really what we're trying to get out, at the end of the day.”

If an SEO is struggling for time, what should they stop doing right now so they can spend more time doing what you suggest in 2025?

“I think SEOs should really stop manually analysing massive amounts of keyword data that is solely focussed on traditional rank tracking.

I've been saying this for a very long time, and I think it's even more prevalent now in 2025, with the rise of LLMs and semantic search, and it really does require that shift to understand the broader context in which a brand exists and how these AI agents are perceiving and representing that brand.

I think this requires a lot more tooling, and leveraging these insights to make new frameworks, new testing, and new experiments – rather than relying on keyword-centric approaches. I think that's been true for many, many years now.”

Nik Ranger is the Senior Technical SEO Consultant at Dejan Marketing, and you can find her over at Dejan.ai.

