Build your own LLM tools to free you from boring work
Gus says: “Use LLMs to build little SEO tools that help you with analysis and boring work so you can spend your time looking at outcomes.”
What simple SEO tools are you talking about building?
“I have two examples. The first is that I needed to find a way to map related pages at scale. Normally, you can go page by page, open an article, look at which other articles you have on that topic, and start mapping them together. However, the larger the website, the more pages you have. Suddenly you have a very manual task that you have to run through.
Just playing with ChatGPT (and I assume similar tools will give you similar things), I asked for a Python script to do that calculation and find similar pages for me – and I got a tool that works.
As a second example, I wanted to find the months in which a page had the most and the least traffic. Instead of manually going through Google Analytics to see what happened on each page in each month, I just uploaded all the data into a sheet, and the tool gave me an export listing each page alongside its best and worst months.
Then, you can go into more detail about why it happened, and what happened in that month. You can compare keyword positions or look into search volume to see if there’s a decrease. However, some of that work of identifying when something happened has already been done for you. You’re ahead of the game and focusing on the outcomes and why this happened, not just collecting data to start the analysis.”
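As an illustration, here is a minimal sketch of the kind of script an LLM can produce for that traffic task. It assumes a Google Analytics export with one row per page per month; the file name and column names are hypothetical, not from Gus's actual setup.

```python
import pandas as pd

# Hypothetical Google Analytics export: one row per page per month,
# with columns "page", "month", and "sessions".
df = pd.read_csv("ga_monthly_traffic.csv")

# For each page, grab the rows for its highest- and lowest-traffic months.
best = df.loc[df.groupby("page")["sessions"].idxmax(), ["page", "month", "sessions"]]
worst = df.loc[df.groupby("page")["sessions"].idxmin(), ["page", "month", "sessions"]]

# One row per page in the output: the page, its best month, and its worst month.
summary = best.merge(worst, on="page", suffixes=("_best", "_worst"))
summary.to_csv("best_and_worst_months.csv", index=False)
```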
Why should SEOs build these sorts of tools by themselves?
“There are a lot of tools out there, but you might need things that are more specific. It also makes your life a little bit easier.
“To help explain how it can work, I’ll walk you through the whole process I used to find those related articles. I started in Screaming Frog, which has a connection with ChatGPT. You need to get an API key and buy credits, but a few pounds’ worth will be more than enough for thousands of pages.
First, in Screaming Frog, I did a crawl that also extracts embeddings (which simply transform the words on a page into numbers). It will give you an export that shows each page alongside its embeddings. The export will be full of very long strings of numbers, which you can transform later.
Then, I went to ChatGPT and asked for a Python script that would look at embeddings and calculate the closest similarity between them, then give me 5 pages that are related to each other.
With the vector embeddings, you’re going to get those numbers, and a cosine similarity calculation will measure how close these numbers are to each other. The output you get from the tool shows which pages are most closely related to each other based on all the words that exist on the page. I’m not just looking at titles or descriptions; I’m looking at the universe of words that exist on page A and the universe of words that exist on pages B, C, and D.
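For context, the cosine similarity calculation is standard: the dot product of two embedding vectors divided by the product of their lengths. A minimal sketch of that formula in Python (illustrative, not code from Gus's actual script):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Dot product divided by the product of the vectors' lengths.
    Values near 1.0 mean the two pages use very similar language."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```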
In this case, I asked it to give me the 5 pages that are most closely related to the initial page, excluding the page itself from this calculation. I’m not a coder, so I asked ChatGPT to make a Python script that would consider cosine similarity and work on Google Colab, which is a platform that allows you to write and execute Python code.
I went back to ChatGPT and said, ‘Give me a Python script that works on Google Colab and I will give you a list of pages. Column 1 is called ‘Page’, and Column 2 is called ‘Embeddings’. Give me the 5 pages that are closest to my initial page.’ Once I uploaded this sheet onto Google Colab, I didn’t need to do much configuration. You don’t need to understand how to code. You’re literally copying and pasting from one place to another, and it will open up a section where you can upload your sheet.
Once you upload the sheet, the calculation runs and you can ultimately download an output sheet that shows each source page and the 5 pages most closely related to it, based on the list that I uploaded.
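Put together, a script like the one Gus describes might look something like the sketch below. This is a hedged reconstruction, not his actual code: it assumes a CSV with a ‘Page’ column and an ‘Embeddings’ column in which each embedding is stored as a comma-separated string of numbers, as in a Screaming Frog export.

```python
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Assumed input: a sheet with a "Page" column and an "Embeddings" column
# holding comma-separated numbers. In Colab, upload the file first.
df = pd.read_csv("embeddings.csv")

# Transform each embedding string into a numeric vector.
vectors = np.array([[float(x) for x in e.split(",")] for e in df["Embeddings"]])

# Pairwise cosine similarity between every page and every other page.
sim = cosine_similarity(vectors)
np.fill_diagonal(sim, -1.0)  # exclude each page from its own results

rows = []
for i, page in enumerate(df["Page"]):
    top5 = np.argsort(sim[i])[::-1][:5]  # positions of the 5 closest pages
    rows.append([page] + df["Page"].iloc[top5].tolist())

columns = ["Source"] + [f"Related {n}" for n in range(1, 6)]
pd.DataFrame(rows, columns=columns).to_csv("related_pages.csv", index=False)
```

The output matches what he describes: one source page per row, followed by the five pages whose content sits closest to it.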
There are a lot of concepts here and I’m still learning a lot of them, but I know it works. I’ve done that to find pages that are related to each other, and I’ve done that to find the peak time in terms of traffic for a page.
Say you’re looking at 5,000 pages that all lost traffic, and you want to know when they started to lose traffic – when they were at their best versus now. You can give ChatGPT the keyword that you’re looking at. It’s automating part of that work.
I’m still going to keyword tools to compare the best time to now, to understand what happened. However, I don’t need to go to Google Analytics or another tool to find when that best time was or which month that page performed best. When you get into hundreds of pages, you save a lot of time.
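As a sketch of that peak-versus-now comparison, assuming the same kind of hypothetical page/month/sessions export as above (not Gus's actual data):

```python
import pandas as pd

df = pd.read_csv("ga_monthly_traffic.csv")  # hypothetical: page, month, sessions
df["month"] = pd.to_datetime(df["month"])

# When was each page at its best, and how much traffic did it get then?
peak = df.loc[df.groupby("page")["sessions"].idxmax(), ["page", "month", "sessions"]]
peak.columns = ["page", "peak_month", "peak_sessions"]

# The most recent month per page, for the "versus now" side.
latest = df.sort_values("month").groupby("page").tail(1)[["page", "sessions"]]
latest.columns = ["page", "current_sessions"]

# Biggest drops from peak first, ready for keyword and SERP analysis.
report = peak.merge(latest, on="page")
report["drop_pct"] = 100 * (1 - report["current_sessions"] / report["peak_sessions"])
report.sort_values("drop_pct", ascending=False).to_csv("peak_vs_now.csv", index=False)
```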
The main idea is that you can use LLMs to write these scripts to do the boring tasks for you.”
What were you trying to discover when you were looking for related pages?
“In my case, I just wanted to add internal links. This was a way to easily find related pages at scale. You can automate the process of adding the links, and talk with your engineers to do a bulk update.
I still look at the output to see whether or not the pages are actually related. Does it make sense in a user journey? When they finish this article, would that be the next step? It’s not perfect. You’re not going to blindly go ahead with whatever the tool gives you, but it’s much easier to spot-check. Do these titles make sense? Are these part of a normal journey that people would follow when they’re doing this?
If you’re in e-commerce, users might not want to just see 20 different headphones; they might want to see something different. There are ways to play with the code and prompt the tools to adapt to the things you need.”
Why is a Python script important?
“It’s just what you can use in Google Colab. You could probably use different types of scripts for different types of tools. This was just the process that I found. It’s very good for this kind of task.
I know how to read HTML, CSS, JavaScript, etc. However, if you tell me that something is broken and it’s not working, I have very little chance of fixing that. I managed to use LLMs to write code for me and the code works, even though I don’t know how to code myself.”
Are you just using ChatGPT?
“I mostly use ChatGPT. I’ve done some troubleshooting with Gemini, which is connected to Google Colab. If you put a piece of code on Google Colab and it says that something’s broken so it can’t run the code, you can ask Gemini to help you fix it.
It took around 2 or 3 prompts and, with some back and forth, it eventually gave me the code that would do exactly what I needed.”
Should every SEO be doing this?
“Potentially, yes. It doesn’t need to be technical. One of my use cases was also looking at the peak time of a page – when it was trending the highest. Then I moved to keyword and SERP analysis. It wasn’t really a technical activity, but it saved me a lot of time.
I needed to look into thousands of pages, and using the tool meant that I didn’t need to go into every single screen on Google Analytics to find out when it happened. It had already done some of that work.
I’m sure there are plenty of other ways to automate things like this as well. Every SEO who is doing repetitive or boring tasks should give it a try and see if they can automate at least part of that process.”
How did you decide what to automate first?
“Whatever challenge is coming to you right now. When I started looking at peak times, it was a manual process. After 5 or 6 pages of copying and pasting and making notes on a spreadsheet, half an hour had passed. I looked at the huge list I had to complete and realised that I was going to spend days doing this.
Can you do it in a better way? You want to get to the good part. You want to find out why those pages are not performing as they performed before, or why they’re performing so well versus a few months ago. That’s the really interesting part.
You’re not going to go back to your client or your manager and say, ‘I opened Google Analytics 500 times and loaded all these pages.’ They want to know what happened and why this is trending up or down. It frees up time so you can talk about the outcomes and not just copy things onto a sheet.”
If an SEO is struggling for time, what should they stop doing right now so they can spend more time doing what you suggest in 2025?
“Stop doing manual work if you can automate it. This universe of LLMs is here to stay, whether you’re happy with it or not. You might as well embrace the things it can do for you.
With that mapping task, it would have taken months for an editor to go through each page one by one. Instead of doing that manually, it took me two days to go over 20,000 pages. Someone is still checking them, but it’s much easier to start with a draft that you can work from. If something looks very off, then you go through it manually, check it, and replace things. In my experience, though, they were mostly in the right direction.
Automate the things that you can. If you don’t already have a tool to do what you need, give it a try and save yourself a bit of time.”
Gus Pelogia is Senior SEO Product Manager at Indeed, and you can find him over at GusPelogia.com.