Natural language processing and SEO – a practical demonstration
August 19, 2021

Natural language processing (NLP) can be a daunting subject, especially as you get into the weeds of it. That said, with just a basic understanding of what it is and how it works, it can be very useful – particularly for SEO.

In this article we aim to provide a practical demonstration of basic NLP, and some of its uses in the SEO industry – without getting bogged down by complex code.

What is NLP?

Natural language processing (NLP) is a branch of computer science which aims to use machines to understand human language – whether that be written text or spoken word.

Most of us will likely have experienced the benefits of NLP without even realising it. Think Google Translate, talking to Siri/Alexa, email spam detection, predictive text, chat bots, search engines, transcripts – the list goes on…

All these applications require a machine to read some text, take some ‘meaning’ from it and deliver an output – whether that be pushing an email into a spam folder, translating text, or another action.

The study of natural language processing has been around for more than 50 years, but has come to prominence in recent years thanks to the rapid growth in computer processing power, which has made it more accessible. It is a rich, complex and developing subject, with many challenges still to be solved.

A quick example

Type in the box below and see what we can ‘understand’.

Why is NLP important for SEO?

Google’s mission statement says: “Our company mission is to organize the world’s information and make it universally accessible and useful”. To do this, Google first needs to find the information it wants to organise (via web crawlers), and then make sense of what it finds – which is where NLP comes in.

Google uses NLP to process the contents of web pages – whether that be written content, images or videos – with the ambition to gain some understanding of the meaning behind the content.

So, why is this important for SEO?

One part of this is being able to emulate how Google processes content, so we can make improvements at scale. An obvious example is detecting poor grammar, which Google would deem low-quality content and therefore rank poorly. Another example is finding thin or competing content on a website, since thin or duplicated content is also deemed low quality. With this understanding we can crawl an entire website and quickly identify the content that needs improving.

The other side of the relationship between NLP and SEO is helping to automate challenging, manual or time-consuming jobs. This could be categorising keywords into groups based upon their meaning when doing keyword research, or generating boilerplate website content that needs to be slightly different depending upon a specific variable (such as the product, a location, an amenity or an attribute).

4 Practical NLP applications for SEO

Fuzzy matching

As SEOs we often need to match text content together and see how similar it is. We do this to help with:

  • Finding duplicate content / thin content
  • Redirect mapping – mapping an old URL to a new URL

Try typing two similar sentences into the boxes below to generate a score – the higher the score the more similar the content.


Why is this useful?

Thin content = bad content = poor rankings. Being able to identify thin content across large websites is very important.
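As a minimal sketch of the kind of score described above, the Python snippet below uses the standard library's difflib.SequenceMatcher to rate two strings from 0 to 100, then reuses that score to suggest a redirect target for each old URL. It is not the code behind the demo on this page. The function names, the example URLs and the 60-point cut-off are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score; higher means more similar."""
    # Lower-case and collapse whitespace so trivial differences are ignored.
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return round(SequenceMatcher(None, a_norm, b_norm).ratio() * 100, 1)


def map_redirects(old_urls, new_urls, threshold=60.0):
    """Pair each old URL with its most similar new URL (hypothetical helper).

    URLs whose best score falls below `threshold` (an assumed cut-off)
    map to None so that a person can review them by hand.
    """
    mapping = {}
    for old in old_urls:
        best = max(new_urls, key=lambda new: similarity(old, new), default=None)
        if best is not None and similarity(old, best) >= threshold:
            mapping[old] = best
        else:
            mapping[old] = None
    return mapping


if __name__ == "__main__":
    print(similarity("Black running trainers for men",
                     "Men's black running trainers"))
    print(map_redirects(
        ["/about-us", "/old-landing-page"],
        ["/about", "/contact", "/trainers/running-trainers"],
    ))
```

Character-level matching like this is cheap and works reasonably well for URLs and short titles. For whole pages of copy, comparing sets of words or phrases is usually a better fit, and any threshold should be tuned against pages you have already reviewed by hand.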

Sentiment analysis

As SEOs we often need to determine the sentiment of text. We do this when we are looking at:

  • Online brand commentary
  • Product reviews

Try typing something positive into the box and then look at the emoji below. Now swap your comment for something negative and see what happens…

Why is this useful?

Negative brand or product reviews are likely to affect your search rankings, so spotting these and learning from them is key. If you don’t, and negative commentary about your brand continues, the likely end result is a decrease in rankings. This would be in part due to Google flagging the negative content, but also due to an assumed decrease in brand search.
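One simple way to approximate sentiment is to count positive and negative words. The Python sketch below does exactly that, with a basic rule for negation such as "not good". The word lists are tiny, hand-written assumptions for illustration only. Real sentiment tools rely on far larger lexicons or trained models, and this is not the method used by the demo on this page.

```python
import re

# Tiny illustrative word lists (assumptions); real lexicons hold thousands of terms.
POSITIVE = {"good", "great", "love", "excellent", "happy", "fast", "recommend"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow", "broken", "awful"}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text: str) -> int:
    """Count positive words minus negative words.

    A word's polarity is flipped when the previous token is a negator,
    so "not good" counts as negative.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def sentiment_label(text: str) -> str:
    """Turn the numeric score into a positive / negative / neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ["Great trainers, I love them",
                   "Terrible service and slow delivery",
                   "Not good at all"]:
        print(review, "->", sentiment_label(review))
```

Run across a batch of reviews or brand mentions, a score like this gives a quick first pass at which comments deserve a closer look.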

Keyword research

As SEOs we often do keyword research to find what users are searching Google for. This means we can produce the right content to target those searches. With billions of searches made a day, you can spend a lot of time sifting through variations of keywords, trying to determine what the user is looking for. By using NLP within keyword research you can automate:

  • Keyword categorisation
  • Keyword clustering

Try putting some keywords into the box below and then press categorise to see how we can break them down and start to understand them in categories.

Why is this useful?

Categorising keywords means we know what people are searching for in relation to a broad keyword theme. Take our example: we can see which specific adjectives and verbs people commonly search for alongside the theme of trainers.

We now know that some of these more specific keywords could warrant a landing page being created to target them. For example:

  • https://www.example.com/trainers?colour=black
  • https://www.example.com/trainers/running-trainers
  • https://www.example.com/trainers/brand/nike
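The grouping step can be sketched in Python as follows. The example on this page relies on part-of-speech tagging to find adjectives and verbs. This sketch swaps that for a small hand-written lookup of modifiers, which is an assumption made to keep the code dependency-free. The category names, the lookup contents and the function name are all hypothetical.

```python
from collections import defaultdict

# Hypothetical modifier lookup; a real pipeline would use a part-of-speech
# tagger or a much larger dictionary instead of this hand-written one.
MODIFIER_CATEGORIES = {
    "black": "colour", "white": "colour", "red": "colour",
    "running": "activity", "walking": "activity", "gym": "activity",
    "nike": "brand", "adidas": "brand", "puma": "brand",
}


def categorise_keywords(keywords, theme="trainers"):
    """Group keywords by the category of each modifier found next to `theme`."""
    groups = defaultdict(list)
    for keyword in keywords:
        words = keyword.lower().split()
        modifiers = [w for w in words if w != theme]
        categories = {MODIFIER_CATEGORIES[w] for w in modifiers
                      if w in MODIFIER_CATEGORIES}
        # Keywords with no recognised modifier are kept for manual review.
        for category in categories or {"uncategorised"}:
            groups[category].append(keyword)
    return dict(groups)


if __name__ == "__main__":
    sample = ["black trainers", "nike running trainers",
              "red adidas trainers", "cheap trainers"]
    for category, members in categorise_keywords(sample).items():
        print(category, "->", members)
```

A keyword can land in more than one group (for example, both brand and activity), which mirrors how a single search can suggest more than one possible landing page.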

Content creation

As SEOs we often need to produce content to target keywords used in searches. As NLP continues to develop, machines are better able to write this content for us. GPT-3 is the most advanced example of this to date, and is nearly comparable with human-written content. Some examples where computer-generated copy can be used are:

  • Boilerplate copy
  • Articles

Check out our example below, but note that the content generated through the text box below uses a relatively simple technique called a ‘Markov chain’, which dates from the early 20th century. You’ll notice it won’t fulfil all your content needs, but it does illustrate how machines can generate content.
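The idea behind Markov chain text generation can be sketched in a few lines of Python: record which words follow each word (or run of words) in some source text, then build new text by repeatedly picking one of the recorded followers at random. This is an illustrative sketch, not the code behind the demo on this page. The function names, the sample text and the default settings are assumptions.

```python
import random
from collections import defaultdict


def build_chain(text: str, order: int = 1):
    """Map each run of `order` words to the list of words seen following it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length: int = 20, seed=None) -> str:
    """Walk the chain, choosing each next word at random from the observed followers."""
    if not chain:
        return ""
    rng = random.Random(seed)  # pass a seed for repeatable output
    state = rng.choice(list(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end: this state only appeared at the very end of the text.
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output[:length])


if __name__ == "__main__":
    sample = ("our running trainers are light and our walking trainers are "
              "comfortable and our gym trainers are durable")
    print(generate(build_chain(sample), length=12))
```

Raising `order` makes the output read more like the source text, at the cost of variety. With a small source text the generator mostly reproduces its input, which is why this approach illustrates the principle more than it meets real content needs.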


Why is this useful?

All websites need high-quality, unique content to rank well on Google. Being able to create this content quickly and at scale is a game changer. This is the way the market is trending, with many software companies now providing advanced computer-written content for use.

Can we help you?

At Melt we are experts in using NLP to assist with SEO keyword research and automation – drop us a line if we can help you.

*To produce these examples we have used two JavaScript libraries, namely:

  • Compromise JS – Natural Language Processing
  • Compendium JS – Natural Language Processing, Sentiment analysis



