Scott Jackson
March 16, 2021

Tutorial: Simple keyword categorisation script for the travel industry

Keyword research can be a laborious task, particularly when you have to categorise keywords manually using Excel or Google spreadsheets.

Over the years, automating keyword categorisation has become increasingly easy, with NLTK and other language processing libraries doing most of the heavy lifting.

In this post, I’ll give you a brief demo of a free keyword categorisation script that I’ve built using Google Colab. The categorisation within the script is fairly simple: it uses a predefined dictionary (pre-inputted words and phrases) to categorise keywords (English keywords only).

Using a predefined dictionary enables the script to ‘look out’ for keywords that fit under “parent” words or phrases. For instance, our dictionary may classify ‘cheap’ and ‘low cost’ into a category of ‘budget’. A predefined dictionary is a good starting point for keyword categorisation, but it will only get you so far: it can’t categorise obscure keywords, as these are unlikely to be listed within our dictionary. That said, it is a place to build from and will hopefully serve basic needs.
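To make the idea concrete, here is a minimal sketch of dictionary-based categorisation. It is not the code from our Colab script: the dictionary contents (beyond ‘cheap’ and ‘low cost’ mapping to ‘budget’), the `categorise` function name and the whole-word matching rule are all assumptions made for illustration.

```python
# Hypothetical dictionary: each category maps to the words or phrases that trigger it.
# Only the "budget" entry comes from the example above; "luxury" is made up.
CATEGORY_DICTIONARY = {
    "budget": ["cheap", "low cost"],
    "luxury": ["luxury", "5 star"],
}


def categorise(keyword, dictionary=CATEGORY_DICTIONARY):
    """Return every category whose trigger phrase appears in the keyword."""
    # Lowercase, collapse whitespace and pad with spaces so that phrases only
    # match whole words ("cheap" will not match inside "cheapest").
    padded = " " + " ".join(keyword.lower().split()) + " "
    return [
        category
        for category, phrases in dictionary.items()
        if any(f" {phrase} " in padded for phrase in phrases)
    ]


print(categorise("cheap flights to Rome"))
print(categorise("flights to Rome"))
```

The second call returns an empty list, which is the limitation described above: a keyword with no matching dictionary entry simply goes uncategorised.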

The script is built to be accessible to all users and shouldn’t require any coding knowledge. I should also say that I am still learning to code, and have only scratched the surface of what can be done. With that said, if you have any suggestions or ideas to make the script better, it would be great to hear from you.

Finally, this is a demo of how to categorise keywords using our script and does not go into detail about how you might build a categorisation script yourself. If this post is popular, we may release a tutorial on how to build your own script. We’ll also be releasing a follow-up post which will allow you to add your own dictionary.

What you’ll need for this tutorial:

  • A travel-focused keyword dataset saved in a .csv file, in a similar format to the one below. You can have as many or as few columns as you want; you just need a column with keywords.
  • A Google account (so you can access Google Colabs)

For those who haven’t used Google Colab before, it allows you to run Python programs through your browser. Google Colab is built on Jupyter Notebook and works in much the same way. More information can be found on Google’s site here.

Getting started is straightforward and requires two small steps to make sure that your data can be processed by our script. 

Step one is to put the keywords you want to process into an Excel file. Name the column that contains your keywords ‘keyword’. Make sure the column name is in lowercase and exactly as written in the previous sentence and in the screenshot below (any spaces, upper case or unusual characters will mean that our script will not recognise your column correctly). The rest of the columns within your dataset are fine to keep as they are.

The second step is to save your keywords as a CSV file. In Excel, hit ‘Save as’; underneath the file name, there’ll be a drop-down labelled ‘Save as type’. Select ‘CSV (comma delimited)’ and then hit save.
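If you’d like to double-check your file before uploading it, the two steps above can be verified with a few lines of Python. This is only a sketch using the standard `csv` module, not part of our script; the `check_header` function and its messages are made up for illustration.

```python
import csv
import io


def check_header(csv_text):
    """Report whether the first row contains a column named exactly 'keyword'."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    if "keyword" in header:
        return "ok"
    # Look for near misses such as 'Keyword' or 'keyword ' (wrong case or stray spaces).
    near = [name for name in header if name.strip().lower() == "keyword"]
    if near:
        return f"rename {near[0]!r} to 'keyword'"
    return "no keyword column found"


print(check_header("keyword,volume\ncheap flights,100\n"))
print(check_header("Keyword ,volume\ncheap flights,100\n"))
```

To run it against a real file, read the file’s contents first, for example `check_header(open("keywords.csv").read())`, where `keywords.csv` is a placeholder for your own file name.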

Once you’ve got your CSV file ready, head over to our Google Colab script and create a copy in your drive. Once it’s saved, it should open a new copy automatically called ‘Copy of Simple keyword categorisation for travel’.

Note: If you don’t create your own copy, you’ll receive a warning message saying that the script is not authorised by Google. Don’t worry, we’re not accessing any of your files. However, it’s best to create your own copy so it’s stored within your personal drive.

To get started, go to the top of the Google Colab script and open the ‘Runtime’ menu. Next, hit ‘Run all’, which will run every section of code (those familiar with Google Colab or Jupyter Notebook can execute each section of code independently if they prefer).

Once the script starts, you’ll be prompted to upload your keyword file. Simply open the file that you’ve prepared, and the script should do the rest! Please note that the script may take a while to execute if you’ve got a large keyword set.

Once the script is complete, you’ll see a preview of the output along with some numbers which indicate that the particular section of code you’re looking at has been executed.

Another way to tell if the script is complete is to look at your tab in Google Chrome. If the Google Colab symbol is gold, it means that the script has executed successfully.

If it’s red, it means something has broken and you’ll need to review your keyword file to make sure there are no unusual characters (for example, accents (diacritics), ampersands or hyphens).
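If your keyword list is large, checking for these characters by hand can be slow. Below is a rough sketch of how you might clean keywords before saving the CSV. It is not part of our script, and the `clean_keyword` function and its replacement rules (ampersand becomes ‘and’, hyphen becomes a space) are assumptions you may want to change.

```python
import unicodedata


def clean_keyword(keyword):
    """Replace the characters listed above as likely to cause errors."""
    # Decompose accented letters (é becomes e plus a combining accent),
    # then drop the combining accents.
    decomposed = unicodedata.normalize("NFKD", keyword)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Assumed convention: ampersands become "and", hyphens become spaces.
    replaced = without_accents.replace("&", " and ").replace("-", " ")
    # Collapse any repeated whitespace left behind.
    return " ".join(replaced.split())


print(clean_keyword("café in málaga"))
print(clean_keyword("bed & breakfast"))
```

Letters that have no accent-free decomposition (such as ‘ø’) are left unchanged by this approach, so it’s still worth a quick scan of the results.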

Finally, if the Google Colab symbol is silver, it means the script is still running. If you have a large number of keywords this might take a bit of time to process. 

Once the script is complete, head over to the panel on the left and click on the folder button. Here you’ll be able to download the CSV file that we’ve just produced (called ‘categorised_keywords.csv’).

The output file should have seven new columns appended (user, topic, modifier, etc.) and might look something like the screenshot below (you’ll likely have more columns, depending on the data your CSV had before):

This data can now be used in a number of ways; for example, you might want to group common themes or specific destinations together using a pivot table, which could help create a content calendar for your website. Or you may want to map keywords to existing landing pages for further optimisation opportunities.
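If you’d rather group the output in Python than in a pivot table, something along these lines would work. It is a minimal sketch using only the standard library: the ‘keyword’ and ‘topic’ column names come from the output described above, while the `group_by_column` function, the sample rows and the ‘uncategorised’ label are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def group_by_column(csv_text, column="topic"):
    """Group keywords by the value in one of the appended category columns."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Rows with an empty or missing category are filed under "uncategorised".
        groups[row.get(column) or "uncategorised"].append(row["keyword"])
    return dict(groups)


# Made-up sample in the shape of the output file (only two columns shown).
sample = (
    "keyword,topic\n"
    "cheap flights to rome,flights\n"
    "rome hotels,hotels\n"
    "flights to paris,flights\n"
    "rome,\n"
)

for topic, keywords in group_by_column(sample).items():
    print(topic, len(keywords), keywords)
```

Passing a different `column` value, such as one of the other appended columns, groups the same keywords along that dimension instead.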

Whatever your objective, grouping keywords together should allow you to summarise your data more effectively and help you understand what users are searching for around a particular topic or theme.

And that’s it! Keywords categorised for the travel sector. Feel free to message us on social media and don’t forget to like and share!
