How to crack Wordle with data in 5 minutes

At face value, Wordle seems like a game heavily influenced by luck. You might even have a superstitious starting word that you swear by. But whilst luck can certainly play a part, we’ve taken a data-driven approach to minimize that luck and ‘crack’ Wordle consistently. We’ve even given you a best starting word based on some clever bits of code.
A naïve first approach

When solving a problem like this, it’s good to start by prototyping the simplest possible solution. We learn as we go, iterating on that prototype until we arrive at something more sophisticated.

In this case, the first thing we need to do is suggest a starting word. The simplest solution is to pluck a word at random from our list of valid guesses and use that. It’s not the best solution, but the point is to get something that works first and refine it later.

Once we’ve made our first guess, Wordle lets us know how we’ve done. Let’s see an example…
In this case, our guess ‘store’ comes back with an orange ‘R’ and a green ‘E’. To select the next word, we can again pick a random word from the list of possible guesses. This time, however, we know that not every word in the list is still possible. For example, ‘crown’ is no longer a possible guess, since we know that the correct word ends in an ‘E’, so we can strike ‘crown’ from the list.

We can do this for every word that is no longer possible. We cross off every word that contains a grey letter, every word that is missing a green letter at its correct position, and every word that does not include all of the orange letters.

This cuts down the list by far more than you might think. In this example, we go from 10,657 possible guesses for our first random choice to just 133 possible guesses for our second. Even if every letter in ‘store’ were to come back grey, our list of 10,657 possible words would be cut down to just 856, so don’t lose hope when you completely ‘miss’ your first guess!

From here, we continue this loop until we’ve got the answer or run out of guesses. In practice, even this basic solver completes Wordle in six or fewer tries more often than not, by sheer elimination.
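To make the elimination loop concrete, here is a minimal Python sketch. The function names (feedback, filter_candidates, solve_randomly), the 'g'/'o'/'x' encoding of the colours, and the tiny word list are assumptions made for illustration, not the code behind the numbers above. Rather than applying the three rules one by one, the sketch keeps only the words that would have produced the same colours for our guess, which covers all three rules at once and also copes with repeated letters.

```python
import random
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Colour pattern for a guess: 'g' = green, 'o' = orange, 'x' = grey."""
    pattern = ["x"] * len(guess)
    # Answer letters not claimed by a green are still available for oranges.
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "g"
        elif unmatched[g] > 0:
            pattern[i] = "o"
            unmatched[g] -= 1
    return "".join(pattern)


def filter_candidates(candidates, guess, pattern):
    """Keep only the words that would have produced this pattern."""
    return [w for w in candidates if feedback(guess, w) == pattern]


def solve_randomly(answer, words, max_guesses=6, rng=None):
    """Guess at random, filter, repeat. Returns tries used, or None on failure.

    Assumes the answer is somewhere in `words`.
    """
    rng = rng or random.Random()
    candidates = list(words)
    for attempt in range(1, max_guesses + 1):
        if not candidates:
            return None
        guess = rng.choice(candidates)
        if guess == answer:
            return attempt
        candidates = filter_candidates(candidates, guess, feedback(guess, answer))
    return None


# Hypothetical mini word list; a real run would load the full guess list.
words = ["crown", "crane", "there", "rogue", "purge"]
print(feedback("store", "crane"))                  # xxxog: orange R, green E
print(filter_candidates(words, "store", "xxxog"))  # ['crane', 'purge']
```

Comparing whole patterns is a design choice: it avoids writing separate checks for grey, green and orange letters, at the cost of recomputing the feedback for every remaining word.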
Side note: choose better passwords

As an aside, this highlights why some websites put requirements on password strength, such as requiring 12 or more characters or banning common phrases. A 12-character password using symbols, upper-case characters, and all the other hoops you might be asked to jump through would take roughly 34,000 years to crack, even with current top-end processing power. On the other hand, if you use the word ‘password’, the list of possibilities is drastically reduced: only 8 characters need to be found, and the word appears on the lists of common passwords that attackers try first. A computer will be able to crack that password in less than a second. Slightly faster!
Refining our initial ideas

Getting back to Wordle, this section will get a little more mathsy and a bit more technical. We’ll take you through the improvements we can make to the first solver by using a little bit of statistics and clever programming. So, if you’re up for the challenge, keep reading!

The first thing that stands out as needing improvement is our method of selecting the next guess. We’re currently just selecting a random word out of all possible guesses, but surely some words are better than others? Our goal should be to rank each valid guess and select the best one each time.

For this ranking, we’re going to use a concept known as ‘Information Gain’. That is, how much would each possible word reduce, on average, the number of valid guesses left in the next round? How much information will we gain by selecting a specific word? We know that getting as many greens and oranges as possible reduces the number of valid guesses the most, so we are looking for the words that will, on average, get us the most of those.

To find out how many greens and oranges a word will get on average, we take each word in our remaining valid guesses and compare it to every other possible solution, counting how many greens and oranges our guess would get if that solution were the Wordle of the day. For simplicity’s sake, we’ll say that scoring a green is twice as good as scoring an orange, so we assign 2 points for a green, 1 point for an orange, and 0 points for a grey. This might be hard to conceptualise, so let’s run through an example…
Let’s imagine a scenario in which there are only 3 possible guesses to choose from, and we know one will be the correct answer. We have ‘idiom’, ‘strop’ and ‘donor’. For each word in our valid guesses, we compute the information gain against all the other valid guesses.

For ‘idiom’, the information gain against ‘strop’ is 2, from the single green ‘o’ they share. For ‘idiom’ against ‘donor’, the information gain is 3: two points from the green ‘o’ and one from the orange ‘d’. The average information gain is (2+3)/2 = 2.5. See if you can work out the average information gain for the other two…

…The answers are that ‘strop’ gives an average information gain of 2.5, and ‘donor’ gives an average information gain of 3. Note that ‘donor’ contains two ‘o’s while the other words contain only one, so only its green ‘o’ scores and the other ‘o’ comes back grey. This means ‘donor’ has the highest information gain and, on average, will get us the most greens and oranges. So our algorithm will select ‘donor’.

Each time we want to make a guess for the next word, we follow the same process and rank each valid guess to find the word with the highest information gain score. We choose that word because, by this measure, it should on average lead us to the correct answer in the fewest guesses.
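The worked example above can be checked with a short Python sketch. The names score_pair, average_score and best_guess are our own, chosen for illustration, and the duplicate-letter handling follows the usual Wordle rule that a letter only scores as many times as it appears in the answer. This is a sketch of the 2/1/0 points scheme described above, not a full information-theoretic calculation.

```python
from collections import Counter


def score_pair(guess: str, answer: str) -> int:
    """2 points per green, 1 per orange, 0 per grey."""
    # Answer letters not claimed by a green are still available for oranges.
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    points = 0
    for g, a in zip(guess, answer):
        if g == a:
            points += 2              # green: right letter, right place
        elif unmatched[g] > 0:
            points += 1              # orange: right letter, wrong place
            unmatched[g] -= 1        # each answer letter can only score once
    return points


def average_score(guess: str, candidates) -> float:
    """Mean score of `guess` against every other remaining candidate."""
    others = [w for w in candidates if w != guess]
    if not others:
        return 0.0
    return sum(score_pair(guess, w) for w in others) / len(others)


def best_guess(candidates) -> str:
    """Candidate with the highest average score (first one wins on ties)."""
    return max(candidates, key=lambda w: average_score(w, candidates))


words = ["idiom", "strop", "donor"]
for w in words:
    print(w, average_score(w, words))   # idiom 2.5, strop 2.5, donor 3.0
print(best_guess(words))                # donor
```

Scoring every candidate against every other candidate takes time proportional to the square of the list size, which is fine for a few hundred remaining words but slow for the full list, so a real solver might precompute or cache the first guess.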