The Best Wordle Starting Word

Wordle is one of those games that took our post-pandemic brains by storm. It captivated (and distracted) millions around the world, at a time when we needed it. I remember sharing my results with friends and family, and feeling a competitive spark to beat the puzzle in fewer guesses than them. So naturally, my engineering instincts kicked in as I challenged myself to find the best starting word.

The code I wrote for this article is available on GitHub.

A Wordle puzzle with "salet" entered as the first guess.

An Overview of Wordle

(Feel free to skip this section if you already know the rules.)

Wordle is a guessing game. Your task is to guess the random 5-letter word of the day, by entering one guess at a time, up to a total of 6 times. Each time you guess a word, any letters that your guess has in common with the answer will be revealed in either yellow or green: yellow if the letter is correct, but not the position, and green if both the letter and position are correct. Because each guess eliminates future possibilities, it's important to use a good starting word. For example, you wouldn't want to start with a word like ZULUS, as it has the uncommon letter Z, and a repeating vowel U.

A Starting Point

The first guess is the most important. The board is a blank canvas, waiting for you to start eliminating possibilities. To come up with the ideal starting word, we need to find the word most likely to share letters with a randomly chosen 5-letter word. To do this, we can start by counting how often each letter appears in the set of 5-letter words:

Letter frequencies in all 5-letter English words
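The counting step could be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the function name `countLetterFrequencies` and the three-word sample list are assumptions made for the example, and a real run would load the full list of 5-letter words.

```javascript
// Counts how often each letter appears across a list of words.
function countLetterFrequencies(words) {
  const counts = {};
  for (const word of words) {
    for (const letter of word.toLowerCase()) {
      counts[letter] = (counts[letter] || 0) + 1;
    }
  }
  return counts;
}

// Hypothetical sample; replace with the full 5-letter word list.
const sampleWords = ['arose', 'salet', 'zulus'];
const sampleCounts = countLetterFrequencies(sampleWords);

// Sort the letters from most to least common.
const ranked = Object.entries(sampleCounts).sort((a, b) => b[1] - a[1]);
console.log(ranked[0]); // [ 's', 3 ]
```

Sorting the entries by count gives the ordering a chart like this one would be drawn from.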

Those of you with a keen eye may have noticed that the first few letters match up closely with the starting letters in the final round of Wheel of Fortune: R, S, T, L, N, E. Though A appears to be more common than E in the chart above, this is only because we're limiting ourselves to 5-letter words. If we count again using English words of any length, we find that E occurs roughly 27.2% more often than A!

A scene from the final round of Wheel of Fortune.

Breaking Down the Results

A is the most common vowel, coming in at 8,392 uses. S is the most common consonant, clocking in at 6,537 appearances. The next most common consonant is R, at 5,143 uses. We could stop right here and pick a word that contains all of the top five letters (A, E, S, O, R), such as AROSE. That's a pretty good starting word, but we can do better.

Why We Can't "Arise" to Victory

Why isn't AROSE the best word? After all, it uses all of the top five letters we just calculated. It turns out we overlooked one important detail: letter positions. If all letters had an equal 1/5 or 20% chance of appearing in any given position in a word, "arose" would be one of the best words. However, this is not the case. For example, S is almost 7 percentage points more likely to appear as the first letter than as the fourth letter in a word (11.4% vs. 4.5%).

Position   Chance
1          11.4%
2          1.1%
3          4.3%
4          4.5%
5          3.8%

The chances of the letter S appearing in each letter position of a word.

Position   Letter   Chance
1          A        7.4%
2          R        7.2%
3          O        7.2%
4          S        4.5%
5          E        11.8%

The chances of each letter in "arose" appearing in their respective positions for a randomly selected word.
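Positional figures like these could be derived with a sketch along the following lines. It assumes that a letter's "chance" at a position is the share of words in the list that have that letter at that position; the function names and the three-word sample are hypothetical.

```javascript
// Builds a map of letter -> array of counts, one count per position.
function countPositionalFrequencies(words, wordLength = 5) {
  const frequencies = {};
  for (const word of words) {
    if (word.length !== wordLength) continue; // ignore words of other lengths
    for (let i = 0; i < wordLength; i++) {
      const letter = word[i].toLowerCase();
      if (!frequencies[letter]) {
        frequencies[letter] = new Array(wordLength).fill(0);
      }
      frequencies[letter][i] += 1;
    }
  }
  return frequencies;
}

// Share of words that have `letter` at `position` (0-based).
function positionalShare(frequencies, letter, position, totalWords) {
  const counts = frequencies[letter];
  return counts ? counts[position] / totalWords : 0;
}

// Hypothetical sample; replace with the full 5-letter word list.
const sampleWords = ['salet', 'sales', 'arose'];
const positional = countPositionalFrequencies(sampleWords);
console.log(positional.s); // [ 2, 0, 0, 1, 1 ]
```

The resulting object has one array of five counts per letter, which is the same shape a positional scoring function can index by letter and position.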

A Less Naive Approach

What if we assigned each word a score based on the combined total of each letter's likelihood of appearing in its respective position?

// A hash map of every letter to its frequency at each of the five positions
// (the entries for the remaining letters are elided here)
const letterFrequencies = {'a': [3000, 1500, 500, 400, 1000], /* ... */};

// Scores a word by summing each letter's frequency at the position it occupies
function calculateWordScore(word, letterFrequencies) {
  let score = 0;
  for (let i = 0; i < word.length; i++) {
    const positions = letterFrequencies[word[i]];
    if (!positions) continue; // skip characters with no frequency data
    score += positions[i];
  }
  return score;
}

Using this approach on every 5-letter word gets us these top five words:

Word    Score
sanes   11579
sales   11401
sores   11295
cares   11268
bares   11213
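A ranking like this could be produced by scoring every word and sorting, roughly as sketched below. The scoring function repeats the idea of calculateWordScore so the example runs on its own; `topWords`, the toy frequency values, and the word list are assumptions for illustration and do not reproduce the real numbers.

```javascript
// Same idea as calculateWordScore: sum each letter's count at its position.
function scoreWord(word, letterFrequencies) {
  let score = 0;
  for (let i = 0; i < word.length; i++) {
    const positions = letterFrequencies[word[i]];
    if (positions) score += positions[i];
  }
  return score;
}

// Scores every word and returns the highest-scoring ones.
function topWords(words, letterFrequencies, limit = 5) {
  return words
    .map((word) => ({ word, score: scoreWord(word, letterFrequencies) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Hypothetical toy frequencies, not the real counts.
const toyFrequencies = {
  s: [5, 0, 0, 1, 4],
  a: [1, 4, 0, 0, 0],
  l: [0, 0, 3, 0, 0],
  e: [0, 0, 0, 4, 1],
  t: [0, 0, 0, 0, 2],
};

const best = topWords(['salet', 'sales', 'tales'], toyFrequencies, 2);
console.log(best.map((entry) => entry.word)); // [ 'sales', 'salet' ]
```

Even with toy numbers, the plural "sales" outranks "salet" because S scores well in the fifth position.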

We're very close now. An astute observer might note that all of these words are plurals. Unfortunately, there's a lesser-known rule of Wordle: the answer is never a pluralized word. Let's account for this simply by lowering the score for the letter S when it's in the fifth position of a word:

letterFrequencies.s[4] = 500;

Choosing 500 took a bit of trial and error. A better approach would be to go through all 15,918 5-letter words and eliminate the plural ones. This is an exercise left for the reader. 😉

Adjusted Top-10 Scoring Words

Word    Score
saree   10610
soree   10020
carey   9805
sooey   9442
siree   9408
coree   9403
boree   9348
salet   9343
saner   9326
sayee   9295

There's still a slight issue here. Some of these words are so uncommon that Wordle excludes them from its dictionary. In fact, the top 7 words are not guessable. That leaves us with SALET as our first guessable word, which means:

a light medieval helmet, usually with a vision slit or a movable visor.
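Picking the first guessable word out of a ranked list can be sketched as a simple lookup. The ranked list below is the adjusted top 10 from this article; the allowed-guess excerpt and the function name `firstGuessable` are hypothetical, since the real list of accepted guesses would have to be loaded separately.

```javascript
// Returns the highest-ranked word that also appears in the allowed-guess list.
function firstGuessable(rankedWords, guessableWords) {
  const allowed = new Set(guessableWords);
  return rankedWords.find((word) => allowed.has(word));
}

// Adjusted top 10 from the table above, best score first.
const rankedWords = [
  'saree', 'soree', 'carey', 'sooey', 'siree',
  'coree', 'boree', 'salet', 'saner', 'sayee',
];

// Hypothetical excerpt of an allowed-guess list.
const guessableWords = ['arose', 'salet', 'saner'];

console.log(firstGuessable(rankedWords, guessableWords)); // salet
```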

How does this compare to our naive approach of "arose"?

Position   SALET        AROSE
1          S  11.4%     A  7.4%
2          A  18%       R  7.2%
3          L  6.7%      O  7.2%
4          E  15.8%     S  4.5%
5          T  6.8%      E  11.8%

Averaging the position-by-position differences, we get an improvement of 4.1 percentage points per letter. This translates to a slightly higher chance of getting both the letter and position correct with "salet" than with "arose".
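The arithmetic behind that figure can be checked with a few lines. The percentages are the ones listed in the comparison above; the variable names are just for illustration.

```javascript
// Positional chances (in percent) for each letter of the two words.
const salet = [11.4, 18, 6.7, 15.8, 6.8];
const arose = [7.4, 7.2, 7.2, 4.5, 11.8];

// Difference at each position, then the average across all five.
const differences = salet.map((chance, i) => chance - arose[i]);
const averageImprovement =
  differences.reduce((sum, d) => sum + d, 0) / differences.length;

console.log(averageImprovement.toFixed(1)); // prints 4.1
```

Note that "salet" actually scores lower in the third and fifth positions (L vs. O, and T vs. E); the overall gain comes from the first, second, and fourth positions.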

And just for fun, I calculated one of the worst starting words:

Position   Letter   Chance
1          E        2.6%
2          T        2%
3          H        1.3%
4          Y        1%
5          L        5.4%

Conclusion and Potential Improvements

I hope you enjoyed reading about this challenge as much as I enjoyed writing about it. Using letter-frequency analysis, we figured out that SALET is the ideal starting word. What about future improvements?

Finding the Next Best Guess

What if we could feed the results of the first guess back into our program? One way to do this would be to use the matching and missed letters to reduce the number of available words, then recalculate the frequencies like we did for the first guess. Each subsequent guess narrows down the potential words, and the recalculated frequencies point to a new word with the highest combined letter and positional frequency.
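One possible shape for the narrowing step is sketched below. It is not part of the code written for this article: `getFeedback` and `filterCandidates` are hypothetical names, the feedback string format ('g' for green, 'y' for yellow, '-' for a miss) is made up for the example, and the handling of repeated letters (greens are assigned first, then yellows only while unmatched copies remain) is an assumption about how the game scores duplicates.

```javascript
// Computes Wordle-style feedback for a guess against a possible answer.
// 'g' = right letter, right position; 'y' = right letter, wrong position;
// '-' = letter not available in the answer.
function getFeedback(guess, answer) {
  const result = new Array(guess.length).fill('-');
  const remaining = {}; // unmatched answer letters available for yellows

  // First pass: mark greens and count the answer letters left over.
  for (let i = 0; i < guess.length; i++) {
    if (guess[i] === answer[i]) {
      result[i] = 'g';
    } else {
      remaining[answer[i]] = (remaining[answer[i]] || 0) + 1;
    }
  }

  // Second pass: mark yellows while leftover copies remain.
  for (let i = 0; i < guess.length; i++) {
    if (result[i] === 'g') continue;
    if (remaining[guess[i]] > 0) {
      result[i] = 'y';
      remaining[guess[i]] -= 1;
    }
  }
  return result.join('');
}

// Keeps only the words that would have produced the observed feedback.
function filterCandidates(candidates, guess, feedback) {
  return candidates.filter((word) => getFeedback(guess, word) === feedback);
}

// Hypothetical candidate list and observed feedback for the guess "salet".
const candidates = ['arose', 'sales', 'tears', 'raise', 'erase'];
const narrowed = filterCandidates(candidates, 'salet', 'yy-y-');
console.log(narrowed); // [ 'arose', 'erase' ]
```

The narrowed list would then be fed back into the frequency counting and scoring steps to pick the next guess.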