Solving Wordle on Hard Mode
January 25, 2025Back in 2022, when Wordle first came out. I and many others built some scripts to attempt to find the best starting word and the best guessing sequence. The central question was whether wordle could be solved at all. There's an interesting constrained optimization problem in Wordle, and so I and many others took it as a coding challenge, and tried to figure it out.
To those who might have somehow missed it SKIP, Wordle is a word guessing game on a webpage; it's now owned by the New York Times and is still free to play.
Each day a secret 5-letter word is chosen by the editor. Your job as a player is to guess what the secret word is. If you can guess it in 6 or fewer guesses, you win. If you can't get it in 6, well, you lose.
You're given a hint after each guess; each letter in your guess is marked in green if it was the right letter, in yellow if the letter was in the secret word but in the wrong position, or in dark grey if the letter is not in the secret word, or if you had too many.
It had a very clever sharing feature, where you could share your number of guesses and hints without spoiling the game for others. This helped it become super viral on Twitter, and from there the rest of the internet was infected.
You might want to go try it.
An interesting problem
So after trying this game for a few months, I wanted to know whether it was possible to play without losing. There were a few flaws in the original website, for example you could use View Source to look at the Javascript and find the word list or reverse engineer the shuffling algorithm and figure out the secret word without guessing. But that wasn't very interesting, and it was quite fragile. I wanted to know whether you could play the game without losing.
So I wrote some terrible python code.
Solving this game turned out to be pretty interesting to me. I ended up writing a algorithm for it. Very roughly, the algorithm goes something like this:
0. Check for preconditions
0.0 Did we lose? (Did we run out of guesses?)
0.1 Did we solve it? (Is there only one answer remaining)
0.2 Can we solve it? (Are there more guesses remaining than we can guess?)
1. for each word in the valid guess list
1.1 Bucket the answers depending on potential scorelines
1.2 Find a guess which results in the smallest maximum size bucket
1.3 For each bucket recurse down into 0 and get the total for that word.
1.4 Add the score to the total for the word
2. If the total score is less than the current best, replace the best guess
and best total score.
3. Return the best total score.
My first solver would basically try to get the flattest guessing possible. It worked pretty well at what I wanted it to do, which is find any solution to the game, but it has issues. It doesn't do very well pruning out equivalent solutions and it was absolutely terribly slow. But I got a unique starting word, AESIR, and I had a neat guessing tree to train myself on. Eventually I switched to RAISE, just so my second word would be CLOUT and I could occasionally RAISE CLOUT as the situation needed. Wordle is pretty lenient.
Incidentally, I read several other blogs, and watched other similar people solving the same problem. I wasn't the only person who wondered whether the could be solved. 3brown1blue worked out some heuristics based on information theory, but stopped short of the full guessing tree. Several other YouTubers got in on the fun, too.
The best breakdown and solution I came across was Jonathan Olsen's wordle solution. His was, and still is, objectively better than mine, but very similar.
I broke up with Python over this
Anyway, this program took weeks to run. I wrote it in Python and it took forever upon forever. It ate RAM. I went through more and more optimizations, including sampling the guess list in order to make the bucketing process faster, caching the guess scoreline comparisons to save on RAM use and make the bucketing process faster.
Worst of all, I couldn't exactly share it with anyone.
I looked at my past projects, and I consider all of them to be pretty limited in reach for a couple reasons:
- If I am working on some computational project, Python is a terrible choice. There isn't any multithreading performance gain because of the GIL, and the language is interpreted, not compiled. So you waste performance in a couple different ways. You never really claw it back for anything interesting, either.
- If I ever wanted to share a project with the world, I should just use Javascript. This would mean a browser-based UI, users can fork it, play with the results immediately, etc.
- For combinations of the two, I can run a web server.
The only exception where I might use Python going forward is ML/AI projects, where Python, scipy and numpy are still ruling the roost.
Is Wordle Solveable?
Wordle is a very solveable game. That is, if you're given a list of all legal 5-letter words, you can come up with a strategy that never loses. There are a couple of rules that help ensure that it is solveable:
- There are no plurals ending with 'es' or 's'
- There are few past-tense verbs ending with 'ed'
- There are no proper names, but some names which are also words (i.e. peter) are allowed.
- Several offensive words are banned.
Why am I writing about this now?
I recently lost my 500-day streak on Easy mode in Wordle, and I wanted to switch to Hard. I needed a new starting word and guessing tree, so I've rewritten my Wordle program in Go.
This new program turns my computer into a convenient space heater, but can
solve hard mode (which is computationally easier?) in a few minutes for a
slightly expanded dictionary. You can grab a copy for yourself on
my github,
or by requesting it with git clone git://git.glaka.us/wordle-tree.git
.
I found 39 solutions of varying quality, available as JSON files here:
alert alien aline alone arles artel carse earst lares least liane litre parse parte prate ratel reast reist rinse roast sacre salet scare seral slane snare stale stare stire store strae tares tarse teras toile torse trade trice trine
It's not perfect, I think a more traditional branch-and-bound where I use a heuristic and a priority queue could solve even faster, but it's not currently architected for that, and I'm done with the game for a while. My next solver should be JavaScript I think.