Build a recursive word finding algorithm with Python - Part 1

by Dan Duda

Requirements

  • Python 3.x
  • PyCharm Community (optional)
  • A text file of English words
  • Basic Python knowledge

Introduction

I’ve always loved word games like PopCap’s Bookworm series and Microsoft’s Wordament. I’m also am a programmer and love to hack. So it’s only natural that I’d want to develop a “solver” for word games. A lot of programming concepts can be learned in creating this type of application such as recursion, tree data structures, and optimization. In the following set of tutorials we’ll build an algorithm to find English words based on a set of letters either scrambled like the game Scrabble or grid of letters like in the game Boggle. We’ll start off with a basic data structure and then learn how to better store our data to make searching lightning fast.

Getting a list of words

Before we beging writing any code we’ll need a list of words to use for our lookups. Any large list of words should work but here is one I found, Click here and then click on the “Download” button to download a copy of the words list. If clicking on the “Download” button just brings up the list of words in the browser then just right-click on the list and choose “Save as” and save the “words_alpha.txt” to your computer. This list contains around 479k English words which will really test our algorithm’s performance.

Setting up our project folder

Create a new folder somewhere on your hard drive called “WordSearch”. Copy the “words_alpha.txt” file into that new folder. Add a new file named “words.py” to this folder. I’ll be using the free PyCharm IDE to write my code. You can download it here. Any code or text editor should work though. It is assumed you have python installed on your machine.

Reading the list of words

Open up the “words.py” and enter the following code.

1
2
3
4
5
6
7
8
9
10
f = open('words_alpha.txt', "r")
words = f.readlines()
f.close()

# Print count of words
print(len(words))

# Print first 10 words
for i in range(10):
print(words[i])

This will read the entire words file into a list named “lines”. We then print out the number of words in the list and the first 10 words.

I get the following output.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
370099
a

aa

aaa

aah

aahed

aahing

aahs

aal

aalii

aaliis

Let’s clean up the code a bit to strip off the extra carriage returns using a list comprehension. We’ll also use the string formatting syntax to add a label to our count.

1
2
3
4
5
6
7
8
9
10
f = open('words_alpha.txt', "r")
words = [l.rstrip() for l in f.readlines()]
f.close()

# Print count of words
print(f'Word Count: {len(words)}')

# Print first 10 words
for i in range(10):
print(words[i])

The output should now look like this.

1
2
3
4
5
6
7
8
9
10
11
Word Count: 370099
a
aa
aaa
aah
aahed
aahing
aahs
aal
aalii
aaliis

Searching the list of words

So now that we can read in and store a large list of words we’re going to want to search the list to check if a word exists. We can use Python’s built in “in” statement to check if an item exists in a list. Let’s also add the ability to enter a word or string of letters with a prompt.

1
2
3
4
5
6
7
8
9
10
11
12
f = open('words_alpha.txt', "r")
words = [l.rstrip() for l in f.readlines()]
f.close()

letters = input('Enter a string: ')
while letters != '':
if letters in words:
print('Word found')
else:
print('Word was not found')

letters = input('Enter a string: ')

Now when we run this we are prompted to enter a string. If the string is found in our list of words it prints, ‘Word found’ otherwise ‘Word was not found’. Just pressing enter without entering anything will end the program.

1
2
3
4
5
Enter a string: mother
Word found
Enter a string: motxxer
Word was not found
Enter a string:

In the next tutorial we’ll start on our recursive algorithm to find all words from a string of letters.

Goto Part 2