beanz Magazine

Chain of Command, Command of Language

Sam Carpenter on Flickr

How do computers predict what text you want to write next? Here's how to create predictive stories.

Predictive text is a technology you’re probably very familiar with from things like smartphones and tablets. Every time you type or choose a word, you get a choice of other words that can come next. How does a phone choose what text to suggest next?

In this article, you’ll learn a simple technique for making your own text predictors and for generating text in the style of a particular author!

The trick to both is an old mathematical concept called a Markov chain. A Markov chain is a process that can be in one of a number of “states”, and whose next state is chosen randomly, with odds that depend only on the state it is in right now. Flipping a coin over and over is a Markov chain with a single state. What about a Markov chain with multiple states?

Picture a checkerboard, or grab or draw one if you’d like, and put a coin down on it. Roll a four-sided die to determine which direction to move next: up, down, left, or right. If you can’t move in that direction because the coin is at the edge of the board, then you stay where you are. This is a Markov chain where the states are the squares on the board.
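The coin game above can be tried out in a few lines of Python. This is only a minimal sketch: the 8 x 8 board size and the names `BOARD_SIZE`, `MOVES`, `step`, and `walk` are assumptions made for illustration, not something the article specifies.

```python
import random

BOARD_SIZE = 8  # assumption: a standard 8 x 8 checkerboard

# The four faces of the die, written as (row change, column change).
MOVES = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def step(position, rng):
    """Roll the four-sided die once and return the coin's new square."""
    row, col = position
    d_row, d_col = MOVES[rng.choice(list(MOVES))]
    new_row, new_col = row + d_row, col + d_col
    # If the move would leave the board, the coin stays where it is.
    if 0 <= new_row < BOARD_SIZE and 0 <= new_col < BOARD_SIZE:
        return (new_row, new_col)
    return position


def walk(start, steps, rng=None):
    """Return every square visited, beginning with the start square."""
    rng = rng or random.Random()
    path = [start]
    for _ in range(steps):
        path.append(step(path[-1], rng))
    return path


if __name__ == "__main__":
    # Start in the top-left corner and roll the die ten times.
    print(walk((0, 0), 10))
```

Notice that `step` only looks at the coin’s current square, never at where it has been before. That is what makes the game a Markov chain.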

How does a game like this help us generate text? Imagine a different kind of board, one where the squares represent the last several words you’ve seen and moving to another square means choosing a word that comes next.

There are a lot of words in a language, though, many more than just the four directions in our checkerboard example. How do we choose which one comes next? If we chose every word with equal probability, as if rolling some imaginary die with roughly 250,000 sides, we’d get things that don’t sound anything like English! “The a his the the fish” would be just as likely a sentence as “The ball rolled down the hill”.
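To see why an equal-probability die gives poor results, here is a small Python sketch. The eight-word `VOCABULARY` list and the function name `uniform_sentence` are made up for this example; a real vocabulary would be vastly larger.

```python
import random

# Assumption: a tiny stand-in vocabulary, just for illustration.
VOCABULARY = ["the", "a", "his", "fish", "ball", "rolled", "down", "hill"]


def uniform_sentence(length, rng=None):
    """Pick every word with equal probability, ignoring what came before."""
    rng = rng or random.Random()
    return " ".join(rng.choice(VOCABULARY) for _ in range(length))


if __name__ == "__main__":
    for _ in range(3):
        print(uniform_sentence(6))
```

Because each word is picked without looking at the previous ones, every six-word sequence is exactly as likely as any other. With a vocabulary of N words, each one has probability (1/N) multiplied by itself six times, whether or not it makes sense as English.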

We don’t have to be experts in languages to figure out what the probabilities should be, though!

Instead, we can cheat a little: take a really large example of the language, like a novel, and go through the entire document counting how often each combination of words appears. What we’ll have at the end is a big table of the probabilities we need for choosing the next word based on the previous several words we’ve seen.
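The counting idea described above might look something like this in Python. Treat it as a rough sketch under some assumptions rather than a definitive implementation: the names `build_table`, `next_word`, `generate`, and `SAMPLE` are invented here, words are split on whitespace only, and a one-sentence sample stands in for a whole novel.

```python
import random
from collections import Counter, defaultdict

# Assumption: a tiny sample text; a real project would read in a whole book.
SAMPLE = "the ball rolled down the hill and the ball rolled into the pond"


def build_table(text, order=2):
    """Count which word follows each run of `order` consecutive words."""
    words = text.split()
    table = defaultdict(Counter)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        table[state][words[i + order]] += 1
    return table


def next_word(table, state, rng):
    """Pick a follower of `state`, weighted by how often it was counted."""
    followers = table.get(state)
    if not followers:
        return None  # this combination never had a follower in the sample
    choices = list(followers)
    weights = [followers[word] for word in choices]
    return rng.choices(choices, weights=weights, k=1)[0]


def generate(table, length=20, rng=None):
    """Start from a random state and keep choosing next words.

    The table must not be empty, so the text needs more than `order` words.
    """
    rng = rng or random.Random()
    state = rng.choice(list(table))
    output = list(state)
    while len(output) < length:
        word = next_word(table, state, rng)
        if word is None:
            break  # dead end: nothing ever followed these words
        output.append(word)
        state = tuple(output[-len(state):])
    return " ".join(output)


if __name__ == "__main__":
    table = build_table(SAMPLE, order=2)
    print(dict(table[("ball", "rolled")]))
    print(generate(table, length=12))
```

The table stores raw counts instead of probabilities. Dividing a count by the total for its row gives the probability, and `rng.choices` does the equivalent weighting on its own. In the sample, “ball rolled” is followed once by “down” and once by “into”, so each has a 1 in 2 chance of being picked. A larger `order` makes the output sound more like the source text but also copies it more closely.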


