beanz Magazine

Regex Game

Bureau of Land Management Alaska on Flickr

Learn how to search through blobs of text with speed, accuracy, and elegance… like a ninja!

From 1896 to 1899, over 100,000 men flocked to the Klondike during the gold rush. They filled large pans with gravel and dipped them in subarctic streams. By carefully shaking and stirring, they might — if they were lucky — find a few nuggets of gold. It was a long, frustrating, and often exhausting process.

Nowadays, many of us have a similar task to perform. Except we sift through spreadsheets and databases rather than clay and dirt, and we’re looking for information, not gold. Luckily we have tools that those Yukon explorers never had.

A regular expression (regex) combines letters, numbers, and special symbols like ^, $, and + to describe a specific pattern of text. You could write a regex to highlight all the emails in a text document. You could pinpoint all the telephone numbers among thousands of other words.

The main advantage of regexes is speed. One short expression can stand in for many lines of hand-written search code, and the same expression often works with little or no change from one programming language to the next.
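To make the idea concrete, here is a minimal sketch in Python, which ships with a regex module called re. The sample text and both patterns are made up for illustration; real email addresses and phone numbers come in many more shapes than these simplified expressions allow for.

```python
import re

# Hypothetical sample text; the addresses and the number are invented.
text = "Write to nugget@example.com or pan@example.org, or call 555-0100."

# Simplified patterns, for illustration only.
# [\w.]+ means "one or more letters, digits, underscores, or dots".
email_pattern = r"[\w.]+@[\w.]+\.\w+"
# \d matches a digit; {3} and {4} say how many digits to expect.
phone_pattern = r"\d{3}-\d{4}"

emails = re.findall(email_pattern, text)
phones = re.findall(phone_pattern, text)

print(emails)  # ['nugget@example.com', 'pan@example.org']
print(phones)  # ['555-0100']
```

Don't worry about the symbols yet. The rules that follow explain most of them.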

The Basics

Start by navigating to https://regexr.com/, a website designed to test regexes.

Replace the sample text with this fun poem about sharks by John Ciardi:


The thing about a shark is — teeth
One row above, one row beneath.

Now take a close look. Do you find
It has another row behind?

Still closer — here, I’ll hold your hat:
Has it a third row behind that?

Now look in and… Look out! Oh my,
I’ll never know now! Well, goodbye.

If you’ve left the sample expression untouched, a few words in the poem will be highlighted. Can you guess what they all have in common?

Regex Basics

1) Letters and numbers (e, ea)

A letter or number in a regex matches exactly that character, and case matters: a lowercase ‘e’ matches only a lowercase ‘e’, and a capital ‘I’ matches only a capital ‘I’. Try replacing the expression on the webpage with ‘e’ or capital ‘I’. What happens if you type two letters together, like ‘ea’?
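If you would rather experiment in code than on the webpage, the same rule can be sketched with Python's re module. This is only one way to try it; the variable names are our own.

```python
import re

# Second line of the poem.
line = "One row above, one row beneath."

lowercase_o = re.findall("o", line)  # matches lowercase 'o' only
capital_o = re.findall("O", line)    # matches capital 'O' only
pair = re.findall("ea", line)        # 'e' immediately followed by 'a'

print(len(lowercase_o))  # 4
print(len(capital_o))    # 1
print(pair)              # ['ea']
```

Two letters typed together only match when they appear side by side in that order, which is why ‘ea’ is found once, inside ‘beneath’.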

2) One or the other ([ea])

If you put those same two characters inside square brackets, the pattern will match either ‘e’ or ‘a’. This is useful when you need to match similar words, like cat, hat, and sat. In this case you could write the regex ‘[chs]at’.
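Here is a small Python sketch of the square-bracket rule. The word list is made up; the last example uses a line from the poem.

```python
import re

# Hypothetical word list for illustration.
words = "cat hat sat mat"
rhymes = re.findall("[chs]at", words)
print(rhymes)  # ['cat', 'hat', 'sat']  ('mat' starts with none of c, h, s)

# Either 'e' or 'a', one character at a time.
vowels = re.findall("[ea]", "beneath")
print(vowels)  # ['e', 'e', 'a']

# A regex does not care about word edges: this finds 'hat' inside 'that'.
poem_line = "Has it a third row behind that?"
inside = re.findall("[chs]at", poem_line)
print(inside)  # ['hat']
```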

3) Wildcard ( . , o.e)

Type a ‘.’ and nearly the entire text is highlighted, including letters, punctuation, and spaces (by default, only line breaks are left out). Dots are wildcards: each one matches any single character. If you’re not sure which character you need to match, but you want to match something, wildcards come in handy. Try the sequence ‘o.e’.
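The wildcard can be tried out in Python as well. This sketch assumes the default behaviour of the re module, where a dot matches any character except a line break.

```python
import re

line = "One row above, one row beneath."

# 'o', then any single character, then 'e'.
matches = re.findall("o.e", line)
print(matches)  # ['ove', 'one']

# By default the dot skips line breaks.
chars = re.findall(".", "Oh\nmy")
print(chars)  # ['O', 'h', 'm', 'y']
```

Notice that ‘One’ at the start of the line is not matched, because the pattern asks for a lowercase ‘o’.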

4) Escaped characters (\s, \w, \W)

Escaped characters combine a backslash and a letter. ‘\w’ matches any “word” character (a letter, a digit, or an underscore), while ‘\W’ matches anything that is not a word character. ‘\s’ matches whitespace, including spaces, tabs, and newlines. Escaped sequences are more specific than the basic wildcard, but they still offer plenty of flexibility.
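A short Python sketch shows what each escaped character picks out. The r in front of each pattern tells Python to leave the backslash alone; the second sample string is made up to show that digits and underscores count as word characters too.

```python
import re

# Part of a poem line, followed by a line break.
text = "Oh my,\n"

word_chars = re.findall(r"\w", text)
non_word_chars = re.findall(r"\W", text)
whitespace = re.findall(r"\s", text)

print(word_chars)      # ['O', 'h', 'm', 'y']
print(non_word_chars)  # [' ', ',', '\n']
print(whitespace)      # [' ', '\n']

# Hypothetical string: \w also accepts digits and underscores.
mixed = re.findall(r"\w", "row_3")
print(mixed)  # ['r', 'o', 'w', '_', '3']
```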

5) Quantifiers (+, *, {2})

What if you want to match any four-letter word? ‘\w\w\w\w’ would work, but it’s a pain to write. Quantifiers are a convenient way to indicate how many times you want to repeat the previous character. ‘+’ means one or more. ‘*’ means zero or more. Curly braces indicate a specific amount, such as ‘\w{4}’. Try comparing ‘te+’ with ‘te*’. Heads up: the star character can be unpredictable! Because zero repeats count, ‘te*’ also matches a ‘t’ with no ‘e’ after it.
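The difference between ‘+’ and ‘*’ is easier to see side by side. This Python sketch uses a few words borrowed from the poem, strung together for illustration.

```python
import re

# Words taken from the poem, combined into one test string.
words = "teeth beneath third that"

# 't' followed by ONE or more 'e'.
plus = re.findall("te+", words)
print(plus)  # ['tee']

# 't' followed by ZERO or more 'e', so every lone 't' matches too.
star = re.findall("te*", words)
print(star)  # ['tee', 't', 't', 't', 't', 't']

# Exactly four word characters in a row.
line = "Now take a close look."
four = re.findall(r"\w{4}", line)
print(four)  # ['take', 'clos', 'look']
```

The last result holds a small surprise: ‘\w{4}’ happily grabs the first four letters of ‘close’, because nothing in the pattern says the word has to end there.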

6) NOT (^, [^a])

To avoid a certain character at all costs, put ‘^’ right after the opening square bracket. For example, to match any character except the letter ‘a’, type: ‘[^a]’. Careful, though! ‘^’ has a different meaning outside the brackets, where it marks the start of the text or of a line.
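The two meanings of ‘^’ can be compared in a few lines of Python. The sentence in the second half is made up for illustration, and the sketch relies on the re module's default setting, where ‘^’ outside brackets matches only at the very start of the text.

```python
import re

# Inside brackets: anything EXCEPT 'a'.
not_a = re.findall("[^a]", "shark")
print(not_a)  # ['s', 'h', 'r', 'k']

# Hypothetical sentence for the second meaning.
sentence = "Now you see it. Now you don't."

# Outside brackets: '^' pins the match to the start of the text.
at_start = re.findall("^Now", sentence)
anywhere = re.findall("Now", sentence)
print(at_start)  # ['Now']
print(anywhere)  # ['Now', 'Now']
```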

Challenge

While there are other rules for regexes, this is a great first taste. See if you can apply what you’ve learned to create expressions that match the following patterns:

  1. Both ‘look’ & ‘Look’
  2. All 3-letter words that end with ‘w’
  3. All punctuation marks
  4. All words that are 4 letters or longer

Answers

With regexes, there’s always more than one correct answer. Your solution may be different, but if it matches what it’s supposed to match, you’re golden!

  1. [Ll]ook
  2. ..w (careful: this also matches three characters inside longer words, such as the ‘now’ in ‘know’)
  3. [^\s^\w]
  4. \w*\w{4}
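One way to check answers like these is to run them against the poem. The Python sketch below uses only the last two lines of the poem to keep things short. The ‘\b’ in the extra pattern is a word-boundary marker that this article has not covered; it is included here only as one possible way to tighten answer 2.

```python
import re

# Last two lines of the poem.
excerpt = "Now look in and… Look out! Oh my,\nI’ll never know now! Well, goodbye."

answer_1 = re.findall(r"[Ll]ook", excerpt)
answer_2 = re.findall(r"..w", excerpt)
answer_3 = re.findall(r"[^\s^\w]", excerpt)
answer_4 = re.findall(r"\w*\w{4}", excerpt)

print(answer_1)  # ['look', 'Look']
print(answer_2)  # ['Now', 'now', 'now']  (the middle one sits inside 'know')
print(answer_3)  # ['…', '!', ',', '’', '!', ',', '.']
print(answer_4)  # ['look', 'Look', 'never', 'know', 'Well', 'goodbye']

# Optional stricter take on answer 2, using \b (not covered above)
# to insist on whole three-letter words.
strict_2 = re.findall(r"\b\w\ww\b", excerpt)
print(strict_2)  # ['Now', 'now']
```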

Learn More

Introduction to Regular Expressions

https://www.kidscodecs.com/regular-expressions/
https://scotch.io/tutorials/an-introduction-to-regex-in-python
http://codular.com/regex

Regex Crossword, a crossword puzzle with regular expressions

https://regexcrossword.com/

The regular expression game

http://play.inginf.units.it/#/

Article about the importance of Regular Expressions

https://www.theguardian.com/technology/2012/dec/04/ict-teach-kids-regular-expressions

Video tutorial about regular expressions

https://www.youtube.com/watch?v=7DG3kCDx53c

The Poem, Teeth of Sharks

https://www.poetryfoundation.org/poems/49771/about-the-teeth-of-sharks
